Omni-Fake: Benchmarking Unified Multimodal Social Media Deepfake Detection
Authors: Tianxiao Li, Zhenglin Huang, Haiquan Wen, Yiwei He, Xinze Li, Bingyu Zhu, Wuhui Duan, Congang Chen, Zeyu Fu, Yi Dong, Baoyuan Wu, Jason Li, Guangliang Cheng
Published: 2026-05-02 22:56:17+00:00
Comment: Accepted to CVPR 2026
AI Summary
This paper introduces Omni-Fake, a unified dataset for multimodal deepfake detection in social-media settings. It comprises Omni-Fake-Set (1M+ samples) for training and Omni-Fake-OOD (200k+ samples), an out-of-distribution benchmark excluded from training to evaluate generalization. The data spans four modalities: image, audio, video, and audio-video talking head. On top of this benchmark, the authors propose Omni-Fake-R1, a reinforcement-learning-driven multimodal detector that adaptively integrates visual and auditory cues and outputs structured decisions, localization, and natural-language explanations. The authors report significant gains in detection accuracy, cross-modal generalization, and explainability over state-of-the-art baselines.
Abstract
Multimodal deepfakes are proliferating on social media and threaten authenticity, information integrity, and digital forensics. Existing benchmarks are constrained by their single-modality scope, simplified manipulations, or unrealistic distributions, which limit their ability to assess real-world robustness. To address these limitations, we present Omni-Fake, a unified omni-dataset for comprehensive multimodal deepfake detection in social-media settings. It comprises Omni-Fake-Set, a large-scale, high-quality dataset with 1M+ samples, and Omni-Fake-OOD, an out-of-distribution benchmark with 200k+ samples intentionally excluded from training to evaluate generalization. Omni-Fake spans four modalities (image, audio, video, and audio-video talking head) and supports a joint detection-localization-explanation protocol. On top of Omni-Fake, we further propose Omni-Fake-R1, a reinforcement-learning-driven multimodal detector that adaptively integrates visual and auditory cues and outputs structured decisions, localization, and natural-language explanations. Extensive experiments show significant gains in detection accuracy, cross-modal generalization, and explainability over state-of-the-art baselines. Project page: https://tianxiao1201.github.io/omni-fake-project-page/