A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World

Authors: Jikang Cheng, Renye Yan, Zhiyuan Yan, Yaozhong Gan, Xueyi Zhang, Zhongyuan Wang, Wei Peng, Ling Liang

Published: 2025-12-04 14:21:08+00:00

AI Summary

The paper introduces the Multi-In-Domain Face Forgery Detection (MID-FFD) paradigm, which addresses the real-world setting where domain discrepancies overshadow subtle real/fake features, leading to poor accuracy on definitive single-image real/fake classification. The authors propose DevDet, a model-agnostic framework comprising a Face Forgery Developer (FFDev) and a Dose-Adaptive Fine-Tuning (DAFT) strategy, which amplifies forgery traces so that real/fake distinctions dominate the feature space.

Abstract

Existing methods for deepfake detection aim to develop generalizable detectors. Although a fully generalizable detector is the ultimate goal, with limited training forgeries and domains it appears idealistic to expect generalization that covers entirely unseen variations, especially given the diversity of real-world deepfakes. Therefore, introducing large-scale multi-domain data for training can be feasible and important for real-world applications. However, within such a multi-domain scenario, the differences between multiple domains, rather than the subtle real/fake distinctions, dominate the feature space. As a result, although detectors can separate real and fake reasonably well within each domain (i.e., high AUC), they struggle with single-image real/fake judgments under domain-unspecified conditions (i.e., low ACC). In this paper, we first define a new research paradigm named Multi-In-Domain Face Forgery Detection (MID-FFD), which includes sufficient volumes of real-fake domains for training. The detector must then provide definitive real/fake judgments for domain-unspecified inputs, simulating the frame-by-frame, independent detection scenario of the real world. Meanwhile, to address the domain-dominance issue, we propose a model-agnostic framework termed DevDet (Developer for Detector) to amplify real/fake differences and make them dominant in the feature space. DevDet consists of a Face Forgery Developer (FFDev) and a Dose-Adaptive detector Fine-Tuning strategy (DAFT). Experiments demonstrate the superiority of our method in real/fake prediction under the MID-FFD scenario while maintaining the original generalization ability to unseen data.
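
To make the high-AUC/low-ACC gap concrete, the toy sketch below uses synthetic detector scores (not data from the paper) to show how real and fake can be well ranked inside each domain while a single global threshold fails once domains are pooled:

```python
# Hypothetical illustration: per-domain scores are well separated (high AUC),
# yet a single 0.5 threshold misclassifies many images once domains are pooled (low ACC).
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(0)

def make_domain(center, n=500):
    """Simulate detector scores for one domain: fakes score slightly above reals."""
    real = rng.normal(center, 0.02, n)          # real faces
    fake = rng.normal(center + 0.05, 0.02, n)   # fakes shifted by a subtle margin
    scores = np.concatenate([real, fake])
    labels = np.concatenate([np.zeros(n), np.ones(n)])
    return scores, labels

# Two domains whose overall score ranges differ far more than real vs. fake does.
scores_a, labels_a = make_domain(center=0.30)
scores_b, labels_b = make_domain(center=0.70)

print("AUC domain A:", roc_auc_score(labels_a, scores_a))   # high within-domain AUC
print("AUC domain B:", roc_auc_score(labels_b, scores_b))   # high within-domain AUC

# Domain-unspecified evaluation: pool everything and threshold at 0.5.
scores = np.concatenate([scores_a, scores_b])
labels = np.concatenate([labels_a, labels_b])
preds = (scores > 0.5).astype(int)
print("Pooled ACC:", accuracy_score(labels, preds))          # near chance (~0.5)
```

Here the domain shift (0.30 vs. 0.70) dwarfs the real/fake margin (0.05), which is exactly the domain-dominance issue MID-FFD is defined to expose.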


Key findings
The DevDet framework achieves superior performance in the MID-FFD scenario, significantly improving binary real/fake classification accuracy (ACC) compared to existing generalized detectors, with up to an 11.80% improvement on Protocol 1. The approach successfully mitigates the dominance of domain discrepancies in the feature space while effectively preserving the original generalization capability of the base detectors to extreme out-of-domain data.
Approach
DevDet is a two-stage process. Stage 1 trains the FFDev (a Developer Generator) to expose forgery traces, optimized using hard-fake and easy-real samples. Stage 2 employs Dose-Adaptive Fine-Tuning (DAFT), which utilizes a Dose Dictionary (DoseDict) to adaptively adjust the strength (dose) of the FFDev applied to the input image before detection, enhancing MID performance while preserving generalization ability.
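
As a rough illustration of this pipeline, the following PyTorch sketch mimics the described inference flow. The FFDev architecture, the DoseDict attention mechanism, and the residual blending rule are placeholder assumptions for illustration, not the paper's actual design:

```python
# Minimal sketch of DevDet-style inference under assumed components.
import torch
import torch.nn as nn

class FFDev(nn.Module):
    """Stage 1 (assumed form): an image-to-image 'developer' that amplifies forgery traces."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)  # residual "development" signal for the input face

class DoseDict(nn.Module):
    """Stage 2 helper (assumed form): predicts a per-image dose in [0, 1] for the developer."""
    def __init__(self, num_entries=8):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_entries, 64))
        self.doses = nn.Parameter(torch.rand(num_entries))
        self.embed = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 64))

    def forward(self, x):
        q = self.embed(x)                              # (B, 64) image query
        attn = torch.softmax(q @ self.keys.T, dim=-1)  # (B, num_entries) dictionary weights
        return (attn @ self.doses).clamp(0, 1)         # (B,) adaptive dose per image

def devdet_inference(x, ffdev, dosedict, detector):
    """Develop the input with an adaptive dose, then run the base detector."""
    dose = dosedict(x).view(-1, 1, 1, 1)
    developed = x + dose * ffdev(x)     # amplify real/fake traces in image space
    return detector(developed)          # any base detector (e.g., Xception, CLIP)
```

Because the developer operates in image space and the dose is predicted per input, the same pattern can in principle wrap any base detector, which matches the model-agnostic claim of the framework.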
Datasets
FaceForensics++ (FF++), Celeb-DF-v2 (CDF), DeepFake Detection Challenge Preview (DFDCP), WildDeepfake (WDF), DF40, Celeb-DF++ (CDF3), DiffusionFace (DiffFace). Specific forgery methods mentioned include BlendFace, SimSwap, DiT, SiT, AniTalker, and FLOAT.
Model(s)
Model-agnostic framework (DevDet: FFDev and DoseDict/DAFT). Base detectors tested include Xception, Capsule, EfficientNet-B4, F3Net, CLIP, SPSL, SBI, IID, ProDet, and Effort.
Author countries
China, USA