Towards Generalizable Deepfake Detection via Real Distribution Bias Correction
Authors: Ming-Hui Liu, Harry Cheng, Xin Luo, Xin-Shun Xu, Mohan S. Kankanhalli
Published: 2026-03-14 16:11:00+00:00
Comment: First Version
AI Summary
This paper introduces the Real Distribution Bias Correction (RDBC) framework to enhance deepfake detector generalizability by exploiting the invariance of real data. It leverages the fixed population distribution and inherent Gaussianity of real images, employing two modules to estimate real data statistics and amplify the Gaussianity gap between real and fake samples. This approach allows detectors to effectively generalize to unseen target domains, demonstrating state-of-the-art performance in both in-domain and cross-domain settings.
Abstract
To generalize deepfake detectors to future unseen forgeries, most existing methods attempt to simulate the dynamically evolving forgery types using available source domain data. However, predicting an unbounded set of future manipulations from limited prior examples is infeasible. To overcome this limitation, we propose to exploit the invariance of real data from two complementary perspectives: the fixed population distribution of the entire real class and the inherent Gaussianity of individual real images. Building on these properties, we introduce the Real Distribution Bias Correction (RDBC) framework, which consists of two key components: the Real Population Distribution Estimation module and the Distribution-Sampled Feature Whitening module. The former utilizes the independent and identically distributed (i.i.d.) property of real samples to derive the normal distribution form of their statistics, from which the distribution parameters can be estimated using limited source domain data. Based on the learned population distribution, the latter utilizes the inherent Gaussianity of real data as a discriminative prior and performs a sampling-based whitening operation to amplify the Gaussianity gap between real and fake samples. Through synergistic coupling of the two modules, our model captures the real-world properties of real samples, thereby enhancing its generalizability to unseen target domains. Extensive experiments demonstrate that RDBC achieves state-of-the-art performance in both in-domain and cross-domain deepfake detection.
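The general idea of estimating real-class statistics from source data and then whitening with them can be illustrated with a toy sketch. This is not the authors' RDBC implementation: the abstract does not specify the estimator, the sampling scheme, or the Gaussianity measure, so the function names, the plain ZCA whitening, the excess-kurtosis score, and the synthetic Gaussian/Laplace features below are all assumptions made for illustration only.

```python
import numpy as np


def estimate_real_statistics(real_feats):
    """Estimate mean and covariance of real-class features (rows are samples)."""
    mu = real_feats.mean(axis=0)
    cov = np.cov(real_feats, rowvar=False)
    return mu, cov


def whiten(feats, mu, cov, eps=1e-5):
    """ZCA-style whitening with the estimated real-class statistics.

    Assumption: a plain whitening transform stands in for the paper's
    sampling-based whitening, which the abstract does not detail.
    """
    vals, vecs = np.linalg.eigh(cov)
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return (feats - mu) @ w


def gaussianity_score(z):
    """Mean absolute excess kurtosis over dimensions (near 0 for Gaussian data).

    Assumption: a hypothetical stand-in for whatever Gaussianity measure
    the paper actually uses.
    """
    zc = z - z.mean(axis=0)
    var = zc.var(axis=0)
    kurt = (zc ** 4).mean(axis=0) / (var ** 2) - 3.0
    return float(np.abs(kurt).mean())


# Synthetic stand-ins for extracted features (not real deepfake data).
rng = np.random.default_rng(0)
real_src = rng.normal(size=(2000, 8))    # real samples, source domain
real_tgt = rng.normal(size=(2000, 8))    # real samples, unseen target domain
fake_tgt = rng.laplace(size=(2000, 8))   # heavy-tailed proxy for fake samples

# Statistics are estimated from source-domain real data only.
mu, cov = estimate_real_statistics(real_src)

real_score = gaussianity_score(whiten(real_tgt, mu, cov))
fake_score = gaussianity_score(whiten(fake_tgt, mu, cov))
print(f"real: {real_score:.3f}, fake: {fake_score:.3f}")
```

In this toy setting the whitened target-domain real features stay close to Gaussian while the heavy-tailed proxy does not, which is the kind of gap the abstract describes amplifying. Whether real image features behave this way in practice is the paper's claim, not something this sketch demonstrates.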