Deepfake Detection Generalization with Diffusion Noise
Authors: Hongyuan Qi, Wenjin Hou, Hehe Fan, Jun Xiao
Published: 2026-04-16 03:02:04+00:00
Comment: 17 pages
AI Summary
This paper introduces Attention-guided Noise Learning (ANL), a novel framework designed to enhance deepfake detection generalization, particularly for content generated by diffusion models. ANL integrates a pre-trained diffusion model into the detection pipeline to leverage subtle diffusion noise characteristics, guiding a forensic classifier to learn more robust and globally distributed features through an attention mechanism. Extensive experiments demonstrate that ANL significantly outperforms existing methods, achieving state-of-the-art accuracy and strong generalization to unseen forgery types and generative models without additional inference overhead.
Abstract
Deepfake detectors face growing challenges in generalization as new image synthesis techniques emerge. In particular, deepfakes generated by diffusion models are highly photorealistic and often evade detectors trained on GAN-based forgeries. This paper addresses the generalization problem in deepfake detection by leveraging diffusion noise characteristics. We propose an Attention-guided Noise Learning (ANL) framework that integrates a pre-trained diffusion model into the deepfake detection pipeline to guide the learning of more robust features. Specifically, our method uses the diffusion model's denoising process to expose subtle artifacts: the detector is trained to predict the noise contained in an input image at a given diffusion step, which forces it to capture discrepancies between real and synthetic images. An attention-guided mechanism derived from the predicted noise then encourages the model to focus on globally distributed discrepancies rather than local patterns. Because the frozen diffusion model encodes a learned distribution of natural images, ANL acts as a form of regularization and improves the detector's generalization to unseen forgery types. Extensive experiments demonstrate that ANL significantly outperforms existing methods on multiple benchmarks, achieving state-of-the-art accuracy in detecting diffusion-generated deepfakes. Notably, the proposed framework boosts generalization performance (e.g., improving accuracy (ACC) and average precision (AP) by a substantial margin on unseen models) without introducing additional overhead during inference. Our results highlight that diffusion noise provides a powerful signal for generalizable deepfake detection.
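The two ingredients named in the abstract, predicting the noise in an image at a given diffusion step and deriving an attention map from that predicted noise, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a standard DDPM-style linear beta schedule, a mean-squared-error noise-prediction loss, and a softmax over the absolute predicted noise as the attention map; none of these choices is confirmed by the abstract. All function names are hypothetical, and the "predicted" noise is a stand-in for the output of a real network.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed, DDPM-style)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar_t, rng):
    """Forward diffusion: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    xt = [a * x + b * e for x, e in zip(x0, eps)]
    return xt, eps


def noise_mse(pred_eps, true_eps):
    """Mean squared error between predicted and true noise (assumed training loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred_eps, true_eps)) / len(true_eps)


def noise_attention(pred_eps, temperature=1.0):
    """Softmax over |predicted noise| per pixel: spatial weights that sum to 1 (assumed form)."""
    scores = [abs(e) / temperature for e in pred_eps]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # toy 4x4 image, flattened
alpha_bars = alpha_bar_schedule(1000)
t = 250                                           # an arbitrary diffusion step
xt, eps = add_noise(x0, alpha_bars[t], rng)

pred_eps = [0.9 * e for e in eps]                 # stand-in for a detector's noise prediction
loss = noise_mse(pred_eps, eps)
attn = noise_attention(pred_eps)
```

In this toy setup the attention weights cover every pixel and sum to one, which is one simple way to spread emphasis across the whole image rather than a single local patch. Any such auxiliary branch would be used only during training, consistent with the abstract's statement that inference incurs no additional overhead.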