Adaptive Frequency Learning in Two-branch Face Forgery Detection

Authors: Neng Wang, Yang Bai, Kun Yu, Yong Jiang, Shu-tao Xia, Yan Wang

Published: 2022-03-27 14:25:52+00:00

Comment: Deepfake Detection

AI Summary

This paper introduces Adaptive Frequency Learning in Two-branch Detection (AFD) for enhanced face forgery detection. AFD adaptively learns frequency decomposition through optimized soft masks with heterogeneity constraints and uses an attention module to integrate frequency features with spatial clues. Furthermore, it replaces fixed frequency transforms with learnable, data- and task-dependent layers, leading to improved detection performance.

Abstract

Face forgery has attracted increasing attention in recent applications of computer vision. Existing detection techniques using the two-branch framework benefit a lot from a frequency perspective, yet are restricted by their fixed frequency decomposition and transform. In this paper, we propose to Adaptively learn Frequency information in the two-branch Detection framework, dubbed AFD. To be specific, we automatically learn decomposition in the frequency domain by introducing heterogeneity constraints, and propose an attention-based module to adaptively incorporate frequency features into spatial clues. Then we liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers. Extensive experiments show that AFD generally outperforms.


Key findings

AFD outperforms the baseline and other existing face forgery detection methods in AUC. For instance, with the Xception backbone it raises AUC from 91.75% to 94.24%, a gain of 2.49 percentage points. It also performs consistently well across the manipulation-specific test sets: DeepFakes, Face2Face, FaceSwap, and NeuralTextures.

Approach

The AFD framework removes the reliance on fixed frequency decomposition and fixed transforms in three steps. First, Adaptive Frequency Decomposition (AdaD) learns the decomposition by optimizing soft masks with a triplet loss. Second, an attention-based module (EFF) adaptively fuses the frequency features with spatial cues. Third, Adaptive Frequency Transform (AdaT) replaces the fixed transforms with learnable layers that are fine-tuned.
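
The soft-mask idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1-D orthonormal DCT as the frequency transform, sigmoid-parameterized masks, and the standard triplet margin loss. All function names, the mask logits, and the margin are hypothetical, and how the paper forms its triplets is not stated in this summary.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def soft_mask(logits):
    """Map unconstrained (learnable) logits to mask values in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def decompose(signal, mask_logits):
    """Return one band-limited signal per soft mask (assumed scheme)."""
    coeffs = dct(signal)
    bands = []
    for logits in mask_logits:
        mask = soft_mask(logits)
        bands.append(idct([m * c for m, c in zip(mask, coeffs)]))
    return bands

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on Euclidean distances."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical usage: one low-pass-like and one high-pass-like mask.
signal = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]
low_logits = [4.0, 2.0, 0.0, -2.0, -4.0, -6.0]
high_logits = [-z for z in low_logits]
low_band, high_band = decompose(signal, [low_logits, high_logits])
```

Because sigmoid(z) + sigmoid(-z) = 1, the two masks in this example are complementary, so the two bands sum back to the input. Learned masks need not satisfy this property.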

Datasets

FaceForensics++ (specifically the low-quality C40 version)

Model(s)

The core architecture is a two-branch detection framework using a pre-trained Xception as its backbone. The AFD method integrates adaptive frequency decomposition, an attention-based feature fusion module, and adaptive frequency transform layers.
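
As a rough illustration of attention-based fusion between the two branches, the sketch below uses generic scaled dot-product attention with a residual connection. It is an assumption for illustration only, not the paper's fusion module: spatial features act as queries, frequency features act as keys and values, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(spatial, frequency):
    """Add attention-weighted frequency features to each spatial feature.

    spatial, frequency: lists of feature vectors of equal dimension.
    """
    dim = len(spatial[0])
    fused = []
    for q in spatial:
        # Similarity of this spatial vector to every frequency vector.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in frequency]
        weights = softmax(scores)
        # Weighted average of the frequency vectors.
        context = [sum(w * v[j] for w, v in zip(weights, frequency))
                   for j in range(dim)]
        # Residual connection keeps the original spatial cue.
        fused.append([a + c for a, c in zip(q, context)])
    return fused

# Hypothetical usage with two spatial and three frequency feature vectors.
spatial_feats = [[1.0, 0.0], [0.0, 1.0]]
frequency_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused_feats = fuse(spatial_feats, frequency_feats)
```

The output has the same shape as the spatial input, so in this sketch the fused features could be passed on to the rest of the spatial branch unchanged.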

Author countries

China