Fair and Interpretable Deepfake Detection in Videos

Authors: Akihito Yoshii, Ryosuke Sonoda, Ramya Srinivasan

Published: 2025-10-20 07:50:22+00:00

AI Summary

This paper proposes a fairness-aware deepfake detection framework that addresses existing biases and lack of transparency by integrating temporal feature learning and demographic-aware data augmentation. The method uses sequence-based clustering for temporal modeling and concept extraction for interpretability. It aims to achieve the best tradeoff between accuracy and fairness across different demographic groups.

Abstract

Existing deepfake detection methods often exhibit bias, lack transparency, and fail to capture temporal information, leading to biased decisions and unreliable results across different demographic groups. In this paper, we propose a fairness-aware deepfake detection framework that integrates temporal feature learning and demographic-aware data augmentation to enhance fairness and interpretability. Our method leverages sequence-based clustering for temporal modeling of deepfake videos and concept extraction to improve detection reliability while also facilitating interpretable decisions for non-expert users. Additionally, we introduce a demography-aware data augmentation method that balances underrepresented groups and applies frequency-domain transformations to preserve deepfake artifacts, thereby mitigating bias and improving generalization. Extensive experiments on FaceForensics++, DFD, Celeb-DF, and DFDC datasets using state-of-the-art (SoTA) architectures (Xception, ResNet) demonstrate the efficacy of the proposed method in obtaining the best tradeoff between fairness and accuracy when compared to SoTA.


Key findings

The proposed method consistently outperformed state-of-the-art baselines, achieving the best balance between fairness metrics (FEO, FFPR, FTPR) and detection accuracy (AUC). Using the Xception architecture, the framework achieved the highest AUC scores across all four deepfake detection datasets tested. Ablation studies validated that both the temporal clustering and the frequency-aware data augmentation significantly contribute to improved performance and fairness generalization.
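
This summary does not define the fairness metrics, so the following is only a minimal sketch of one common way a group-rate gap such as FFPR or FTPR can be computed: sum, over demographic groups, the absolute difference between the group's rate and the overall rate. The function name `conditional_rate_gap`, the toy labels, and the formulation itself are assumptions for illustration, not the paper's definitions.

```python
def conditional_rate_gap(y_true, y_pred, groups, given_label):
    """Sum over groups of |group rate - overall rate|, where the rate is the
    share of samples predicted fake (1) among those whose true label equals
    given_label. given_label=0 gives an FPR-style gap, given_label=1 a
    TPR-style gap. Assumed formulation, not taken from the paper."""
    rows = [(p, g) for t, p, g in zip(y_true, y_pred, groups) if t == given_label]
    if not rows:
        raise ValueError("no samples with the requested true label")
    overall = sum(p for p, _ in rows) / len(rows)
    gap = 0.0
    for group in sorted({g for _, g in rows}):
        preds = [p for p, g in rows if g == group]
        gap += abs(sum(preds) / len(preds) - overall)
    return gap


# Toy data: 0 = real, 1 = fake; "a" and "b" are hypothetical demographic groups.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "b", "b", "a", "a", "b", "b"]

fpr_gap = conditional_rate_gap(y_true, y_pred, groups, given_label=0)
tpr_gap = conditional_rate_gap(y_true, y_pred, groups, given_label=1)
print(fpr_gap, tpr_gap)
```

Under this formulation a value of 0 means every group has the same error rate, and larger values indicate a wider disparity between groups.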

Approach

The framework uses sequence-based clustering, incorporating temporal feature differences between frames, coupled with Concept Sensitivity Scores (CSS) to identify demographic biases. It introduces a frequency-aware data augmentation method that blends low-frequency components of bias-aware sampled image pairs while carefully preserving high-frequency deepfake artifacts.
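
The frequency-aware augmentation described above can be illustrated with a rough sketch: take two grayscale images, mix only the low-frequency part of their 2D Fourier spectra, and leave the high-frequency part of the first image untouched. The function name `blend_low_frequencies`, the circular mask, and the `radius` and `alpha` parameters are assumptions for illustration; the paper's actual blending rule and its bias-aware pair sampling are not reproduced here.

```python
import numpy as np


def blend_low_frequencies(target, partner, radius=4, alpha=0.5):
    """Mix the low-frequency spectrum of `partner` into `target`.

    Frequencies within `radius` of the spectrum centre are blended as
    (1 - alpha) * target + alpha * partner; all higher frequencies are
    copied from `target` unchanged. Hypothetical sketch, not the paper's code.
    """
    if target.ndim != 2 or target.shape != partner.shape:
        raise ValueError("expected two 2D arrays of the same shape")
    h, w = target.shape
    # Shift the zero-frequency component to the centre of the spectrum.
    spec_t = np.fft.fftshift(np.fft.fft2(target))
    spec_p = np.fft.fftshift(np.fft.fft2(partner))
    # Circular low-pass mask around the centre.
    yy, xx = np.ogrid[:h, :w]
    low = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    mixed = spec_t.copy()
    mixed[low] = (1.0 - alpha) * spec_t[low] + alpha * spec_p[low]
    # Back to the spatial domain; the imaginary part is numerical noise.
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))


# Random arrays stand in for a sampled image pair.
rng = np.random.default_rng(0)
image_a = rng.random((32, 32))
image_b = rng.random((32, 32))
augmented = blend_low_frequencies(image_a, image_b, radius=4, alpha=1.0)
print(augmented.shape)
```

Because the mask only touches the centre of the spectrum, fine-grained detail in `image_a`, where manipulation artifacts are often assumed to live, passes through unchanged, while coarse appearance is pulled toward `image_b`.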

Datasets

FaceForensics++ (FF++), Deepfake Detection (DFD), Celeb-DF, Deepfake Detection Challenge (DFDC).

Model(s)

Xception, ResNet-34.

Author countries

Japan, USA