Fair and Interpretable Deepfake Detection in Videos

Authors: Akihito Yoshii, Ryosuke Sonoda, Ramya Srinivasan

Published: 2025-10-20 07:50:22+00:00

Comment: 10 pages (including References)

AI Summary

This paper proposes a fairness-aware deepfake detection framework that integrates temporal feature learning and demographic-aware data augmentation to enhance fairness and interpretability in video deepfake detection. The method uses sequence-based clustering for temporal modeling and concept extraction for interpretable decisions, alongside a demography-aware data augmentation method that balances underrepresented groups and preserves deepfake artifacts through frequency-domain transformations. Experiments show the approach achieves a better tradeoff between fairness and accuracy compared to state-of-the-art methods.

Abstract

Existing deepfake detection methods often exhibit bias, lack transparency, and fail to capture temporal information, leading to biased decisions and unreliable results across different demographic groups. In this paper, we propose a fairness-aware deepfake detection framework that integrates temporal feature learning and demographic-aware data augmentation to enhance fairness and interpretability. Our method leverages sequence-based clustering for temporal modeling of deepfake videos and concept extraction to improve detection reliability while also facilitating interpretable decisions for non-expert users. Additionally, we introduce a demography-aware data augmentation method that balances underrepresented groups and applies frequency-domain transformations to preserve deepfake artifacts, thereby mitigating bias and improving generalization. Extensive experiments on FaceForensics++, DFD, Celeb-DF, and DFDC datasets using state-of-the-art (SoTA) architectures (Xception, ResNet) demonstrate the efficacy of the proposed method in obtaining the best tradeoff between fairness and accuracy when compared to SoTA.


Key findings
The proposed method consistently outperforms state-of-the-art baselines across multiple datasets and architectures, achieving the best tradeoff between fairness (lower FEO, FFPR, FTPR) and accuracy (higher AUC). Ablation studies confirm the effectiveness of the proposed clustering with temporal information, bias-aware sampling, and frequency-aware data augmentation in improving both fairness and detection performance.
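As a rough illustration of what a group-fairness gap measures, the sketch below computes the spread in false positive rate and true positive rate across demographic groups. This is not necessarily how the paper defines FFPR or FTPR: the max-minus-min gap, the function name `group_rate_gaps`, and the label convention (1 = fake, 0 = real) are all assumptions made for the example.

```python
import numpy as np

def group_rate_gaps(y_true, y_pred, groups):
    """Return (fpr_gap, tpr_gap): max minus min of per-group FPR and TPR.

    Hypothetical helper for illustration; assumes binary labels and
    predictions with 1 = fake and 0 = real.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)

    fprs, tprs = [], []
    for g in np.unique(groups):
        in_group = groups == g
        negatives = in_group & (y_true == 0)  # real videos in this group
        positives = in_group & (y_true == 1)  # fake videos in this group
        if negatives.any():
            # Fraction of real videos flagged as fake
            fprs.append(y_pred[negatives].mean())
        if positives.any():
            # Fraction of fake videos correctly flagged
            tprs.append(y_pred[positives].mean())

    fpr_gap = max(fprs) - min(fprs) if fprs else 0.0
    tpr_gap = max(tprs) - min(tprs) if tprs else 0.0
    return float(fpr_gap), float(tpr_gap)
```

A gap of zero means every group sees the same error rate under this measure; larger values indicate that some group is misclassified more often than another.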
Approach
The framework leverages sequence-based clustering to model temporal information in deepfake videos and extracts concepts for interpretability. It introduces a demography-aware data augmentation method that balances underrepresented groups and applies frequency-domain transformations, selectively mixing low-frequency components while preserving high-frequency deepfake artifacts to mitigate bias and improve generalization.
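The idea of mixing low-frequency components while leaving high-frequency content untouched can be sketched with a 2D FFT. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `mix_low_frequencies`, the circular low-pass mask, and the `radius` and `alpha` parameters are hypothetical choices, and the inputs are assumed to be single-channel frames of equal size.

```python
import numpy as np

def mix_low_frequencies(target, donor, radius=0.1, alpha=0.5):
    """Blend the low-frequency band of `donor` into `target`.

    Frequencies outside the low-pass mask are taken from `target`
    unchanged, so fine-grained detail in `target` is preserved.
    `radius` is a fraction of the image size; `alpha` is the blend weight.
    """
    target = np.asarray(target, dtype=float)
    donor = np.asarray(donor, dtype=float)
    if target.ndim != 2 or target.shape != donor.shape:
        raise ValueError("expected two 2D arrays of the same shape")

    h, w = target.shape
    # Shift the spectra so the zero frequency sits at the array center
    f_target = np.fft.fftshift(np.fft.fft2(target))
    f_donor = np.fft.fftshift(np.fft.fft2(donor))

    # Circular mask around the center selects the low frequencies
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt(((yy - h // 2) / h) ** 2 + ((xx - w // 2) / w) ** 2)
    low = dist <= radius

    # Blend only inside the mask; high frequencies stay as in `target`
    mixed = f_target.copy()
    mixed[low] = (1 - alpha) * f_target[low] + alpha * f_donor[low]

    # Back to the spatial domain; the imaginary part is numerical noise
    return np.fft.ifft2(np.fft.ifftshift(mixed)).real
```

With `alpha=0` the function returns the target frame unchanged, and with `alpha=1` the coarse appearance inside the mask comes entirely from the donor while the target's high-frequency content is kept.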
Datasets
FaceForensics++, DFD, Celeb-DF, DFDC
Model(s)
Xception, ResNet-34
Author countries
Japan, USA