SpectraNet: FFT-assisted Deep Learning Classifier for Deepfake Face Detection

Authors: Nithira Jayarathne, Naveen Basnayake, Keshawa Jayasundara, Pasindu Dodampegama, Praveen Wijesinghe, Hirushika Pelagewatta, Kavishka Abeywardana, Sandushan Ranaweera, Chamira Edussooriya

Published: 2025-11-24 14:54:00+00:00

AI Summary

This paper proposes a lightweight binary classification model based on fine-tuned EfficientNet-B6 for facial deepfake image detection. The approach utilizes robust preprocessing, oversampling, and optimization strategies to address severe class imbalance and ensure high accuracy and generalization. Although the authors investigated incorporating Fourier Transform features (SpectraNet), the optimized EfficientNet-B6 alone provided the best performance.

Abstract

Detecting deepfake images is crucial in combating misinformation. We present a lightweight, generalizable binary classification model based on EfficientNet-B6, fine-tuned with transformation techniques to address severe class imbalances. By leveraging robust preprocessing, oversampling, and optimization strategies, our model achieves high accuracy, stability, and generalization. While incorporating Fourier transform-based phase and amplitude features showed minimal impact, our proposed framework helps non-experts to effectively identify deepfake images, making significant strides toward accessible and reliable deepfake detection.

Key findings

The optimized EfficientNet-B6 model achieved the best results, with an accuracy of 0.9102 and an AUC of 0.9104. A hybrid model that added Fourier transform phase and amplitude information performed worse (accuracy 0.8981) and took longer to evaluate (3.48 s vs. 2.55 s), suggesting that frequency-domain features have limited utility in this setting.
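To make the Fourier transform phase and amplitude features concrete, the sketch below computes both from a single grayscale image with NumPy. It is only an illustration of the general technique: the function name `fft_amplitude_phase`, the log scaling, the centring step, and the random stand-in image are assumptions, not details taken from the paper.

```python
import numpy as np


def fft_amplitude_phase(image):
    """Return the log-amplitude and phase of a grayscale image's 2-D spectrum.

    Hypothetical helper for illustration; the paper's exact feature
    extraction is not described in this summary.
    """
    # 2-D FFT, shifted so the zero-frequency (DC) term sits at the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))
    # Log scaling compresses the large dynamic range of the amplitude.
    amplitude = np.log1p(np.abs(spectrum))
    # Phase in radians, in the range [-pi, pi].
    phase = np.angle(spectrum)
    return amplitude, phase


# Random array standing in for a 64x64 grayscale face crop (assumption).
rng = np.random.default_rng(0)
face = rng.random((64, 64))
amplitude, phase = fft_amplitude_phase(face)
```

In a hybrid model, maps like `amplitude` and `phase` could be stacked as extra input channels or passed to a separate branch; which of these the authors chose is not stated here.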

Approach

The core approach fine-tunes a pre-trained EfficientNet-B6 model, modifying its final layers for binary classification. The authors use a balanced batching strategy based on image transformation techniques to mitigate the severe class imbalance in the training data. Training uses the Adam optimizer, a ReduceLROnPlateau scheduler, and mixed precision training for efficient optimization.
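One way to picture balanced batching is an index sampler that draws half of every batch from each class, sampling with replacement so the minority class is oversampled. The sketch below uses only the Python standard library; the function name `balanced_batches`, the 50/50 split, the label convention (0 = real, 1 = fake), and the toy label list are assumptions, and the paper's actual procedure may differ.

```python
import random


def balanced_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of sample indices with equal numbers of real and fake items.

    Hypothetical helper: labels use 0 for real and 1 for fake (assumption).
    """
    rng = random.Random(seed)
    real = [i for i, y in enumerate(labels) if y == 0]
    fake = [i for i, y in enumerate(labels) if y == 1]
    half = batch_size // 2
    for _ in range(num_batches):
        # Sampling with replacement lets the smaller class fill its half.
        batch = rng.choices(real, k=half) + rng.choices(fake, k=batch_size - half)
        rng.shuffle(batch)
        yield batch


# Toy labels with a 1:5 real-to-fake imbalance (assumption).
labels = [0] * 10 + [1] * 50
batches = list(balanced_batches(labels, batch_size=8, num_batches=5))
```

Because real images are drawn repeatedly, a data loader would typically apply random transformations (flips, crops, colour jitter) each time an index is loaded, so the repeats are not pixel-identical copies.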

Datasets

A private dataset of 262,160 images (42,690 real, 219,470 fake) was used for training. Related datasets cited in the paper include DeepFakeBench, Celeb-DF, FaceForensics++, and the DFDC Preview Dataset.
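The severity of the imbalance follows directly from the counts above. The short calculation below is a worked check rather than anything from the paper; the variable names are illustrative.

```python
# Image counts as reported for the private training dataset.
real, fake = 42_690, 219_470

total = real + fake          # 262,160 images in all
ratio = fake / real          # about 5.14 fake images per real image
real_share = real / total    # real images are about 16.3% of the data

print(f"total={total}, fake:real={ratio:.2f}, real share={real_share:.1%}")
```

With roughly five fake images for every real one, a classifier that always predicts "fake" would already reach about 84% accuracy, which is why rebalancing matters when reading the reported scores.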

Model(s)

EfficientNet-B6

Author countries

Sri Lanka, Australia
