AI-Powered Deepfake Detection Using CNN and Vision Transformer Architectures
Authors: Sifatullah Sheikh Urmi, Kirtonia Nuzath Tabassum Arthi, Md Al-Imran
Published: 2026-01-03 20:44:50+00:00
AI Summary
This paper investigates the effectiveness of four AI models, three CNNs and one Vision Transformer, for classifying real versus fake face images (deepfake detection). Using preprocessing and augmentation, the study compares the robustness of these architectures on a large dataset. The proposed Vision Fake Detection Network (VFDNET), based on the Vision Transformer, achieved the highest accuracy, demonstrating reliable deepfake detection capabilities.
Abstract
The increasing use of AI-generated deepfakes creates major challenges for maintaining digital authenticity. Four AI-based models, three CNNs and one Vision Transformer, were evaluated on large face image datasets. Data preprocessing and augmentation techniques improved model performance across different scenarios. VFDNET achieved the highest accuracy, while MobileNetV3 showed efficient performance, demonstrating AI's capability for dependable deepfake detection.