DiffFace-Edit: A Diffusion-Based Facial Dataset for Forgery-Semantic Driven Deepfake Detection Analysis

Authors: Feng Ding, Wenhui Yi, Xinan He, Mengyao Xiao, Jianfeng Xu, Jianqiang Du

Published: 2026-01-20 03:21:43+00:00

AI Summary

This paper introduces DiffFace-Edit, a large-scale facial dataset designed to analyze the impact of fine-grained regional manipulations and detector-evasive samples on deepfake detection models. The dataset features over two million AI-generated fake images with edits across eight facial regions, including single and multi-region manipulations, and specifically addresses splice attacks between real and manipulated samples. It aims to fill a gap in existing datasets by providing detailed annotations and challenging samples to facilitate more robust forgery detection and localization.

Abstract

Generative models now produce fine-grained facial manipulations that are imperceptible to the human eye, posing significant privacy risks. However, existing AI-generated face datasets generally lack focus on samples with fine-grained regional manipulations. Furthermore, no researchers have yet studied the real impact of splice attacks, which occur between real and manipulated samples, on detectors. We refer to these as detector-evasive samples. Based on this, we introduce the DiffFace-Edit dataset, which has the following advantages: 1) It contains over two million AI-generated fake images. 2) It features edits across eight facial regions (e.g., eyes, nose) and includes a richer variety of editing combinations, such as single-region and multi-region edits. Additionally, we specifically analyze the impact of detector-evasive samples on detection models. We conduct a comprehensive analysis of the dataset and propose a cross-domain evaluation that incorporates IMDL (Image Manipulation Detection and Localization) methods. The dataset will be available at https://github.com/ywh1093/DiffFace-Edit.
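
The detector-evasive samples described above arise from splicing content between a pristine face and a regionally manipulated one. The paper's actual compositing pipeline is not reproduced here; the snippet below is only a minimal mask-compositing sketch of how such a real/fake splice could be formed, with hypothetical file paths and the assumption that both images and the region mask are aligned and share the same resolution.

    import numpy as np
    from PIL import Image

    def splice_region(real_path, fake_path, mask_path, out_path):
        # Composite one manipulated facial region (where mask == 255) onto an
        # otherwise pristine image; everywhere else the pristine pixels are kept.
        real = np.asarray(Image.open(real_path).convert("RGB"), dtype=np.float32)
        fake = np.asarray(Image.open(fake_path).convert("RGB"), dtype=np.float32)
        mask = np.asarray(Image.open(mask_path).convert("L"), dtype=np.float32) / 255.0
        mask = mask[..., None]                      # broadcast the mask over the RGB channels
        spliced = mask * fake + (1.0 - mask) * real
        Image.fromarray(spliced.astype(np.uint8)).save(out_path)

Under such a construction, most of the image is genuinely pristine, which is what makes the expected detector output ambiguous for these samples.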


Key findings
The study found that existing IMDL methods, despite performing well on 'SC-samples' (samples strictly compliant with IMDL definitions), show significant performance drops on 'SA-samples' (semantically ambiguous, detector-evasive samples). Removal-type manipulations proved particularly challenging for detectors, counterintuitively yielding higher misdetection rates. Furthermore, increasing the diversity and number of tampered regions in training samples degrades the localization accuracy of detection models.
Approach
The authors introduce a new dataset, DiffFace-Edit, which contains over two million images with partial facial edits generated by six different diffusion models. They categorize manipulations into splicing (single/multi-region), removal, and copy-move. They then use this dataset to conduct a comprehensive cross-domain evaluation, analyzing the performance of existing Image Manipulation Detection & Localization (IMDL) methods against various types of forgeries, especially 'detector-evasive' or 'semantically ambiguous' samples.
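
The cross-domain evaluation compares IMDL locators by how well their predicted manipulation masks match the ground-truth region masks within each manipulation category. As a rough, non-authoritative illustration (the paper's exact metrics, thresholds, and data loading are not specified here), per-category pixel-level F1 could be aggregated as follows; the 0.5 binarization threshold and the category labels are assumptions.

    import numpy as np

    def pixel_f1(pred_mask, gt_mask, eps=1e-8):
        # Pixel-level F1 between a binary predicted mask and a binary ground-truth mask.
        pred = np.asarray(pred_mask, dtype=bool).ravel()
        gt = np.asarray(gt_mask, dtype=bool).ravel()
        tp = np.logical_and(pred, gt).sum()
        precision = tp / (pred.sum() + eps)
        recall = tp / (gt.sum() + eps)
        return float(2 * precision * recall / (precision + recall + eps))

    def evaluate_by_category(samples, threshold=0.5):
        # samples: iterable of (category, probability_map, gt_mask) triples, where
        # category is e.g. "splicing", "removal", or "copy-move".
        scores = {}
        for category, prob_map, gt_mask in samples:
            pred_mask = np.asarray(prob_map) >= threshold
            scores.setdefault(category, []).append(pixel_f1(pred_mask, gt_mask))
        return {c: float(np.mean(v)) for c, v in scores.items()}

Breaking the scores out per manipulation category is what surfaces findings such as removal-type forgeries being disproportionately hard to localize.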
Datasets
DiffFace-Edit (newly introduced), CelebAMask-HQ (for pristine data collection). Comparisons are made against CIFAKE, DiffusionForensics-LSUN, DiffusionForensics-General, GenImage, ForgeryNet, DiffusionDeepfake, and DiFF.
Model(s)
Bayar-ResNet, SegFormer-B2, SPAN, PSCC-Net, MVSS-Net, Mesoscopic Insights, MMFusion, IML-ViT (all used as IMDL locators in the benchmark settings).
Author countries
China