Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review

Authors: Enes Altuncu, Virginia N. L. Franqueira, Shujun Li

Published: 2022-08-21 17:31:31+00:00

Comment: 31 pages; study completed by end of July 2021

Journal Ref: Frontiers in Big Data, Sec. Cybersecurity and Privacy 7 (2024) 1-23

AI Summary

This paper offers a comprehensive overview of deepfake technology, drawing from both English and Chinese research literature and resources. It systematically covers different definitions, performance metrics and standards, and deepfake-related datasets, challenges, competitions, and benchmarks. Additionally, the paper presents a meta-review of 12 selected deepfake-related survey papers to analyze key challenges and recommendations within the field.

Abstract

Recent advancements in AI, especially deep learning, have contributed to a significant increase in the creation of new realistic-looking synthetic media (video, image, and audio) and manipulation of existing media, which has led to the creation of the new term "deepfake". Based on both the research literature and resources in English and in Chinese, this paper gives a comprehensive overview of deepfake, covering multiple important aspects of this emerging concept, including 1) different definitions, 2) commonly used performance metrics and standards, and 3) deepfake-related datasets, challenges, competitions and benchmarks. In addition, the paper also reports a meta-review of 12 selected deepfake-related survey papers published in 2020 and 2021, focusing not only on the mentioned aspects, but also on the analysis of key challenges and recommendations. We believe that this paper is the most comprehensive review of deepfake in terms of aspects covered, and the first one covering both the English and Chinese literature and sources.


Key findings
The paper highlights a lack of consensus on how to define deepfakes and suggests 'deep synthesis' as a more inclusive term that covers both malicious and positive applications. It emphasizes the need for standardized subjective quality assessment of deepfake media so that detection methods can be evaluated fairly and reliably. It identifies the development of robust, scalable, generalizable, and explainable detection methods as a primary challenge, and underscores the importance of comprehensive benchmarks on high-quality, diverse datasets for meaningful comparisons.
Approach
The paper conducts a comprehensive literature review of deepfake research from both English and Chinese sources, systematically categorizing and analyzing definitions, performance metrics, standards, datasets, challenges, competitions, and benchmarks. It further performs a meta-review of 12 existing deepfake survey papers to identify common themes, gaps, and future directions. The review focuses on synthesizing existing knowledge rather than proposing a new method.
Datasets
The paper reviews numerous deepfake-related datasets, categorized into image, video, audio/speech, and hybrid types. Prominent examples include FaceForensics++, Celeb-DF v2, DFDC full dataset, Voice Conversion Challenge datasets (2016, 2018, 2020), and ASVspoof datasets (2019, 2021). It also covers datasets like DeeperForensics-1.0, CelebA-Spoof, and ForgeryNet, among many others.
Model(s)
Not applicable (review paper; no new method is proposed)
Author countries
UK