The Deepfakes We Missed: We Built Detectors for a Threat That Didn't Arrive

Authors: Shaina Raza

Published: 2026-05-12 13:02:13+00:00

AI Summary

This position paper argues that deepfake detection research is severely misaligned with real-world harms, having overwhelmingly focused on public-figure video manipulation while actual incidents are dominated by peer-generated non-consensual intimate imagery (NCII) and voice-clone scams. The paper empirically quantifies this misalignment in research effort versus observed harm, diagnoses its structural causes, and proposes three new technical research agendas. It asserts that this misalignment, rather than model capability, is the primary bottleneck for effective deepfake defense.

Abstract

Nearly a decade of Machine Learning (ML) research on deepfake detection has been organized around a threat model inherited from 2017-2019, revolving around face-swap and talking-head manipulation of public figures, motivated by concerns about large-scale misinformation and video-evidence fraud. This position paper argues that the threat the field prepared for did not arrive, and the threats that did arrive are substantially different. An accounting of deepfake incidents in 2022-2026 shows that the dominant observed harms are peer-generated Non-Consensual Intimate Imagery (NCII), voice-clone scam calls targeting families and finance workers, and emotional-manipulation fraud. The predicted large-scale public-figure deepfake catastrophe did not materialize during the 2024 global information environment despite extensive preparation. Meanwhile, research effort, benchmarks, and detection methods remain concentrated on the inherited threat model. The central claim of this paper is that this misalignment is now the dominant bottleneck on real-world deepfake defense, not model capability. We argue the ML research community should substantially rebalance its research agenda toward the harm categories that are actually growing. We support this position with empirical accounting of research effort and harm distribution, identify the structural reasons the misalignment persists, and outline three concrete technical research agendas for the under-defended harm categories.

Key findings

The predicted large-scale public-figure deepfake catastrophe did not materialize, yet 71% of detection papers target that threat model. Conversely, the dominant observed harms, peer-generated NCII and voice-clone scam calls, are severely under-researched: fewer than 1% of papers address them. This misalignment, perpetuated by benchmark inheritance, dataset ethics, and media salience, is the primary bottleneck on real-world deepfake defense and calls for a substantial rebalancing of research agendas.

Approach

The paper conducts an empirical analysis by classifying a corpus of 438 deepfake detection papers (2017-2025) according to a five-category threat taxonomy. It then synthesizes real-world deepfake harm data from various public sources (e.g., FBI, IWF) and documented incidents. By comparing these distributions, the paper identifies a significant misalignment between research focus and actual harm, diagnoses its underlying causes, and proposes three concrete research agendas for under-defended harm categories.
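
The comparison of research effort against observed harm can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the category names, the toy label lists, and the use of total variation distance as a summary score are not taken from the paper, which may measure misalignment differently. The counts are placeholders, not the paper's data.

```python
from collections import Counter


def shares(labels):
    """Turn a list of category labels into per-category fractions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def misalignment(research, harm):
    """Compare two share distributions over the same taxonomy.

    Returns the per-category gap (research share minus harm share) and
    the total variation distance, an illustrative summary score that is
    0 when the distributions match and 1 when they do not overlap.
    """
    categories = set(research) | set(harm)
    gaps = {c: research.get(c, 0.0) - harm.get(c, 0.0) for c in categories}
    tvd = 0.5 * sum(abs(g) for g in gaps.values())
    return gaps, tvd


# Hypothetical toy labels: NOT the paper's corpus or incident data.
paper_labels = ["public_figure_video"] * 7 + ["ncii"] * 1 + ["voice_clone"] * 2
incident_labels = ["public_figure_video"] * 1 + ["ncii"] * 5 + ["voice_clone"] * 4

gaps, tvd = misalignment(shares(paper_labels), shares(incident_labels))

# Positive gap: over-researched relative to harm; negative: under-researched.
for category, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{category:>22}: {gap:+.2f}")
print(f"total variation distance: {tvd:.2f}")
```

A positive gap marks a category that receives more research attention than its share of observed harm, and a negative gap marks an under-defended one. In practice the harm side would have to be assembled from heterogeneous sources (complaint reports, incident databases, surveys), so any single score of this kind is only a rough indicator.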

Datasets

A corpus of 438 deepfake detection papers (2017-2025), FBI Internet Crime Complaint Center (IC3) annual reports, Internet Watch Foundation (IWF) reports, AI Incident Database (AIID), academic victim-prevalence surveys, and documented high-profile incidents.

Model(s)

UNKNOWN

Author countries

Canada
