Replay Attacks Against Audio Deepfake Detection
Authors: Nicolas Müller, Piotr Kawa, Wei-Herng Choong, Adriana Stan, Aditya Tirumala Bukkapatnam, Karla Pizzi, Alexander Wagner, Philip Sperl
Published: 2025-05-20 19:46:36+00:00
Journal Ref: Interspeech 2025
AI Summary
This paper demonstrates how replay attacks undermine audio deepfake detection: playing deepfake audio through a loudspeaker and re-recording it makes spoofed samples appear authentic to detection models. The authors introduce ReplayDF, a dataset of such recordings across diverse acoustic conditions, languages, and TTS models. Their analysis of six open-source detection models shows significant vulnerability: the Equal Error Rate (EER) of the top-performing model, W2V2-AASIST, rises from 4.7% to 18.2% under replay and remains at 11.0% even after adaptive Room Impulse Response (RIR) retraining.
Abstract
We show how replay attacks undermine audio deepfake detection: By playing and re-recording deepfake audio through various speakers and microphones, we make spoofed samples appear authentic to the detection model. To study this phenomenon in more detail, we introduce ReplayDF, a dataset of recordings derived from M-AILABS and MLAAD, featuring 109 speaker-microphone combinations across six languages and four TTS models. It includes diverse acoustic conditions, some highly challenging for detection. Our analysis of six open-source detection models across five datasets reveals significant vulnerability, with the top-performing W2V2-AASIST model's Equal Error Rate (EER) surging from 4.7% to 18.2%. Even with adaptive Room Impulse Response (RIR) retraining, performance remains compromised with an 11.0% EER. We release ReplayDF for non-commercial research use.
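The abstract reports results as Equal Error Rate (EER), the operating point at which the rate of spoofed samples accepted as bona fide equals the rate of bona fide samples rejected. The following is a minimal sketch of how such a value can be computed from detector scores. It is not the authors' evaluation code. The function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the toy scores are assumptions made for illustration only and are not taken from ReplayDF.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER of a detector.

    Assumed convention (not from the paper): higher score = more likely
    bona fide; label 1 = bona fide, label 0 = spoof.
    """
    bona = [s for s, y in zip(scores, labels) if y == 1]
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    if not bona or not spoof:
        raise ValueError("need at least one bona fide and one spoofed sample")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = sorted(set(scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # A sample is accepted as bona fide when its score is >= t.
        far = sum(s >= t for s in spoof) / len(spoof)  # spoof accepted
        frr = sum(s < t for s in bona) / len(bona)     # bona fide rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Where the two rates do not cross exactly, report their mean.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Hypothetical toy scores, invented for illustration only.
labels = [1, 1, 0, 0]
clean_scores = [0.9, 0.8, 0.1, 0.2]     # spoofs score low: well separated
replayed_scores = [0.9, 0.4, 0.6, 0.1]  # one replayed spoof scores higher

print(equal_error_rate(clean_scores, labels))     # 0.0
print(equal_error_rate(replayed_scores, labels))  # 0.5
```

In the toy example, a single spoofed sample whose score moves above a bona fide sample is enough to raise the EER, which mirrors in miniature the kind of degradation the abstract describes for replayed audio. On real score sets, an interpolated ROC-based estimate is usually preferred over this threshold sweep.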