Continual Audio Deepfake Detection via Universal Adversarial Perturbation

Authors: Wangjie Li, Lin Li, Qingyang Hong

Published: 2025-11-25 06:41:11+00:00

AI Summary

The paper proposes a novel framework for continual audio deepfake detection (ADD) to overcome catastrophic forgetting against evolving spoofing attacks. It leverages Universal Adversarial Perturbation (UAP) integrated into the model fine-tuning process, allowing the system to retain knowledge of historical spoofing distributions without storing past training data. This approach aims to provide an efficient and robust solution for continual learning in ADD by utilizing pseudo-spoofed samples and knowledge distillation.

Abstract

The rapid advancement of speech synthesis and voice conversion technologies has raised significant security concerns in multimedia forensics. Although current detection models demonstrate impressive performance, they struggle to maintain effectiveness against constantly evolving deepfake attacks. Additionally, continually fine-tuning these models using historical training data incurs substantial computational and storage costs. To address these limitations, we propose a novel framework that incorporates Universal Adversarial Perturbation (UAP) into audio deepfake detection, enabling models to retain knowledge of historical spoofing distribution without direct access to past data. Our method integrates UAP seamlessly with pre-trained self-supervised audio models during fine-tuning. Extensive experiments validate the effectiveness of our approach, showcasing its potential as an efficient solution for continual learning in audio deepfake detection.


Key findings

The UAP-based continual learning framework effectively preserved historical knowledge, achieving an average relative performance improvement of up to 48% over standard sequential fine-tuning and thereby substantially mitigating catastrophic forgetting. Feature-level UAP clearly outperformed waveform-level UAP in retaining prior knowledge, confirming its effectiveness in maintaining model stability across domains. The approach is thus a robust and efficient way to continually update audio deepfake detectors.
Approach

A Universal Adversarial Perturbation (UAP) vector is generated from the previous stage's model and added to the current stage's bona fide audio features to create pseudo-spoofed samples. These pseudo samples are used, alongside new data, to fine-tune the detection model. Knowledge distillation is applied to both pseudo-spoofed and bona fide features to preserve consistency with the prior domain and stabilize learning; feature-level UAP is preferred over waveform-level UAP.
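The two core operations above — perturbing bona fide features with a single universal vector, and distilling the previous model's posteriors — can be sketched as follows. This is a minimal NumPy illustration under our own assumptions, not the authors' implementation: the function names, the `epsilon` norm bound on the perturbation, and the distillation temperature `T` are illustrative choices not taken from the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def make_pseudo_spoofed(bona_fide_feats, uap, epsilon=0.1):
    """Create pseudo-spoofed samples by adding one universal perturbation
    vector (broadcast across the batch) to bona fide feature vectors.
    `epsilon` caps the L2 norm of the perturbation (illustrative choice)."""
    delta = epsilon * uap / (np.linalg.norm(uap) + 1e-12)
    return bona_fide_feats + delta  # (batch, dim) + (dim,) broadcast

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) over temperature-softened posteriors,
    a standard knowledge-distillation objective."""
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(np.mean(kl))
```

In the full method, the UAP itself would be optimized against the previous stage's model so that the perturbed features mimic its learned spoofing distribution, and the distillation term would be combined with the classification loss on new data and pseudo-spoofed samples during fine-tuning; those steps are omitted here.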
Datasets

ASVspoof 2019 LA, CFAD, ASVspoof 5, ASVspoof 2021 LA, ASVspoof 2021 DF
Model(s)

WavLM (pre-trained self-supervised audio model), Transformer layers, fully connected layer
Author countries

China