Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment

Authors: Pu Huang, Shouguang Wang, Siya Yao, Mengchu Zhou

Published: 2025-09-28 03:48:49+00:00

AI Summary

The paper introduces the Information Bottleneck enhanced Confidence-Aware Adversarial Network (IB-CAAN) for generalizable speech deepfake detection. The method uses confidence-guided adversarial alignment to suppress attack-specific artifacts and an information bottleneck to remove nuisance variability, preserving transferable discriminative features. Experiments show that IB-CAAN consistently outperforms baselines and achieves state-of-the-art performance on many benchmarks under distribution shifts across spoofing methods and variability in speakers, channels, and recording conditions.

Abstract

Neural speech synthesis techniques have enabled highly realistic speech deepfakes, posing major security risks. Speech deepfake detection is challenging due to distribution shifts across spoofing methods and variability in speakers, channels, and recording conditions. We explore learning shared discriminative features as a path to robust detection and propose the Information Bottleneck enhanced Confidence-Aware Adversarial Network (IB-CAAN). Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues, while the information bottleneck removes nuisance variability to preserve transferable features. Experiments on ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild demonstrate that IB-CAAN consistently outperforms the baseline and achieves state-of-the-art performance on many benchmarks.


Key findings
IB-CAAN consistently outperforms empirical risk minimization (ERM) baselines, improving generalization across spoofing detection tasks and datasets including ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild. The model achieves state-of-the-art performance on several benchmarks, notably In-the-Wild and the ASVspoof 5 open condition. Ablation studies confirm that the Information Bottleneck and Confidence-Aware Adversarial Network components are both crucial and complementary for generalization.
Approach
The proposed IB-CAAN framework tackles distribution shifts by integrating a Variational Information Bottleneck (VIB) and a Confidence-Aware Adversarial Network (CAAN). VIB compresses irrelevant input information to mitigate covariate shift, while CAAN uses classifier confidence as an auxiliary adversarial signal to adaptively suppress attack-specific artifacts without erasing discriminative cues, aiming to learn attack-invariant, generalizable features.
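The two ingredients described above can be made concrete with a small numerical sketch. It is not the authors' implementation: the summary does not give the loss formulas, so the code below assumes a standard Gaussian variational bottleneck (reparameterized sample plus a KL penalty against a unit Gaussian) and one hypothetical way of using classifier confidence, namely as a scalar weight on an adversarial attack-classification term. The function names and the `beta` and `lam` coefficients are illustrative assumptions.

```python
import math
import random


def vib_sample_and_kl(mu, log_var, rng):
    """Standard Gaussian VIB step (assumed form, not taken from the paper).

    Returns a reparameterized sample z = mu + sigma * eps and the closed-form
    KL divergence KL(N(mu, sigma^2) || N(0, I)) summed over dimensions.
    """
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    kl = 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))
    return z, kl


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the true label."""
    return -math.log(softmax(logits)[label])


def confidence_weight(cls_logits):
    """Hypothetical confidence signal: max softmax probability of the
    bona fide / spoof classifier. The paper's exact definition may differ."""
    return max(softmax(cls_logits))


def encoder_objective(cls_logits, cls_label, atk_logits, atk_label, kl,
                      beta=1e-3, lam=0.1):
    """Encoder-side objective under the assumptions above.

    The encoder minimizes the detection loss and the KL penalty while
    maximizing the attack-type classifier's loss (hence the minus sign),
    with the adversarial term scaled by classifier confidence.
    """
    task = cross_entropy(cls_logits, cls_label)
    adversarial = cross_entropy(atk_logits, atk_label)
    weight = confidence_weight(cls_logits)
    return task + beta * kl - lam * weight * adversarial


if __name__ == "__main__":
    rng = random.Random(0)
    z, kl = vib_sample_and_kl([0.3, -0.2], [-1.0, -0.5], rng)
    loss = encoder_objective([2.0, -1.0], 0, [0.1, 0.4, -0.3], 1, kl)
    print(len(z), round(kl, 4), round(loss, 4))
```

In a real training loop the attack-type classifier would minimize its own cross-entropy while the encoder receives the negated term, which is commonly implemented with a gradient reversal layer rather than an explicit minus sign.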
Datasets
ASVspoof 2019, ASVspoof 2021, ASVspoof 5, In-the-Wild
Model(s)
IB-CAAN (with backbones: RawBMamba, XLSR+Linear, XLSR+MLP)
Author countries
China, USA