Vulnerabilities of Audio-Based Biometric Authentication Systems Against Deepfake Speech Synthesis
Authors: Mengze Hong, Di Jiang, Zeying Xie, Weiwei Zhao, Guan Wang, Chen Jason Zhang
Published: 2026-01-06 10:55:32+00:00
AI Summary
This paper systematically evaluates the vulnerability of state-of-the-art audio biometric authentication systems against contemporary deepfake speech synthesis models. It shows that commercial speaker verification systems are easily bypassed by voice clones trained on minimal data, and that anti-spoofing detectors suffer a performance drop of up to 30x when encountering synthesis methods unseen during training, exposing a critical failure in generalization. These findings urge a move towards architectural innovations and multi-factor authentication for robust security.
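To make the first finding concrete, the minimal sketch below illustrates how threshold-based speaker verification over embedding cosine similarity can accept a cloned voice whose embedding lands near the enrolled voiceprint. This is not the paper's evaluation pipeline; the embeddings, the 192-dimensional size, and the 0.7 threshold are illustrative assumptions.

```python
# A minimal sketch (not the paper's pipeline) of cosine-similarity
# speaker verification and why a close voice clone can pass it.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(enrolled_emb: np.ndarray, probe_emb: np.ndarray, threshold: float = 0.7) -> bool:
    """Accept the probe if its similarity to the enrolled voiceprint exceeds the threshold.

    The threshold is illustrative; real systems calibrate it on held-out trials.
    A high-quality clone yields an embedding close to the genuine speaker's,
    so it can clear the same threshold as the genuine voice.
    """
    return cosine_similarity(enrolled_emb, probe_emb) >= threshold

# Toy demonstration with random 192-dim embeddings (a common embedding size).
rng = np.random.default_rng(0)
enrolled = rng.normal(size=192)
clone = enrolled + rng.normal(scale=0.1, size=192)   # clone lands near the enrolled voiceprint
impostor = rng.normal(size=192)                      # unrelated speaker

print(verify(enrolled, clone))      # likely True  -> spoofed voice accepted
print(verify(enrolled, impostor))   # likely False -> unrelated speaker rejected
```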
Abstract
As audio deepfakes transition from research artifacts to widely available commercial tools, robust biometric authentication faces pressing security threats in high-stakes industries. This paper presents a systematic empirical evaluation of state-of-the-art speaker authentication systems based on a large-scale speech synthesis dataset, revealing two major security vulnerabilities: 1) modern voice cloning models trained on very small amounts of speaker data can easily bypass commercial speaker verification systems; and 2) anti-spoofing detectors struggle to generalize across different audio synthesis methods, leading to a significant gap between in-domain performance and real-world robustness. These findings call for a reconsideration of current security measures and stress the need for architectural innovations, adaptive defenses, and a transition towards multi-factor authentication.
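The second finding concerns the gap between in-domain and cross-method anti-spoofing performance. The sketch below shows one common way such a gap could be quantified, via equal error rate (EER) on seen versus unseen synthesis methods; the score distributions and method split are synthetic placeholders, not the paper's data or results.

```python
# A hedged sketch of measuring an anti-spoofing generalization gap with EER;
# all scores here are synthetic placeholders, not results from the paper.
import numpy as np

def equal_error_rate(bonafide_scores: np.ndarray, spoof_scores: np.ndarray) -> float:
    """Approximate EER: the point where false-accept and false-reject rates cross.

    Higher detector scores are assumed to mean 'more likely bona fide'.
    """
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    best = 1.0
    for t in thresholds:
        frr = np.mean(bonafide_scores < t)   # genuine speech rejected
        far = np.mean(spoof_scores >= t)     # synthetic speech accepted
        best = min(best, max(frr, far))      # value near the FRR/FAR crossover
    return float(best)

rng = np.random.default_rng(1)
bonafide = rng.normal(2.0, 1.0, 1000)

# In-domain spoofs: a synthesis method the detector saw during training.
seen_spoofs = rng.normal(-2.0, 1.0, 1000)
# Out-of-domain spoofs: an unseen synthesis method whose scores drift toward bona fide.
unseen_spoofs = rng.normal(1.5, 1.0, 1000)

print(f"in-domain EER:     {equal_error_rate(bonafide, seen_spoofs):.3f}")
print(f"out-of-domain EER: {equal_error_rate(bonafide, unseen_spoofs):.3f}")
```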