Physics-Guided Deepfake Detection for Voice Authentication Systems
Authors: Alireza Mohammadi, Keshav Sood, Dhananjay Thiruvady, Asef Nazari
Published: 2025-12-04 23:37:18+00:00
AI Summary
The paper presents a framework to address the dual threats of sophisticated deepfake attacks and control-plane poisoning in networked voice authentication systems. This system fuses interpretable physics features modeling vocal tract dynamics with representations from a self-supervised learning module via a Multi-Modal Ensemble Architecture. The framework utilizes a Bayesian ensemble to provide uncertainty estimates, enhancing robustness against both advanced synthesis attacks and malicious updates in federated edge learning protocols.
Abstract
Voice authentication systems deployed at the network edge face dual threats: (a) sophisticated deepfake synthesis attacks and (b) control-plane poisoning in distributed federated learning protocols. We present a framework coupling physics-guided deepfake detection with uncertainty-aware edge learning. The framework fuses interpretable physics features modeling vocal tract dynamics with representations from a self-supervised learning module. The representations are then processed by a Multi-Modal Ensemble Architecture, followed by a Bayesian ensemble that provides uncertainty estimates. Incorporating physics-based evaluation of audio characteristics and uncertainty estimates for audio samples allows the proposed framework to remain robust to both advanced deepfake attacks and sophisticated control-plane poisoning, addressing the complete threat model for networked voice authentication.
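To make the idea of ensemble-based uncertainty concrete, the following is a minimal sketch and not the authors' implementation. It assumes that fusion is plain concatenation of a physics feature vector with a self-supervised embedding, and that each ensemble member is a simple logistic scorer; both are illustrative stand-ins, and every function name here is hypothetical. It shows one common way an ensemble can yield an uncertainty estimate: the entropy of the averaged prediction, and the part of that entropy caused by disagreement among members.

```python
import math
from typing import List, Sequence, Tuple

# A member is (weights, bias) of a logistic scorer. This is an illustrative
# stand-in, NOT the architecture described in the paper.
Member = Tuple[Sequence[float], float]


def fuse_features(physics: Sequence[float], ssl: Sequence[float]) -> List[float]:
    """Fuse by concatenation (an assumption; the abstract does not specify how)."""
    return list(physics) + list(ssl)


def member_prob(member: Member, features: Sequence[float]) -> float:
    """Probability of 'spoof' from one hypothetical ensemble member."""
    weights, bias = member
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli(p) prediction, clipped for stability."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ensemble_predict(
    members: Sequence[Member], features: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (mean probability, total uncertainty, disagreement).

    total        = entropy of the averaged prediction
    disagreement = total minus the mean per-member entropy, i.e. the share
                   of uncertainty that comes from members disagreeing
    """
    probs = [member_prob(m, features) for m in members]
    mean_p = sum(probs) / len(probs)
    total = binary_entropy(mean_p)
    expected = sum(binary_entropy(p) for p in probs) / len(probs)
    return mean_p, total, total - expected


if __name__ == "__main__":
    x = fuse_features([0.5, -1.0], [2.0])  # toy feature values
    members = [([4.0, 0.0, 0.0], 0.0), ([-4.0, 0.0, 0.0], 0.0)]
    print(ensemble_predict(members, x))
```

In a sketch like this, a high disagreement score could be used to flag a sample, or a model update, for closer inspection rather than accepting it outright; how the paper's framework actually acts on its uncertainty estimates is not stated in the abstract.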