Unmasking Puppeteers: Leveraging Biometric Leakage to Disarm Impersonation in AI-based Videoconferencing
Authors: Danial Samadi Vahdati, Tai Duc Nguyen, Ekta Prashnani, Koki Nagano, David Luebke, Orazio Gallo, Matthew Stamm
Published: 2025-10-03 22:37:03+00:00
AI Summary
AI-based videoconferencing systems are highly vulnerable to puppeteering attacks, in which an attacker hijacks a victim's identity by manipulating the transmitted pose-expression latent embedding. This paper introduces the first biometric leakage defense that operates entirely in the latent domain, exploiting identity cues inadvertently contained in these embeddings. Using a pose-conditioned, large-margin contrastive encoder, the method isolates persistent identity cues from transient pose and expression, enabling real-time detection of illicit identity swaps.
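To make the core idea concrete, here is a minimal sketch of what a pose-conditioned, large-margin contrastive encoder could look like. The architecture, dimensions, margin value, and conditioning scheme (simple concatenation of a pose code) are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch: a pose-conditioned encoder plus a margin-based
# contrastive loss. All names and dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseConditionedEncoder(nn.Module):
    """Maps a transmitted pose-expression latent to an identity embedding,
    conditioned on a pose code so pose-driven variation can be cancelled."""
    def __init__(self, latent_dim=256, pose_dim=6, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + pose_dim, 512),
            nn.ReLU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, latent, pose):
        # Feeding the pose code lets the network explain away transient
        # pose variation, leaving persistent identity cues in the output.
        z = self.net(torch.cat([latent, pose], dim=-1))
        return F.normalize(z, dim=-1)  # unit-norm identity embeddings

def large_margin_contrastive_loss(z_a, z_b, same_identity, margin=0.5):
    """Classic margin-based contrastive loss on cosine distance:
    pull same-identity pairs together, push different-identity pairs
    at least `margin` apart. `same_identity` is a float 0/1 tensor."""
    cos_dist = 1.0 - (z_a * z_b).sum(dim=-1)                     # in [0, 2]
    pos = same_identity * cos_dist.pow(2)
    neg = (1.0 - same_identity) * F.relu(margin - cos_dist).pow(2)
    return (pos + neg).mean()
```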
Abstract
AI-based talking-head videoconferencing systems reduce bandwidth by transmitting a compact pose-expression latent and re-synthesizing RGB video at the receiver, but this latent can be puppeteered, letting an attacker hijack a victim's likeness in real time. Because every frame is synthetic, deepfake and synthetic-video detectors fail outright. To address this security problem, we exploit a key observation: the pose-expression latent inherently contains biometric information about the driving identity. We therefore introduce the first biometric leakage defense that never examines the reconstructed RGB video: a pose-conditioned, large-margin contrastive encoder isolates persistent identity cues inside the transmitted latent while cancelling transient pose and expression. A simple cosine test on this disentangled embedding flags illicit identity swaps as the video is rendered. Our experiments on multiple talking-head generation models show that our method consistently outperforms existing puppeteering defenses, operates in real time, and generalizes strongly to out-of-distribution scenarios.
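As a concrete illustration of the cosine test described in the abstract, the sketch below compares per-frame identity embeddings against an enrolled reference embedding for the claimed identity. The function name and threshold value are hypothetical placeholders, not a reported operating point:

```python
# Minimal sketch of the receiver-side cosine test, assuming an enrolled
# reference embedding exists for the claimed identity.
import torch
import torch.nn.functional as F

def flag_identity_swap(frame_embeddings, reference_embedding, threshold=0.6):
    """Return a per-frame boolean mask: True where the disentangled
    identity embedding drifts too far from the enrolled reference."""
    sims = F.cosine_similarity(
        frame_embeddings, reference_embedding.unsqueeze(0), dim=-1
    )
    return sims < threshold  # True = suspected puppeteering

# Usage (shapes only, values are stand-ins):
# embs  = encoder(latents, poses)             # (T, 128) per-frame embeddings
# alarm = flag_identity_swap(embs, ref_emb)   # (T,) True where swap suspected
```

Because the test runs on the compact latent rather than rendered frames, it adds negligible cost per frame, which is consistent with the real-time operation the abstract claims.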