Uncovering and Mitigating Destructive Multi-Embedding Attacks in Deepfake Proactive Forensics

Authors: Lixin Jia, Haiyang Sun, Zhiqing Guo, Yunfeng Diao, Dan Ma, Gaobo Yang

Published: 2025-08-24 07:57:32+00:00

AI Summary

This paper formally defines and demonstrates Multi-Embedding Attacks (MEA) on deepfake proactive forensics: when additional watermarks are embedded into an already-protected image, the original forensic watermark can be destroyed or overwritten. To counter this, the authors propose Adversarial Interference Simulation (AIS), a training paradigm that simulates MEA scenarios during fine-tuning to improve watermark robustness.
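To make the attack concrete, here is a minimal sketch of an MEA in PyTorch. The `encoder`, `decoder`, and tensor shapes are hypothetical stand-ins for any watermark-based proactive-forensics model, not the paper's implementation; a logits-output decoder is assumed.

```python
# Minimal sketch of a Multi-Embedding Attack (MEA). `encoder` and `decoder`
# are hypothetical stand-ins for any watermark-based proactive-forensics
# model; the decoder is assumed to output per-bit logits.
import torch

def multi_embedding_attack(encoder, decoder, image, wm_original, wm_attacker):
    """Embed a second watermark on top of an already-protected image,
    then check whether the original watermark survives extraction."""
    protected = encoder(image, wm_original)      # legitimate first embedding
    attacked = encoder(protected, wm_attacker)   # adversarial re-embedding
    bits = (torch.sigmoid(decoder(attacked)) > 0.5).float()  # recovered bits
    ber = (bits != wm_original).float().mean()   # bit error rate vs. original
    return ber.item()  # ~0.5 for undefended models, per the paper's findings
```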

Abstract

With the rapid evolution of deepfake technologies and the wide dissemination of digital media, personal privacy is facing increasingly serious security threats. Deepfake proactive forensics, which involves embedding imperceptible watermarks to enable reliable source tracking, serves as a crucial defense against these threats. Although existing methods show strong forensic ability, they rely on an idealized assumption of single watermark embedding, which proves impractical in real-world scenarios. In this paper, we formally define and demonstrate the existence of Multi-Embedding Attacks (MEA) for the first time. When a previously protected image undergoes additional rounds of watermark embedding, the original forensic watermark can be destroyed or removed, rendering the entire proactive forensic mechanism ineffective. To address this vulnerability, we propose a general training paradigm named Adversarial Interference Simulation (AIS). Rather than modifying the network architecture, AIS explicitly simulates MEA scenarios during fine-tuning and introduces a resilience-driven loss function to enforce the learning of sparse and stable watermark representations. Our method enables the model to maintain the ability to extract the original watermark correctly even after a second embedding. Extensive experiments demonstrate that our plug-and-play AIS training paradigm significantly enhances the robustness of various existing methods against MEA.


Key findings
Experiments demonstrate that existing deepfake proactive forensics methods are highly vulnerable to MEA: after a second embedding, bit error rates approach 50%, meaning extraction degrades to random guessing. The proposed AIS significantly enhances the robustness of these methods against MEA, maintaining low bit error rates even after multiple embeddings, and does so without sacrificing visual quality.
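For reference, the bit error rate (BER) is the fraction of watermark bits flipped between embedding and extraction. The toy example below (illustrative values, not from the paper) shows why a BER near 0.5 means the watermark is effectively destroyed.

```python
# Illustrative bit-error-rate (BER) computation for a k-bit watermark;
# a BER near 0.5 means extraction is no better than random guessing.
import numpy as np

embedded = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # bits written by the encoder
extracted = np.array([1, 1, 0, 1, 0, 1, 1, 1])  # bits recovered after an MEA
ber = np.mean(embedded != extracted)            # fraction of flipped bits
print(ber)  # 0.5 -> the watermark carries no recoverable information
```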
Approach
The authors propose Adversarial Interference Simulation (AIS), a plug-and-play training paradigm that explicitly simulates multi-embedding attacks during fine-tuning rather than modifying the network architecture. A resilience-driven loss function encourages the model to learn sparse and stable watermark representations, so the original watermark remains extractable even after it has been overwritten by a second embedding. A minimal sketch of one such training step follows.
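The summary does not give the exact loss formulation, so the code below is a sketch of one AIS fine-tuning step under stated assumptions: a generic encoder/decoder watermarking model, a random attacker watermark to simulate the second embedding, and illustrative sparsity/stability penalty terms standing in for the paper's resilience-driven loss.

```python
# Minimal sketch of one AIS fine-tuning step. The encoder/decoder pair is
# generic; the sparsity and stability terms below are illustrative
# assumptions, not the paper's published loss formulation.
import torch
import torch.nn.functional as F

def ais_step(encoder, decoder, image, wm, optimizer,
             lambda_sparse=0.01, lambda_stable=1.0):
    optimizer.zero_grad()
    protected = encoder(image, wm)                 # first (legitimate) embedding
    wm_rand = torch.randint_like(wm, 2).float()    # simulated attacker watermark
    attacked = encoder(protected, wm_rand)         # simulated MEA: re-embedding
    logits = decoder(attacked)                     # try to recover ORIGINAL bits
    # Message loss: the original watermark must survive the second embedding.
    loss_msg = F.binary_cross_entropy_with_logits(logits, wm)
    # Sparsity term (assumed): keep the embedded residual small and sparse.
    loss_sparse = (protected - image).abs().mean()
    # Stability term (assumed): decoding should agree before and after attack.
    loss_stable = F.mse_loss(torch.sigmoid(decoder(protected)),
                             torch.sigmoid(logits))
    loss = loss_msg + lambda_sparse * loss_sparse + lambda_stable * loss_stable
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because AIS only changes the training procedure and loss, a step like this can in principle wrap any of the baseline encoder-decoder watermarking models listed below, which is what makes the paradigm plug-and-play.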
Datasets
CelebA-HQ
Model(s)
SepMark, LampMark, WaveGuard, EditGuard, MBRS, HiDDeN (various encoder-decoder architectures used as baselines)
Author countries
China