Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics
Authors: Xiaoshuai Wu, Xin Liao, Bo Ou, Yuling Liu, Zheng Qin
Published: 2024-04-27 11:20:49+00:00
Comment: Accepted by IJCAI 2024
AI Summary
This paper introduces AdvMark, an adversarial watermarking technique designed to keep existing robust watermarks from degrading Deepfake detector performance. AdvMark fine-tunes robust watermarking models to embed adversarial watermarks that enhance the detectability of watermarked images by passive Deepfake detectors, while still allowing the watermark to be extracted for provenance tracking. This plug-and-play procedure improves detection accuracy without requiring modifications to deployed Deepfake detectors.
Abstract
AI-generated content has accelerated progress in media synthesis, particularly Deepfake, which can manipulate our portraits for positive or malicious purposes. Before such threatening face images are released, one promising forensics solution is to inject robust watermarks that track their provenance. However, we argue that current watermarking models, originally devised for genuine images, may harm deployed Deepfake detectors when applied directly to forged images, since the watermarks are prone to overlap with the forgery signals used for detection. To bridge this gap, we propose AdvMark, on behalf of proactive forensics, to exploit the adversarial vulnerability of passive detectors for good. Specifically, AdvMark serves as a plug-and-play procedure for fine-tuning any robust watermarking model into an adversarial watermarking model, enhancing the forensic detectability of watermarked images while the watermarks can still be extracted for provenance tracking. Extensive experiments demonstrate the effectiveness of AdvMark: by leveraging robust watermarking to fool Deepfake detectors in a beneficial way, it helps improve the accuracy of downstream Deepfake detection without tuning the in-the-wild detectors. We believe this work will shed some light on harmless proactive forensics against Deepfake.
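The abstract does not state the training objective, so the following is only a minimal sketch of what a fine-tuning loss for adversarial watermarking could look like, assuming three weighted terms: image fidelity, message recovery, and agreement of a frozen detector with the true real/fake label. The function names (`mse`, `bce`, `finetune_loss`), the weights, and the choice of terms are all hypothetical illustrations, not the authors' formulation.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped for stability."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def finetune_loss(cover, watermarked, message, decoded,
                  detector_fake_prob, is_fake,
                  w_image=1.0, w_message=1.0, w_detect=1.0):
    """Hypothetical combined objective (an assumption, not the paper's loss).

    cover, watermarked : pixel values before and after embedding
    message, decoded   : embedded bits and the decoder's bit probabilities
    detector_fake_prob : frozen detector's P(fake) on the watermarked image
    is_fake            : ground-truth label of the image
    """
    image_term = mse(cover, watermarked)          # keep the watermark invisible
    message_term = bce(decoded, message)          # keep the watermark extractable
    label = 1.0 if is_fake else 0.0
    detect_term = bce([detector_fake_prob], [label])  # push detector to the true label
    return w_image * image_term + w_message * message_term + w_detect * detect_term


cover = [0.2, 0.4, 0.6]
marked = [0.21, 0.39, 0.6]
bits = [1, 0, 1]
decoded = [0.9, 0.1, 0.8]

# For a forged image, a watermark that leaves the detector confident in "fake"
# yields a lower loss than one that pushes the detector toward "real".
helpful = finetune_loss(cover, marked, bits, decoded, 0.9, True)
harmful = finetune_loss(cover, marked, bits, decoded, 0.1, True)
```

In this sketch only the watermark encoder and decoder would be updated; the detector is treated as fixed, which mirrors the abstract's claim that in-the-wild detectors need no tuning.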