DeMark: A Query-Free Black-Box Attack on Deepfake Watermarking Defenses

Authors: Wei Song, Zhenchang Xing, Liming Zhu, Yulei Sui, Jingling Xue

Published: 2026-01-23 06:04:43+00:00

AI Summary

This paper introduces DeMark, a query-free black-box attack framework designed to undermine defensive image watermarking schemes used for deepfake detection. DeMark exploits latent-space vulnerabilities in encoder-decoder watermarking models through a compressive sensing-based sparsification process. It significantly reduces watermark detection accuracy while preserving perceptual and structural realism, outperforming existing attacks and highlighting the fragility of current deepfake watermarking defenses.

Abstract

The rapid proliferation of realistic deepfakes has raised urgent concerns over their misuse, motivating the use of defensive watermarks in synthetic images for reliable detection and provenance tracking. However, this defense paradigm assumes such watermarks are inherently resistant to removal. We challenge this assumption with DeMark, a query-free black-box attack framework that targets defensive image watermarking schemes for deepfakes. DeMark exploits latent-space vulnerabilities in encoder-decoder watermarking models through a compressive-sensing-based sparsification process, suppressing watermark signals while preserving perceptual and structural realism appropriate for deepfakes. Across eight state-of-the-art watermarking schemes, DeMark reduces watermark detection accuracy from 100% to 32.9% on average while maintaining natural visual quality, outperforming existing attacks. We further evaluate three defense strategies (image super-resolution, sparse watermarking, and adversarial training) and find them largely ineffective. These results demonstrate that current encoder-decoder watermarking schemes remain vulnerable to latent-space manipulations, underscoring the need for more robust watermarking methods to safeguard against deepfakes.


Key findings
DeMark reduced watermark detection accuracy from 100% to 32.9% on average across eight state-of-the-art watermarking schemes, outperforming prior attacks while maintaining high visual quality. It was also more computationally efficient than other deep learning-based attacks. The three tested mitigation strategies (image super-resolution, sparse watermarking, and adversarial training) proved largely ineffective, underscoring the persistent vulnerability of current encoder-decoder watermarking schemes to latent-space manipulations.
Approach
DeMark uses a CNN-based sparse encoder (TCNN) to transform a watermarked image into a sparse latent representation, enforcing sparsity to suppress watermark signals via a dispersal effect. A CNN-based image reconstruction module (RCNN) then recovers the image from this sparse latent representation. The process is optimized with a combination of a sparse encoding loss and a structural-perceptual loss, so that embedded watermarks are disrupted while visual fidelity stays high.
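The idea of combining a sparsity penalty with a fidelity term can be illustrated with a toy sketch. This is not the paper's TCNN or RCNN: the function names, the use of soft-thresholding to enforce sparsity, the L1 penalty, the mean squared error standing in for the structural-perceptual loss, and the `sparsity_weight` parameter are all assumptions made for illustration only.

```python
from typing import List


def soft_threshold(latent: List[float], lam: float) -> List[float]:
    """Shrink each coefficient toward zero by lam; small ones become exactly 0.

    This is the proximal operator of lam * ||z||_1, a standard sparsification
    step in compressive sensing (assumed here, not taken from the paper).
    """
    out = []
    for v in latent:
        if abs(v) > lam:
            out.append((abs(v) - lam) * (1.0 if v > 0 else -1.0))
        else:
            out.append(0.0)
    return out


def l1_norm(latent: List[float]) -> float:
    """Sum of absolute values; a common surrogate for a sparsity loss."""
    return sum(abs(v) for v in latent)


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error; a hypothetical stand-in for a structural-perceptual loss."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(latent: List[float], original: List[float],
                  reconstructed: List[float], sparsity_weight: float = 0.1) -> float:
    """Fidelity term plus weighted sparsity term (weighting scheme is assumed)."""
    return mse(original, reconstructed) + sparsity_weight * l1_norm(latent)


if __name__ == "__main__":
    latent = [0.9, -0.05, 0.02, -0.7, 0.01]  # made-up latent coefficients
    sparse = soft_threshold(latent, lam=0.1)
    print("sparse latent:", sparse)
    print("L1 before/after:", l1_norm(latent), l1_norm(sparse))
```

In this toy setting, the low-magnitude coefficients are zeroed while the dominant ones are only slightly shrunk, which mirrors the intuition that a weak embedded signal is suppressed while the main image content survives reconstruction.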
Datasets
OpenImage, COCO
Model(s)
UNKNOWN
Author countries
Australia