High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking

Authors: Peipeng Yu, Jinfeng Xie, Chengfu Ou, Xiaoyu Zhou, Jianwei Fei, Yunshu Dai, Zhihua Xia, Chip Hong Chang

Published: 2026-03-25 05:02:48+00:00

AI Summary

VeriFi is a novel versatile watermarking framework designed to combat AIGC-driven deepfakes by unifying robust copyright protection, pixel-level manipulation localization, and high-fidelity face content recovery. It addresses the fidelity-functionality trade-off by embedding a compact semantic latent watermark for faithful restoration and achieving localization without explicit payloads. The framework's robustness is further enhanced by an AIGC attack simulator modeling realistic deepfake pipelines.

Abstract

The proliferation of AIGC-driven face manipulation and deepfakes poses severe threats to media provenance, integrity, and copyright protection. Prior versatile watermarking systems typically rely on embedding explicit localization payloads, which introduces a fidelity-functionality trade-off: larger localization signals degrade visual quality and often reduce decoding robustness under strong generative edits. Moreover, existing methods rarely support content recovery, limiting their forensic value when original evidence must be reconstructed. To address these challenges, we present VeriFi, a versatile watermarking framework that unifies copyright protection, pixel-level manipulation localization, and high-fidelity face content recovery. VeriFi makes three key contributions: (1) it embeds a compact semantic latent watermark that serves as a content-preserving prior, enabling faithful restoration even after severe manipulations; (2) it achieves fine-grained localization without embedding localization-specific artifacts by correlating image features with decoded provenance signals; and (3) it introduces an AIGC attack simulator that combines latent-space mixing with seamless blending to improve robustness to realistic deepfake pipelines. Extensive experiments on CelebA-HQ and FFHQ show that VeriFi consistently outperforms strong baselines in watermark robustness, localization accuracy, and recovery quality, providing a practical and verifiable defense for deepfake forensics.


Key findings
Extensive experiments on CelebA-HQ and FFHQ demonstrate VeriFi's superior performance across all metrics. It achieves state-of-the-art tamper localization accuracy (F1 scores up to 0.989), high-fidelity content recovery (best PSNR/SSIM across diverse tampering types), and robust watermark extraction (97.70% average accuracy) while maintaining visual imperceptibility. The AIGC attack simulator and watermark-guided mechanisms were found crucial for enhancing robustness and accuracy.
Approach
VeriFi embeds a compact semantic latent watermark and an ownership code into images. Manipulation localization is achieved by a watermark-guided network that detects spatial inconsistencies in the decoded watermark signal, eliminating the need for a separate localization payload. For high-fidelity face recovery, a dual-stream Transformer fuses the manipulated image with a VAE-decoded content proxy from the semantic watermark and the predicted manipulation map, selectively restoring only the tampered regions. Robustness is further improved by an AIGC Attack Simulator that emulates deepfake perturbations through latent mixing and Poisson blending during training.
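The two training-time operations of the attack simulator, and the mask-guided selective restoration, can be sketched as array operations. This is a minimal illustration, not the paper's implementation: the function names and the feathered alpha blend (a simplified stand-in for true Poisson/seamless blending) are assumptions, and the latent mixing coefficient `alpha` is a hypothetical parameter.

```python
import numpy as np

def simulate_aigc_attack(wm_latent, donor_latent, image, fake_region, mask, alpha=0.5):
    """Sketch of the AIGC attack simulator used during training.

    wm_latent / donor_latent: latent codes of the watermarked and donor faces.
    image / fake_region: H x W x C arrays in [0, 1].
    mask: H x W x 1 soft blending mask in [0, 1].
    """
    # Latent-space mixing: convex combination of the two latent codes,
    # emulating identity/attribute mixing performed by generative editors.
    mixed_latent = alpha * wm_latent + (1.0 - alpha) * donor_latent
    # Blending: composite the fake region into the host image. The paper
    # uses Poisson (seamless) blending; a soft mask blend approximates it here.
    tampered = mask * fake_region + (1.0 - mask) * image
    return mixed_latent, tampered

def selective_restore(tampered, content_proxy, manip_map):
    """Mask-guided recovery: keep untampered pixels, replace tampered ones
    with the VAE-decoded content proxy recovered from the semantic watermark."""
    return manip_map * content_proxy + (1.0 - manip_map) * tampered
```

In VeriFi itself the fusion is performed by a dual-stream Transformer rather than a hard mask composite; the sketch only conveys the selective-restoration principle.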
Datasets
CelebA-HQ, FFHQ
Model(s)
Variational Autoencoder (VAE), Swin-Unet, Transformer (dual-stream Transformer-based reconstructor)
Author countries
China, Singapore