Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification

Authors: Wenkui Yang, Jie Cao, Junxian Duan, Ran He

Published: 2025-09-17 11:30:13+00:00

AI Summary

This paper formalizes the anti-purification task for diffusion models and proposes AntiPure, a protective perturbation method designed to resist diffusion-based purification. AntiPure combines patch-wise frequency guidance and erroneous timestep guidance to embed imperceptible perturbations that persist even after purification, effectively hindering malicious customization.

Abstract

Diffusion models like Stable Diffusion have become prominent in visual synthesis tasks due to their powerful customization capabilities, which also introduce significant security risks, including deepfakes and copyright infringement. In response, a class of methods known as protective perturbation emerged, which mitigates image misuse by injecting imperceptible adversarial noise. However, purification can remove protective perturbations, thereby exposing images again to the risk of malicious forgery. In this work, we formalize the anti-purification task, highlighting challenges that hinder existing approaches, and propose a simple diagnostic protective perturbation named AntiPure. AntiPure exposes vulnerabilities of purification within the purification-customization workflow, owing to two guidance mechanisms: 1) Patch-wise Frequency Guidance, which reduces the model's influence over high-frequency components in the purified image, and 2) Erroneous Timestep Guidance, which disrupts the model's denoising strategy across different timesteps. With additional guidance, AntiPure embeds imperceptible perturbations that persist under representative purification settings, achieving effective post-customization distortion. Experiments show that, as a stress test for purification, AntiPure achieves minimal perceptual discrepancy and maximal distortion, outperforming other protective perturbation methods within the purification-customization workflow.
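To make the first guidance mechanism concrete, the sketch below measures patch-wise high-frequency energy with a 2-D FFT, the kind of quantity a frequency-domain guidance term could steer. This is an illustration only: the function name, patch size, and radial cutoff are assumptions, not the paper's implementation.

```python
import numpy as np

def patch_high_freq_energy(img, patch=8, cutoff=0.5):
    """Illustrative patch-wise high-frequency energy (NOT the paper's loss).

    Splits a grayscale image into non-overlapping patches, takes each
    patch's 2-D FFT, and sums spectral energy outside a radial cutoff.
    A guidance term in the spirit of Patch-wise Frequency Guidance could
    operate on a statistic like this.
    """
    h, w = img.shape
    energy = 0.0
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            block = img[i:i + patch, j:j + patch]
            spec = np.fft.fftshift(np.fft.fft2(block))
            # Radial mask: keep frequencies beyond `cutoff` of the
            # patch's Nyquist radius (high-frequency band).
            yy, xx = np.mgrid[:patch, :patch]
            c = (patch - 1) / 2.0
            r = np.hypot(yy - c, xx - c)
            hf = r > cutoff * (patch / 2.0)
            energy += float(np.sum(np.abs(spec[hf]) ** 2))
    return energy
```

For instance, a constant image has (numerically) zero high-frequency energy, while a checkerboard concentrates its energy at the Nyquist frequency and scores high.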


Key findings

AntiPure outperforms existing protective perturbation methods in resisting diffusion-based purification, achieving minimal perceptual discrepancy while maximizing output distortion. Its effectiveness is demonstrated across different datasets and customization methods (DreamBooth and LoRA). AntiPure's robustness increases with more purification iterations, unlike other methods.
Approach

AntiPure addresses the vulnerability of existing protective perturbations to diffusion-based purification by directly targeting the purification model. It uses patch-wise frequency guidance to modulate high-frequency components and erroneous timestep guidance to disrupt the denoising strategy, ensuring that imperceptible perturbations remain after purification.
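One plausible reading of the second mechanism is a guidance term that measures how far the purification model's noise prediction at the scheduled timestep diverges from its prediction at a wrong (erroneous) timestep. The toy sketch below illustrates that idea with a stand-in denoiser; the function names, the loss form, and its sign are assumptions, not the paper's definition.

```python
import numpy as np

def toy_denoiser(x, t, num_steps=1000):
    """Toy stand-in for a diffusion model's noise predictor eps(x, t).

    Purely illustrative: scales the input by a timestep-dependent factor
    so that predictions genuinely differ across timesteps.
    """
    scale = (t + 1) / num_steps
    return np.tanh(scale * x)

def erroneous_timestep_loss(x, t_true, t_err):
    """Hypothetical guidance term in the spirit of Erroneous Timestep
    Guidance: the mean squared gap between the model's behaviour at the
    scheduled timestep t_true and at an erroneous timestep t_err.
    Optimizing a perturbation against a term like this would push the
    model's denoising strategy off its timestep-dependent schedule.
    """
    e_true = toy_denoiser(x, t_true)
    e_err = toy_denoiser(x, t_err)
    return float(np.mean((e_true - e_err) ** 2))
```

The loss is zero when the two timesteps agree and grows as they diverge, which is the property such a guidance term would exploit.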
Datasets

CelebA-HQ, VGGFace2
Model(s)

Stable Diffusion (SD), DDPMs (for purification), Inception v3 (for FID), ArcFace (for ISM), RetinaFace (for FDFR), BRISQUE (for image quality), AlexNet and VGG (for LPIPS)
Author countries

China