Towards Imperceptible Adversarial Defense: A Gradient-Driven Shield against Facial Manipulations

Authors: Yue Li, Linying Xue, Dongdong Lin, Qiushi Li, Hui Tian, Hongxia Wang

Published: 2025-10-02 06:09:46+00:00

AI Summary

The paper introduces GRASP (Gradient-projection-based AdverSarial Proactive defense), a method that generates imperceptible adversarial perturbations to counter facial deepfake manipulation. GRASP improves perturbation imperceptibility by integrating a structural similarity loss and a low-frequency loss while preserving defense effectiveness. The balance is achieved through a gradient-projection mechanism that resolves the gradient conflicts that arise among the multi-objective loss terms during optimization.
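To make the multi-objective setup concrete, here is a minimal NumPy sketch of how such a combined loss could be composed. The weights, signs, and the block-average low-pass filter are illustrative assumptions, not the paper's exact formulation, and the SSIM term is omitted for brevity.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))

def low_frequency(img, k=4):
    # Crude low-pass proxy: block-average the image in k x k tiles.
    # This stands in for the paper's low-frequency constraint.
    h, w = img.shape
    return img[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def total_loss(x, x_adv, fake_clean, fake_adv, w_lf=0.1):
    # Defense term: push the deepfake output on the adversarial image
    # away from the output on the clean image (sign is an assumption).
    l_def = -mse(fake_adv, fake_clean)
    # Visual-quality term: keep low-frequency content of the adversarial
    # image close to the original (SSIM term omitted for brevity).
    l_lf = mse(low_frequency(x), low_frequency(x_adv))
    return l_def + w_lf * l_lf
```

Minimizing this loss drives the manipulated outputs apart (large defense term) while penalizing visible low-frequency changes to the protected image.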

Abstract

With the flourishing prosperity of generative models, manipulated facial images have become increasingly accessible, raising concerns regarding privacy infringement and societal trust. In response, proactive defense strategies embed adversarial perturbations into facial images to counter deepfake manipulation. However, existing methods often face a tradeoff between imperceptibility and defense effectiveness: strong perturbations may disrupt forgeries but degrade visual fidelity. Recent studies have attempted to address this issue by introducing additional visual loss constraints, yet often overlook the underlying gradient conflicts among losses, ultimately weakening defense performance. To bridge the gap, we propose a gradient-projection-based adversarial proactive defense (GRASP) method that effectively counters facial deepfakes while minimizing perceptual degradation. GRASP is the first approach to successfully integrate both structural similarity loss and low-frequency loss to enhance perturbation imperceptibility. By analyzing gradient conflicts between defense effectiveness loss and visual quality losses, GRASP pioneers the design of the gradient-projection mechanism to mitigate these conflicts, enabling balanced optimization that preserves image fidelity without sacrificing defensive performance. Extensive experiments validate the efficacy of GRASP, achieving a PSNR exceeding 40 dB, SSIM of 0.99, and a 100% defense success rate against facial attribute manipulations, significantly outperforming existing approaches in visual quality.


Key findings
GRASP achieved superior visual quality (PSNR > 40 dB, SSIM ≈ 0.99) while maintaining high defense success rates (often 100%) against facial attribute editing and face swapping models. The method demonstrated better generalizability across various generative models and achieved stable, competitive robustness against common post-processing distortions (blur, rotation). Ablation studies confirmed that the gradient projection strategy is critical for balancing effectiveness and imperceptibility.
Approach
GRASP generates adversarial images by minimizing a loss function comprising an MSE term (for defense effectiveness) and visual-quality terms (SSIM and low-frequency loss). To mitigate gradient conflicts between these competing objectives, the approach uses a cross-projection strategy: when two gradients conflict, each is projected onto the normal plane of the other, yielding a conflict-free update direction for the perturbation.
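The cross-projection step can be sketched in a few lines of NumPy. This is an illustrative implementation of the general idea (as in PCGrad-style conflict resolution), not the paper's exact algorithm; the function name and threshold are assumptions.

```python
import numpy as np

def cross_project(g_def, g_vis):
    """If the defense gradient and the visual-quality gradient conflict
    (negative dot product), project each onto the normal plane of the
    other, removing the component that opposes the other objective."""
    if np.dot(g_def, g_vis) < 0:
        g_def = g_def - (np.dot(g_def, g_vis) / np.dot(g_vis, g_vis)) * g_vis
        g_vis = g_vis - (np.dot(g_vis, g_def) / np.dot(g_def, g_def)) * g_def
    return g_def, g_vis

# Two conflicting gradients (dot product is -1 < 0):
g1 = np.array([1.0, 0.0])
g2 = np.array([-1.0, 1.0])
p1, p2 = cross_project(g1, g2)
```

After projection, each gradient is orthogonal to the other original gradient, so a perturbation update along their sum no longer degrades either objective to first order.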
Datasets
CelebA, FFHQ, LFW
Model(s)
StarGAN, AttGAN, HiSD, SimSwap
Author countries
China, Italy