My Face Is Mine, Not Yours: Facial Protection Against Diffusion Model Face Swapping

Authors: Hon Ming Yam, Zhongliang Guo, Chun Pong Lau

Published: 2025-05-21 10:07:46+00:00

AI Summary

This paper introduces a proactive defense that uses adversarial perturbations to protect facial images from exploitation by diffusion-based deepfake systems. It proposes a dual-loss adversarial framework, combining a face-identity loss with an inference-step averaging loss, to efficiently generate robust perturbations in the latent space. The approach disrupts identity transfer across diverse diffusion architectures while preserving the visual fidelity of the protected images.

Abstract

The proliferation of diffusion-based deepfake technologies poses significant risks for unauthorized and unethical facial image manipulation. While traditional countermeasures have primarily focused on passive detection methods, this paper introduces a novel proactive defense strategy through adversarial attacks that preemptively protect facial images from being exploited by diffusion-based deepfake systems. Existing adversarial protection methods predominantly target conventional generative architectures (GANs, AEs, VAEs) and fail to address the unique challenges presented by diffusion models, which have become the dominant framework for high-quality facial deepfakes. Current diffusion-specific adversarial approaches are limited by their reliance on specific model architectures and weights, rendering them ineffective against the diverse landscape of diffusion-based deepfake implementations. Additionally, they typically employ global perturbation strategies that inadequately address the region-specific nature of facial manipulation in deepfakes.


Key findings
The proposed method significantly disrupts facial identity in diffusion-based face swaps, achieving markedly lower cosine similarity (e.g., 0.0327 for the x_adv variant) between the swapped attacked image and the clean source than existing baselines. It protects robustly against multiple diffusion-based deepfake models (including Face Adapter and REFace), transfers well across architectures, preserves the visual fidelity of the protected images, and remains resilient to common adversarial defenses.
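
To make the reported metric concrete, here is a minimal sketch of how identity cosine similarity between a swapped attacked image and the clean source is typically computed. `FaceEncoder` and `identity_similarity` are hypothetical stand-ins introduced here for illustration; the paper does not specify the exact face-recognition embedder used for evaluation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceEncoder(nn.Module):
    """Placeholder face-embedding network (an assumption, not the paper's model).
    A real evaluation would use a pretrained recognizer such as an ArcFace-style net."""
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Unit-normalize so cosine similarity is a plain dot product.
        return F.normalize(self.net(x), dim=-1)

@torch.no_grad()
def identity_similarity(encoder: nn.Module,
                        swapped_attacked: torch.Tensor,
                        clean_source: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between face embeddings; lower means the swap
    failed to carry over the source identity (better protection)."""
    return F.cosine_similarity(encoder(swapped_attacked),
                               encoder(clean_source), dim=-1)

encoder = FaceEncoder()
swapped, source = torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256)
print(identity_similarity(encoder, swapped, source).item())
```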
Approach
The authors propose a dual-loss adversarial framework that operates in the latent space of Latent Diffusion Models (LDMs) to craft imperceptible perturbations. The framework combines a face-identity loss, which targets the conditional mechanisms exploited in diffusion-based face manipulation, with an inference-step averaging loss that averages the attack objective over multiple sampled inference steps, avoiding the cost of optimizing through the full sampling trajectory.
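
A minimal sketch of such a dual-loss attack is shown below, under heavy assumptions: `dual_loss_attack`, `vae_encode`, `denoiser`, and `face_embed` are hypothetical stand-ins for an LDM's VAE encoder, its noise-prediction UNet, and a face-recognition embedder, and the noising step, loss weighting, and PGD update are simplified placeholders for the paper's actual formulation, not its reference code.

```python
import torch
import torch.nn.functional as F

def dual_loss_attack(x, vae_encode, denoiser, face_embed,
                     steps=50, eps=8 / 255, alpha=1 / 255,
                     n_timesteps=4, max_t=1000, lam=1.0):
    """PGD-style protection combining a face-identity loss with an
    inference-step averaging loss computed in the LDM latent space."""
    src_id = face_embed(x).detach()              # clean-source identity embedding
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        z = vae_encode(x_adv)                    # latent of the candidate image

        # Inference-step averaging loss: average the denoising objective over a
        # few randomly sampled timesteps instead of unrolling the full sampling
        # trajectory (the source of the computational savings).
        avg_loss = 0.0
        for _ in range(n_timesteps):
            t = torch.randint(0, max_t, (x.shape[0],), device=x.device)
            noise = torch.randn_like(z)
            z_noisy = z + noise                  # schematic noising; a real attack
                                                 # would apply the schedule q(z_t | z_0)
            avg_loss = avg_loss + F.mse_loss(denoiser(z_noisy, t), noise)
        avg_loss = avg_loss / n_timesteps

        # Face-identity loss: similarity between the perturbed image's embedding
        # and the clean source identity; driving this down disrupts the identity
        # conditioning that face-swap pipelines rely on.
        id_sim = F.cosine_similarity(face_embed(x_adv), src_id, dim=-1).mean()

        # Maximize denoising error while minimizing identity similarity.
        loss = avg_loss - lam * id_sim
        grad = torch.autograd.grad(loss, x_adv)[0]

        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()              # ascent step
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)    # L_inf projection
            x_adv = x_adv.clamp(0.0, 1.0)                    # valid pixel range
    return x_adv.detach()

# Toy stand-ins just to exercise the function; real use would plug in a
# pretrained LDM VAE encoder, its noise-prediction UNet, and a face recognizer.
vae_encode = lambda img: F.avg_pool2d(img, 8)
denoiser = lambda z_t, t: torch.tanh(z_t)        # pretend noise prediction
face_embed = lambda img: F.normalize(img.mean(dim=(2, 3)), dim=-1)

x = torch.rand(1, 3, 256, 256)
x_protected = dual_loss_attack(x, vae_encode, denoiser, face_embed, steps=3)
print((x_protected - x).abs().max().item())      # perturbation stays within eps
```

Sampling a handful of timesteps per update is the key design choice here: it keeps the attack model-agnostic with respect to the sampler while keeping each iteration cheap, which matches the efficiency motivation described above.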
Datasets
CelebA-HQ
Model(s)
UNKNOWN
Author countries
Hong Kong, UK