Defending Deepfake via Texture Feature Perturbation

Authors: Xiao Zhang, Changfang Chen, Tianyi Wang

Published: 2025-08-24 11:53:35+00:00

AI Summary

This paper proposes a proactive Deepfake detection approach that inserts invisible perturbations into facial texture regions to hinder Deepfake generation. It uses a dual-model attention strategy to optimize these perturbations, minimizing visual artifacts while maximizing disruption of Deepfake models.

Abstract

The rapid development of Deepfake technology poses severe challenges to social trust and information security. Most existing detection methods rely on passive analysis, which struggles against high-quality Deepfake content; proactive defense has therefore emerged, inserting invisible signals into images before they are edited. In this paper, we introduce a proactive Deepfake detection approach based on facial texture features. Since human eyes are more sensitive to perturbations in smooth regions, we invisibly insert perturbations within texture regions that have low perceptual saliency, applying localized perturbations to key texture regions while minimizing unwanted noise in non-textured areas. Our texture-guided perturbation framework first extracts preliminary texture features via Local Binary Patterns (LBP) and then applies a dual-model attention strategy to generate and optimize the texture perturbations. Experiments on the CelebA-HQ and LFW datasets demonstrate the promising performance of our method in distorting Deepfake generation and producing obvious visual defects across multiple attack models, providing an efficient and scalable solution for proactive Deepfake detection.
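As a concrete illustration of the texture extraction step, the sketch below derives a binary texture mask from LBP statistics using scikit-image. The variance-based LBP variant, the neighborhood settings, and the percentile threshold are assumptions for illustration; the paper may compute its preliminary texture features differently.

    import numpy as np
    from skimage.feature import local_binary_pattern

    def texture_mask(gray_face, radius=1, percentile=70):
        """Binary mask of texture-rich pixels via LBP local contrast.

        gray_face: 2-D float array (a grayscale face image).
        The 'var' LBP method and the percentile cutoff are illustrative
        choices, not values taken from the paper.
        """
        n_points = 8 * radius
        # 'var' LBP measures local contrast; high values mark textured regions
        contrast = local_binary_pattern(gray_face, n_points, radius, method="var")
        contrast = np.nan_to_num(contrast)  # flat patches can yield NaN
        # Keep only the most textured pixels as candidate perturbation sites
        return contrast >= np.percentile(contrast, percentile)

Pixels outside this mask correspond to smooth, perceptually salient regions, which is where the abstract argues perturbations must be avoided.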


Key findings
The proposed method outperforms existing state-of-the-art methods in both visual quality (higher PSNR and SSIM, lower LPIPS) and Deepfake disruption (larger L2 distance between clean and disrupted outputs, higher defense success rate, DSR). It is effective across multiple Deepfake generation models and generalizes well to unseen datasets.
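For reference, these metrics can be computed as in the following sketch, using scikit-image for PSNR/SSIM and the lpips package for LPIPS. The 0.05 threshold on the L2 distance for counting a defense as successful follows common practice in proactive-defense work and is an assumption here, not a value taken from this paper.

    import numpy as np
    import torch
    import lpips
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    loss_lpips = lpips.LPIPS(net="alex")  # perceptual distance network

    def visual_quality(clean, perturbed):
        """PSNR/SSIM/LPIPS between a clean face and its protected version.

        Both inputs: HxWx3 float arrays in [0, 1].
        """
        psnr = peak_signal_noise_ratio(clean, perturbed, data_range=1.0)
        ssim = structural_similarity(clean, perturbed,
                                     channel_axis=-1, data_range=1.0)
        # lpips expects NCHW tensors scaled to [-1, 1]
        to_t = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None].float() * 2 - 1
        dist = loss_lpips(to_t(clean), to_t(perturbed)).item()
        return psnr, ssim, dist

    def disruption(fake_from_clean, fake_from_protected, tau=0.05):
        """L2 distortion between Deepfake outputs; defense counts as a
        success when it exceeds tau (assumed threshold)."""
        l2 = float(np.mean((fake_from_clean - fake_from_protected) ** 2))
        return l2, l2 > tau

DSR is then the fraction of test images for which the success condition holds.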
Approach
The approach inserts imperceptible perturbations into low-saliency texture regions of facial images. It uses Local Binary Patterns (LBP) for texture feature extraction and a dual-model attention strategy (combining ResNet50 and ViT) to guide perturbation generation and optimization, minimizing visual artifacts while maximizing disruption of Deepfake generation.
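A minimal sketch of how such a texture-guided perturbation could be optimized is shown below, assuming a PGD-style loop: the perturbation is confined to the LBP texture mask, pushed to maximize the change in the Deepfake generator's output, and regularized by feature distances under the two attention models. The loss weights, step sizes, and the surrogate loss standing in for the paper's dual-model attention strategy are all illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def optimize_perturbation(face, mask, deepfake_model, attn_models,
                              eps=0.05, alpha=0.01, steps=40):
        """Texture-guided perturbation sketch (assumed details).

        face:           1x3xHxW tensor in [0, 1]
        mask:           1x1xHxW binary texture mask (e.g., from LBP)
        deepfake_model: generator to disrupt, mapping a face to an edited face
        attn_models:    e.g., (resnet50, vit) feature extractors acting as a
                        stand-in for the dual-model attention strategy
        """
        target = deepfake_model(face).detach()  # clean Deepfake output
        delta = torch.zeros_like(face, requires_grad=True)
        for _ in range(steps):
            adv = (face + delta * mask).clamp(0, 1)  # perturb texture pixels only
            out = deepfake_model(adv)
            # Push the Deepfake output away from its clean counterpart...
            disrupt = -F.mse_loss(out, target)
            # ...while keeping the protected image close to the original in
            # the feature spaces of both models, limiting visible artifacts.
            percept = sum(F.mse_loss(m(adv), m(face)) for m in attn_models)
            loss = disrupt + percept
            loss.backward()
            with torch.no_grad():
                delta -= alpha * delta.grad.sign()  # signed gradient step
                delta.clamp_(-eps, eps)             # keep perturbation invisible
                delta.grad.zero_()
        return (delta * mask).detach()

The mask confines all noise to textured, low-saliency regions, matching the design goal of minimizing artifacts in smooth areas while still distorting the generator's output.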
Datasets
CelebA-HQ and LFW
Model(s)
ResNet50 and Vision Transformer (ViT) for the dual-model attention strategy; StarGAN, AttGAN, AGGAN, HiSD, and StarGAN-V2 as the Deepfake generation models used for evaluation.
Author countries
China, Singapore