Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data

Authors: Ning Yu, Vladislav Skripniuk, Sahar Abdelnabi, Mario Fritz

Published: 2020-07-16 16:49:55+00:00

Comment: Accepted to ICCV'21 as Oral

AI Summary

This paper introduces a proactive and sustainable solution for deepfake detection and attribution: embedding artificial fingerprints into generative models via their training data. Each model owner's unique fingerprint is first embedded into the training images; surprisingly, the fingerprint transfers to the trained generative model and subsequently appears in all of its generated deepfakes. This approach enables robust deepfake detection and attribution that is agnostic to the evolution of generative models and outperforms state-of-the-art baselines.

Abstract

Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs). Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation. While existing research on deepfake detection demonstrates high accuracy, it remains subject to advances in generation techniques and to adversarial iteration against detection countermeasures. Thus, we seek a proactive and sustainable solution to deepfake detection that is agnostic to the evolution of generative models, by introducing artificial fingerprints into the models. Our approach is simple and effective. We first embed artificial fingerprints into training data, then validate a surprising discovery on the transferability of such fingerprints from training data to generative models, which in turn appear in the generated deepfakes. Experiments show that our fingerprinting solution (1) holds for a variety of cutting-edge generative models, (2) has a negligible side effect on generation quality, (3) remains robust against image-level and model-level perturbations, (4) remains hard for adversaries to detect, and (5) renders deepfake detection and attribution trivial, outperforming recent state-of-the-art baselines. Our solution closes the responsibility loop between publishing pre-trained generative model inventions and their possible misuses, which makes it independent of the current arms race. Code and models are available at https://github.com/ningyu1991/ArtificialGANFingerprints.


Key findings
The proposed fingerprinting solution achieves perfect detection and attribution accuracy (100%) across various generative models and datasets, significantly outperforming existing baselines. It incurs negligible side effects on generation quality and demonstrates strong robustness against various image-level and model-level perturbations. Furthermore, the embedded fingerprints are secret and difficult for adversaries to detect, providing a sustainable defense against evolving deepfake techniques.
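The perfect detection and attribution accuracy reported above rests on a simple matching step: decode a bit string from a query image and compare it against a database of issued fingerprints. Below is a minimal sketch of that matching, using bit accuracy (fraction of agreeing bits) as the score; the function name, the 128-bit fingerprint length, and the 0.75 acceptance threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def attribute(decoded: np.ndarray, database: dict, threshold: float = 0.75):
    """Match a decoded bit string against known fingerprints.

    Returns (model_name, bit_accuracy) for the best match, or
    (None, best_accuracy) if no fingerprint matches well enough,
    in which case the image is judged real / unfingerprinted.
    """
    best_name, best_acc = None, 0.0
    for name, fp in database.items():
        acc = float((decoded == fp).mean())  # bit accuracy vs. this fingerprint
        if acc > best_acc:
            best_name, best_acc = name, acc
    return (best_name, best_acc) if best_acc >= threshold else (None, best_acc)

# Hypothetical database: one random 128-bit fingerprint per registered model.
rng = np.random.default_rng(1)
db = {f"model_{i}": rng.integers(0, 2, size=128, dtype=np.uint8) for i in range(3)}

# A decoded fingerprint with a few bit flips, e.g. from image perturbations.
decoded = db["model_1"].copy()
decoded[:10] ^= 1
name, acc = attribute(decoded, db)
```

Because independent fingerprints agree on only about half their bits by chance, even a perturbed decoding sits far above the impostor scores, which is why attribution reduces to a nearest-fingerprint lookup.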
Approach
The authors embed artificial fingerprints into the training data using a deep-learning-based steganography system (encoder/decoder). Generative models are then trained on this fingerprinted data without any modification to their original training protocols. Deepfake detection and attribution are achieved by decoding the unique fingerprints from the generated images and matching them against a database of known fingerprints.
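The paper's encoder/decoder are learned steganography networks; as a hedged stand-in, the sketch below embeds a binary fingerprint into the least significant bits of an image and decodes it back, illustrating the embed-then-decode round trip on training data. The function names, LSB embedding, and 128-bit fingerprint length are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def embed_fingerprint(image: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Write fingerprint bits into the least significant bit of the first
    len(bits) pixel values (stand-in for the learned steganography encoder)."""
    flat = image.astype(np.uint8).ravel().copy()
    flat[: len(bits)] = (flat[: len(bits)] & 0xFE) | bits
    return flat.reshape(image.shape)

def decode_fingerprint(image: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the embedded bits back (stand-in for the learned decoder)."""
    return image.astype(np.uint8).ravel()[:n_bits] & 1

rng = np.random.default_rng(0)
fingerprint = rng.integers(0, 2, size=128, dtype=np.uint8)  # one bit string per model owner
cover = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in training image

stego = embed_fingerprint(cover, fingerprint)       # fingerprinted training image
recovered = decode_fingerprint(stego, len(fingerprint))
```

In the actual pipeline, every training image is fingerprinted this way before the generative model is trained, with no change to the model's training protocol; the decoder is then applied to the model's generated images rather than to the training images themselves.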
Datasets
CelebA, LSUN Bedroom, LSUN Cat, CIFAR-10, Horse→Zebra, Cat→Dog
Model(s)
UNKNOWN
Author countries
USA, Germany