Can Generative Models Actually Forge Realistic Identity Documents?

Authors: Alexander Vinogradov

Published: 2025-12-25 00:56:50+00:00

AI Summary

This paper investigates the capacity of open-source diffusion-based generative models to create forensically realistic identity document forgeries using text-to-image and image-to-image pipelines. The study conducts a qualitative, expert-driven forensic analysis of generated outputs from various models to assess their ability to mimic structural and security features. The findings suggest that while these models achieve surface-level visual plausibility, they systematically fail to reproduce the essential physical and micro-level characteristics required to bypass robust forensic authenticity checks.

Abstract

Generative image models have recently shown significant progress in image realism, leading to public concerns about their potential misuse for document forgery. This paper explores whether contemporary open-source and publicly accessible diffusion-based generative models can produce identity document forgeries that could realistically bypass human or automated verification systems. We evaluate text-to-image and image-to-image generation pipelines using multiple publicly available generative model families, including Stable Diffusion, Qwen, Flux, Nano-Banana, and others. The findings indicate that while current generative models can simulate surface-level document aesthetics, they fail to reproduce structural and forensic authenticity. Consequently, the risk of generative identity document deepfakes achieving forensic-level authenticity may be overestimated, underscoring the value of collaboration between machine learning practitioners and document-forensics experts in realistic risk assessment.


Key findings
Generative models can replicate a document's overall layout and color palette, but they consistently render fine-grained material and security features (e.g., micro-text, laser engraving) in a digitized, generalized, and simplified form. The resulting forgeries retain a distinctly digital appearance and lack the microstructural complexity of genuine documents, suggesting that the threat posed by out-of-the-box generative models for achieving forensic-level authenticity may currently be overestimated.
Approach
The study runs controlled experiments covering text-to-image document generation from scratch and image-to-image manipulation scenarios, including background blending and portrait/text substitution. Outputs from multiple generative models are then subjected to manual, expert-driven document-forensics analysis to identify characteristic failure modes in texture, security features, and printing approximations.
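For readers unfamiliar with the two pipeline types evaluated, the following is a minimal sketch of what they typically look like in code, using Hugging Face diffusers with one of the Stable Diffusion families named in the paper. The paper does not publish code, so the model ID, prompts, and parameters here are illustrative assumptions rather than the authors' actual setup.

```python
# Minimal sketch of the two evaluated pipeline types using Hugging Face
# diffusers. Model ID, prompts, and parameters are illustrative assumptions,
# not the authors' setup.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
model_id = "stabilityai/stable-diffusion-2-1"  # one of the evaluated families

# 1) Text-to-image: generate a document-like image from scratch.
t2i = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)
prompt = ("front side of a generic national identity card, guilloche "
          "background pattern, micro-text lines, flat studio scan")
from_scratch = t2i(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
from_scratch.save("t2i_document.png")

# 2) Image-to-image: manipulate an existing scan (e.g., background blending
# or portrait/text substitution) by partially re-noising the input image.
i2i = StableDiffusionImg2ImgPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)
init = Image.open("sample_document.png").convert("RGB").resize((768, 768))
edited = i2i(
    prompt="same identity card with a different portrait photo",
    image=init,
    strength=0.55,   # lower values preserve more of the source layout
    guidance_scale=7.5,
).images[0]
edited.save("i2i_document.png")
```

Outputs like these are what the expert forensic review then inspects for the micro-level shortcomings the paper reports, such as smeared micro-text, approximated printing patterns, and implausible laser engraving.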
Datasets
Internal visual materials (not a named public dataset).
Model(s)
Stable Diffusion (v1.5, 2.x, 3.5), Qwen, Flux, Nano-Banana, Kandinsky 2.1, Seedream 4.5, GPT Image 1.5, realistic-vision-v4.0, dreamlike-diffusion-1.0, Kling-O1.
Author countries
Germany