Methods and Trends in Detecting Generated Images: A Comprehensive Review

Authors: Arpan Mahara, Naphtali Rishe

Published: 2025-02-21 03:16:18+00:00

AI Summary

This research paper provides a comprehensive review of state-of-the-art methods for detecting synthetic images generated by advanced generative AI models. It systematically examines core detection methodologies, categorizes them into meaningful taxonomies, and presents an overview of publicly available datasets.

Abstract

The proliferation of generative models, such as Generative Adversarial Networks (GANs), Diffusion Models, and Variational Autoencoders (VAEs), has enabled the synthesis of high-quality multimedia data. However, these advancements have also raised significant concerns regarding adversarial attacks, unethical usage, and societal harm. Recognizing these challenges, researchers have increasingly focused on developing methodologies to detect synthesized data effectively, aiming to mitigate potential risks. Prior reviews have primarily focused on deepfake detection and often lack coverage of recent advancements in synthetic image detection, particularly methods leveraging multimodal frameworks for improved forensic analysis. To address this gap, the present survey provides a comprehensive review of state-of-the-art methods for detecting and classifying synthetic images generated by advanced generative AI models. This review systematically examines core detection methodologies, identifies commonalities among approaches, and categorizes them into meaningful taxonomies. Furthermore, given the crucial role of large-scale datasets in this field, we present an overview of publicly available datasets that facilitate further research and benchmarking in synthetic data detection.


Key findings
The review indicates that multimodal frameworks, particularly those using vision-language models like CLIP, show greater robustness and adaptability across different generative models than unimodal detectors. Frequency-domain analysis also proves effective, particularly for detecting diffusion-model-generated images. Generalizability across diverse generative models and datasets remains a significant challenge.
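To make the frequency-domain idea concrete, here is a minimal sketch (not from the paper) of one common cue: the azimuthally averaged power spectrum of an image, whose high-frequency tail is often distorted by generator upsampling artifacts. The function name and the toy images are illustrative assumptions.

```python
import numpy as np

def radial_power_spectrum(image, n_bins=32):
    """Azimuthally averaged power spectrum of a grayscale image.

    Anomalies in the high-frequency end of this profile have been used
    as a forensic cue for generator artifacts (illustrative sketch).
    """
    # 2D FFT, shift zero frequency to the center, take power
    f = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(f) ** 2
    h, w = power.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices((h, w))
    r = np.hypot(y - cy, x - cx)
    # Bin power by radius and average within each ring
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    profile = np.array([power[bins == b].mean() for b in range(n_bins)])
    return profile / profile[0]  # normalize by the DC ring

# Toy comparison: a smooth image concentrates energy at low frequencies,
# while added high-frequency noise lifts the tail of the profile.
rng = np.random.default_rng(0)
smooth = np.outer(np.hanning(64), np.hanning(64))
noisy = smooth + 0.5 * rng.standard_normal((64, 64))
p_smooth = radial_power_spectrum(smooth)
p_noisy = radial_power_spectrum(noisy)
```

A real detector would compute such profiles over many real and synthetic images and feed them to a classifier; this toy only shows why the spectral tail is discriminative.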
Approach
The paper reviews existing approaches to synthetic image detection, categorizing them into spatial domain analysis, multimodal vision-language methods, frequency domain analysis, fingerprint analysis, and patch-based analysis. It analyzes the strengths and weaknesses of each approach and assesses their generalizability across different generative models and datasets.
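The multimodal vision-language approaches surveyed typically freeze a pretrained encoder (e.g., CLIP's image tower) and train only a lightweight linear probe on its embeddings. The sketch below, assuming such frozen embeddings are already available, simulates them with synthetic features and fits a logistic-regression probe by gradient descent; the feature dimensions and class separation are illustrative, not taken from any cited method.

```python
import numpy as np

# Hypothetical setup: in practice `X` would hold frozen encoder embeddings
# of real and synthetic images; here we simulate two slightly shifted
# Gaussian clusters standing in for the two classes.
rng = np.random.default_rng(1)
dim = 128
real = rng.standard_normal((200, dim)) + 0.3
fake = rng.standard_normal((200, dim)) - 0.3
X = np.vstack([real, fake])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = real, 1 = synthetic

# Linear probe: logistic regression trained with plain gradient descent
w = np.zeros(dim)
b = 0.0
lr = 0.1
for _ in range(300):
    z = np.clip(X @ w + b, -30, 30)   # clip to avoid overflow in exp
    p = 1.0 / (1.0 + np.exp(-z))      # sigmoid probabilities
    grad_w = X.T @ (p - y) / len(y)   # cross-entropy gradient w.r.t. weights
    grad_b = (p - y).mean()
    w -= lr * grad_w
    b -= lr * grad_b

pred = (1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -30, 30))) > 0.5)
acc = (pred == y).mean()
```

Because the encoder stays frozen, only `dim + 1` parameters are learned, which is one reason such probes tend to transfer across generator families better than detectors trained end to end.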
Datasets
ForenSynths, Artifact, SynthBuster, DiffusionForensics, UnivFD, Community Forensics, GenImage, CIFAKE, AIGCDetection Benchmark, ImagiNet, Chameleon, and others.
Model(s)
ResNet-50, ResNet-34, EfficientNetB4, CLIP, ViT, ConvNeXt, XceptionNet, ProGAN, StyleGAN, BigGAN, CycleGAN, StarGAN, GauGAN, various diffusion models (e.g., Stable Diffusion, GLIDE, LDM, DALL-E), and others.
Author countries
USA