UniAIDet: A Unified and Universal Benchmark for AI-Generated Image Content Detection and Localization

Authors: Huixuan Zhang, Xiaojun Wan

Published: 2025-10-27 05:37:23+00:00

AI Summary

UniAIDet introduces a new unified and universal benchmark for AI-generated image content detection and localization, addressing limitations in existing datasets regarding model diversity, content categories, and localization support. The benchmark includes 80k images spanning photographic and artistic content generated by 20 diverse models, covering holistic synthesis (T2I, I2I) and partial synthesis (editing, inpainting, deepfake) with ground truth masks. The authors use this comprehensive resource to evaluate existing detection methods and analyze their generalization capabilities.

Abstract

With the rapid proliferation of image generative models, the authenticity of digital images has become a significant concern. While existing studies have proposed various methods for detecting AI-generated content, current benchmarks are limited in their coverage of diverse generative models and image categories, often overlooking end-to-end image editing and artistic images. To address these limitations, we introduce UniAIDet, a unified and comprehensive benchmark that includes both photographic and artistic images. UniAIDet covers a wide range of generative models, including text-to-image, image-to-image, image inpainting, image editing, and deepfake models. Using UniAIDet, we conduct a comprehensive evaluation of various detection methods and answer three key research questions regarding generalization capability and the relation between detection and localization. Our benchmark and analysis provide a robust foundation for future research.

Key findings

Existing detection and localization methods perform poorly on the comprehensive UniAIDet benchmark, which suggests that current approaches are neither mature nor general enough. Generalization across generative models, especially partial synthesis models (editing and inpainting), remains a critical challenge for existing detectors. The analysis also found that strong detection performance generally correlates with good localization performance, indicating that the two tasks are complementary.

Approach

The main contribution is the construction of UniAIDet, a large-scale benchmark designed to evaluate both AI-generated image detection and fine-grained localization. The dataset combines real images with fully generated images and partially generated images (the latter with corresponding ground truth masks); the synthetic images come from 20 different generative models across five categories (text-to-image, image-to-image, inpainting, editing, and deepfake). A comprehensive evaluation of various detection and localization methods is then conducted to analyze generalization capability across models and content types.
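
To make the two evaluation tasks concrete, the following is a minimal sketch of how image-level detection and mask-level localization could be scored. This summary does not state which metrics the benchmark uses, so accuracy and intersection-over-union (IoU) are assumptions chosen because they are common for these tasks. The `Sample` record and its field names are hypothetical and do not reflect the benchmark's actual file format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Binary mask as nested lists: 1 = generated pixel, 0 = real pixel.
Mask = List[List[int]]


@dataclass
class Sample:
    """Hypothetical benchmark record; the field names are assumptions."""
    label: int             # 1 = contains AI-generated content, 0 = real
    mask: Optional[Mask]   # ground-truth mask, only for partially generated images


def detection_accuracy(labels: List[int], preds: List[int]) -> float:
    """Fraction of images whose real/generated prediction matches the label."""
    if not labels or len(labels) != len(preds):
        raise ValueError("labels and preds must be non-empty and equally long")
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return correct / len(labels)


def mask_iou(gt: Mask, pred: Mask) -> float:
    """Intersection-over-union between a ground-truth and a predicted mask."""
    if len(gt) != len(pred) or any(len(g) != len(p) for g, p in zip(gt, pred)):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            inter += 1 if (g and p) else 0
            union += 1 if (g or p) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0
```

A detector would be scored with `detection_accuracy` over all images, while `mask_iou` would apply only to partially generated samples, since those are the ones that carry a ground truth mask.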

Datasets

UniAIDet (Proposed Benchmark), MSCOCO, NYTimes800k, WikiArt, Danbooru, FFHQ

Model(s)

CLIP, C2P-CLIP, DeeCLIP, DIRE, FIRE, Effort, DRCT, NPR, FreqNet, AIDE, SAFE, HiFi-Net, FakeShield, SIDA

Author countries

China
