Fact or Fake? Assessing the Role of Deepfake Detectors in Multimodal Misinformation Detection

Authors: A S M Sharifuzzaman Sagar, Mohammed Bennamoun, Farid Boussaid, Naeha Sharif, Lian Xu, Shaaban Sahmoud, Ali Kishk

Published: 2026-02-02 09:28:16+00:00

AI Summary

This paper systematically evaluates the utility of pixel-level deepfake detectors in the context of multimodal (image-text) misinformation detection. It proposes an evidence-driven fact-checking system and compares its performance against standalone deepfake detectors and a hybrid system integrating detector outputs. The study concludes that pixel-level detectors offer limited value and can degrade overall fact-checking performance, emphasizing the importance of semantic understanding and external evidence.

Abstract

In multimodal misinformation, deception usually arises not just from pixel-level manipulations in an image, but from the semantic and contextual claim jointly expressed by the image-text pair. Yet most deepfake detectors, engineered to detect pixel-level forgeries, do not account for claim-level meaning, despite their growing integration in automated fact-checking (AFC) pipelines. This raises a central scientific and practical question: Do pixel-level detectors contribute useful signal for verifying image-text claims, or do they instead introduce misleading authenticity priors that undermine evidence-based reasoning? We provide the first systematic analysis of deepfake detectors in the context of multimodal misinformation detection. Using two complementary benchmarks, MMFakeBench and DGM4, we evaluate: (1) state-of-the-art image-only deepfake detectors, (2) an evidence-driven fact-checking system that performs tool-guided retrieval via Monte Carlo Tree Search (MCTS) and engages in deliberative inference through Multi-Agent Debate (MAD), and (3) a hybrid fact-checking system that injects detector outputs as auxiliary evidence. Results across both benchmark datasets show that deepfake detectors offer limited standalone value, achieving F1 scores in the range of 0.26-0.53 on MMFakeBench and 0.33-0.49 on DGM4, and that incorporating their predictions into fact-checking pipelines consistently reduces performance by 0.04-0.08 F1 due to non-causal authenticity assumptions. In contrast, the evidence-centric fact-checking system achieves the highest performance, reaching F1 scores of approximately 0.81 on MMFakeBench and 0.55 on DGM4. Overall, our findings demonstrate that multimodal claim verification is driven primarily by semantic understanding and external evidence, and that pixel-level artifact signals do not reliably enhance reasoning over real-world image-text misinformation.


Key findings
Image-only deepfake detectors showed limited standalone value, achieving F1 scores in the range of 0.26-0.53 on MMFakeBench and 0.33-0.49 on DGM4. Incorporating detector predictions into the fact-checking system consistently reduced performance by 0.04-0.08 F1. The evidence-centric fact-checking system achieved the highest performance (F1 scores of approximately 0.81 on MMFakeBench and 0.55 on DGM4), demonstrating that multimodal claim verification is primarily driven by semantic understanding and external evidence rather than pixel-level artifact signals.
Approach
The authors propose a two-stage evidence-centric fact-checking system. Stage 1 uses a Monte Carlo Tree Search (MCTS)-guided multimodal Large Language Model (LLM) to acquire diverse evidence for text and image subtasks via an extensible toolset. Stage 2 employs a Multi-Agent Debate (MAD) module where skeptic and supporter agents deliberate over the collected evidence, and a neutral Judge LLM issues the final verdict. For comparison, they also evaluate state-of-the-art image-only deepfake detectors and a hybrid system that injects detector outputs as auxiliary evidence into their fact-checking pipeline.
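The two-stage flow described above can be sketched as a toy pipeline. This is purely illustrative: all function names are hypothetical, the MCTS-guided multimodal LLM planner is replaced with a fixed toolset of canned retrievers, and the skeptic/supporter/judge agents are trivial rule-based stand-ins rather than LLMs.

```python
# Illustrative sketch of the paper's two-stage design (hypothetical names;
# real MCTS planning and LLM agents are replaced with trivial stand-ins).

def gather_evidence(claim):
    # Stage 1 stand-in: in the paper, an MCTS-guided multimodal LLM picks
    # tools (e.g., text retrieval, reverse image search) to collect evidence.
    toolset = {
        "text_search": lambda c: f"news coverage mentioning '{c}'",
        "image_search": lambda c: f"visually similar images for '{c}'",
    }
    return [tool(claim) for tool in toolset.values()]

def debate(evidence, rounds=2):
    # Stage 2 stand-in: a supporter and a skeptic exchange arguments over the
    # evidence; real agents would be LLMs conditioned on the transcript.
    transcript = []
    for r in range(rounds):
        transcript.append(("supporter", f"round {r}: evidence supports the claim: {evidence}"))
        transcript.append(("skeptic", f"round {r}: evidence may be out of context: {evidence}"))
    return transcript

def judge(transcript):
    # Neutral-judge stand-in: tallies turns per side; a real judge LLM would
    # weigh argument quality, not counts.
    support = sum(1 for role, _ in transcript if role == "supporter")
    doubt = sum(1 for role, _ in transcript if role == "skeptic")
    return "real" if support >= doubt else "fake"

def verify(claim):
    # End-to-end: evidence acquisition, then deliberation, then verdict.
    return judge(debate(gather_evidence(claim)))

print(verify("Photo shows flooding in City X"))
```

The key design point mirrored here is that the verdict depends only on retrieved evidence and deliberation, not on a pixel-level detector score; the hybrid variant in the paper would add the detector's output as one more evidence item in Stage 1.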
Datasets
MMFakeBench, DGM4
Model(s)
For image-only deepfake detection: GRAMNet, BNext-M, NPR, FatFormer, ProDet. For the proposed fact-checking system: multimodal LLMs serve both as the MCTS-guided planner and as the Multi-Agent Debate (MAD) agents.
Author countries
Australia, Turkey, Qatar