Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video

Authors: David Wegmann, Emil Stevnsborg, Søren Knudsen, Luca Rossi, Aske Mottelson

Published: 2026-03-14 09:00:52+00:00

AI Summary

This paper investigates human responses to visual and auditory distortions in videos, including deepfake-generated visuals and narration, to understand how individuals perceive synthetic media. Across three between-subjects experiments, the study examines whether these audio-visual distortions affect cognitive processing, measured through subjective credibility assessments and objective learning outcomes. The research demonstrates that video distortions and deepfake artifacts can reduce credibility, highlighting the need for further theory development concerning deepfake exposure.

Abstract

Advances in machine learning have enabled the creation of realistic synthetic videos known as deepfakes. As deepfakes proliferate, concerns about rapid spread of disinformation and manipulation of public perception are mounting. Despite the alarming implications, our understanding of how individuals perceive synthetic media remains limited, obstructing the development of effective mitigation strategies. This paper aims to narrow this gap by investigating human responses to visual and auditory distortions of videos and deepfake-generated visuals and narration. In two between-subjects experiments, we study whether audio-visual distortions affect cognitive processing, such as subjective credibility assessment and objective learning outcomes. A third study reveals that artifacts from deepfakes influence credibility. The three studies show that video distortions and deepfake artifacts can reduce credibility. Our research contributes to the ongoing exploration of the cognitive processes involved in the evaluation and perception of synthetic videos, and underscores the need for further theory development concerning deepfake exposure.


Key findings
Visual distortions and deepfake artifacts, particularly when both visuals and narration were synthetic, reliably lowered message credibility and perceived source vividness. Despite these negative effects on credibility, the manipulations did not significantly affect objective learning outcomes from the video. The study also found a consistent negative association between a participant's belief that a video had been digitally altered and that video's perceived credibility, regardless of whether alterations were actually present.
Approach
The researchers conducted three preregistered online between-subjects experiments using a custom-produced educational video. Participants viewed versions of the video with visual distortions (e.g., a ghosting filter), auditory distortions (e.g., echo), audio-visual asynchrony, or deepfake-generated faces and/or narration. They then completed self-report measures of message credibility, processing fluency, and source vividness, along with knowledge tests assessing learning outcomes.
Datasets
A custom-produced 447-second educational video on virology topics. For deepfake generation, the authors used the Latent Image Animator with "vox-pt" weights for synthetic video and a So-Vits-Svc-fork model fine-tuned on Harvard sentences for synthetic narration.
Model(s)
UNKNOWN
Author countries
Denmark