Collab: Fostering Critical Identification of Deepfake Videos on Social Media via Synergistic Annotation

Authors: Shuning Zhang, Linzhi Wang, Shixuan Li, Yuanyuan Wu, Yuwei Chuai, Luoxi Chen, Xin Yi, Hewu Li

Published: 2026-01-24 08:32:23+00:00

Comment: To be published in CHI'26: 10.1145/3772318.3790501

AI Summary

Collab introduces a web plugin for collaboratively identifying deepfake videos on social media, addressing challenges posed by dynamic artifacts and inadequate user tools. It features an intuitive spatio-temporal labeling interface, a novel confidence-weighted spatio-temporal IoU aggregation algorithm, and a hierarchical demonstration strategy to guide users. An online study demonstrated that Collab significantly improved deepfake identification accuracy and enhanced critical reflection among participants.

Abstract

Identifying deepfake videos on social media platforms is challenged by dynamic spatio-temporal artifacts and inadequate user tools. This hinders both critical viewing by users and scalable moderation on platforms. Here, we present Collab, a web plugin enabling users to collaboratively annotate deepfake videos. Collab integrates three key components: (i) an intuitive interface for spatio-temporal labeling where users provide confidence scores and rationales, facilitating detailed input even from non-experts, (ii) a novel confidence-weighted spatio-temporal Intersection-over-Union (IoU) algorithm that aggregates diverse user annotations into an accurate consensus, and (iii) a hierarchical demonstration strategy that presents aggregated results to guide attention toward contentious regions and foster critical evaluation. A seven-day online study (N=90), in which participants annotated suspicious videos while browsing an online experimental platform, compared Collab against two conditions, one without aggregation and one without demonstration. Collab significantly improved identification accuracy and enhanced reflection compared to the non-demonstration condition, and outperformed the non-aggregation condition in novelty and effectiveness.


Key findings
Collab achieved an F1-score of 0.883, significantly improving deepfake video identification accuracy compared to conditions without aggregation or demonstration. It fostered critical user engagement, leading to more precise, smaller annotations and a shift from generic to specific artifact labels. Participants reported reduced perceived workload and increased reflection, rationality, and overall satisfaction with Collab.
Approach
Collab is a web plugin that lets users collaboratively annotate deepfake videos. It provides an intuitive spatio-temporal labeling interface in which users supply confidence scores and rationales, aggregates these diverse annotations with a confidence-weighted spatio-temporal Intersection-over-Union (IoU) algorithm, and presents the aggregated results through a hierarchical demonstration strategy that guides user attention and fosters critical evaluation.
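The summary does not spell out the aggregation algorithm, so the following is only a minimal sketch of what confidence-weighted spatio-temporal IoU aggregation could look like, not the authors' method. It assumes each annotation is a bounding box held over a time interval (a space-time cuboid) with a self-reported confidence, groups annotations greedily by spatio-temporal IoU, and merges each group by confidence-weighted averaging. The names Annotation, st_iou, aggregate, and the 0.3 threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One user label: a box (x1, y1, x2, y2) held over the time span [t1, t2]."""
    x1: float
    y1: float
    x2: float
    y2: float
    t1: float
    t2: float
    confidence: float  # self-reported; assumed here to lie in (0, 1]


def _overlap(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between intervals [a_lo, a_hi] and [b_lo, b_hi]."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def volume(a):
    """Space-time volume of an annotation (area times duration)."""
    return (a.x2 - a.x1) * (a.y2 - a.y1) * (a.t2 - a.t1)


def st_iou(a, b):
    """Spatio-temporal IoU: intersection volume over union volume."""
    inter = (_overlap(a.x1, a.x2, b.x1, b.x2)
             * _overlap(a.y1, a.y2, b.y1, b.y2)
             * _overlap(a.t1, a.t2, b.t1, b.t2))
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def aggregate(annotations, iou_threshold=0.3):
    """Greedy grouping plus confidence-weighted merging (an assumed scheme)."""
    usable = [a for a in annotations if a.confidence > 0]
    clusters = []
    # Visit the most confident annotations first so they seed the clusters.
    for ann in sorted(usable, key=lambda a: a.confidence, reverse=True):
        for cluster in clusters:
            if st_iou(cluster[0], ann) >= iou_threshold:
                cluster.append(ann)
                break
        else:
            clusters.append([ann])

    merged = []
    for cluster in clusters:
        total = sum(a.confidence for a in cluster)
        # Each coordinate is the confidence-weighted mean over the cluster.
        coords = [
            sum(getattr(a, name) * a.confidence for a in cluster) / total
            for name in ("x1", "y1", "x2", "y2", "t1", "t2")
        ]
        merged.append(Annotation(*coords, confidence=total / len(cluster)))
    return merged
```

Under these assumptions, two annotations that cover the same 10x10 region but whose 10-second intervals overlap for only 5 seconds get a spatio-temporal IoU of 1/3, which shows how temporal disagreement lowers the score even when the boxes match exactly. A real deployment would likely work with per-frame regions rather than a single box per annotation.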
Datasets
FaceForensics++, BioDeepAV, DFW, DDL
Model(s)
UNKNOWN
Author countries
China, Luxembourg