Interpretable facial dynamics as behavioral and perceptual traces of deepfakes

Authors: Timothy Joseph Murphy, Jennifer Cook, Hélio Clemente José Cuve

Published: 2026-04-23 15:07:30+00:00

Comment: Main paper: 19 pages, 5 figures, 4 tables. SI Appendix: 11 pages, 3 figures, 6 tables

AI Summary

This study introduces an interpretable deepfake detection method based on bio-behavioral facial dynamics: it identifies core low-dimensional patterns of facial movement and derives temporal features from them. Traditional machine learning classifiers trained on these features achieved modest but significant above-chance deepfake classification, with detection substantially more accurate for videos containing emotive expressions. The study also compares model decisions with human perceptual judgments, revealing context-dependent convergence for emotive content but divergent underlying detection strategies.

Abstract

Deepfake detection research has largely converged on deep learning approaches that, despite strong benchmark performance, offer limited insight into what distinguishes real from manipulated facial behavior. This study presents an interpretable alternative grounded in bio-behavioral features of facial dynamics and evaluates how computational detection strategies relate to human perceptual judgments. We identify core low-dimensional patterns of facial movement, from which we derive temporal features characterizing their spatiotemporal structure. Traditional machine learning classifiers trained on these features achieved modest but significant above-chance deepfake classification, driven by higher-order temporal irregularities that were more pronounced in manipulated than in real facial dynamics. Notably, detection was substantially more accurate for videos containing emotive expressions than those without. An emotional valence classification analysis further indicated that emotive signals are systematically degraded in deepfakes, explaining the differential impact of emotive dynamics on detection. Furthermore, we provide an additional and often overlooked dimension of explainability by assessing the relationship between model decisions and human perceptual detection. Model and human judgments converged for emotive but diverged for non-emotive videos, and even where outputs aligned, underlying detection strategies differed. These findings demonstrate that face-swapped deepfakes carry a measurable behavioral fingerprint, most salient during emotional expression. Additionally, model-human comparisons suggest that interpretable computational features and human perception may offer complementary rather than redundant routes to detection.


Key findings
Interpretable temporal features derived from facial dynamics provide a diagnostic fingerprint for deepfake detection, particularly for emotive expressions, where detection was significantly more accurate. Face-swapping systematically degrades emotion-related facial dynamics, contributing to this effect. While model and human judgments converge for emotive videos, their underlying detection strategies diverge, suggesting complementary rather than redundant routes to detection.
Approach
The approach extracts facial Action Unit (AU) intensities from videos, applies non-negative matrix factorization (NMF) to identify core low-dimensional patterns of facial movement, and then derives temporal features characterizing spatiotemporal structure from representative AUs. These interpretable features are used to train traditional machine learning classifiers for deepfake detection, with a parallel analysis comparing model performance and strategies to human perceptual judgments.
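The paper releases no code, so the sketch below only illustrates the general shape of such a pipeline. It assumes per-video AU intensity time series have already been extracted (e.g., frames x AUs matrices of non-negative intensities from an OpenFace-style tool) and uses scikit-learn's NMF and RandomForestClassifier; the temporal descriptors (mean, standard deviation, lag-1 autocorrelation, mean absolute second difference) and the function names are illustrative placeholders, not the authors' exact feature set or models.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def select_representative_aus(au_matrices, n_components=4):
    """Fit NMF on all videos' frames stacked into one frames x AUs matrix
    (AU intensities must be non-negative) and return, for each component,
    the index of its highest-loading AU as a stand-in for 'representative AUs'."""
    stacked = np.vstack(au_matrices)
    nmf = NMF(n_components=n_components, init="nndsvda", max_iter=500)
    nmf.fit(stacked)
    return np.argmax(nmf.components_, axis=1)  # one AU index per component

def temporal_features(series):
    """Illustrative temporal descriptors of one AU time series."""
    lag1 = (np.corrcoef(series[:-1], series[1:])[0, 1]
            if series.std() > 0 else 0.0)
    return [
        series.mean(),                        # overall intensity level
        series.std(),                         # variability
        lag1,                                 # smoothness (lag-1 autocorrelation)
        np.mean(np.abs(np.diff(series, n=2))),  # higher-order temporal irregularity
    ]

def video_features(au_matrix, representative_aus):
    """Concatenate temporal features of each representative AU's time course."""
    feats = []
    for au_idx in representative_aus:
        feats.extend(temporal_features(au_matrix[:, au_idx]))
    return np.array(feats)

def evaluate(au_matrices, labels):
    """au_matrices: list of frames x AUs arrays, one per video.
    labels: 1 = deepfake, 0 = real. Returns cross-validated ROC-AUC scores."""
    rep_aus = select_representative_aus(au_matrices)
    X = np.vstack([video_features(m, rep_aus) for m in au_matrices])
    y = np.asarray(labels)
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    return cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
```

A fuller replication would compare all four classifiers listed under Model(s) on the same feature matrix and score emotive and non-emotive videos separately, since the paper reports markedly higher detection accuracy for emotive content.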
Datasets
Google DeepFakeDetection (DFD) dataset (subset of FaceForensics++)
Model(s)
Random Forest, C5.0 Boosted Decision Trees, Support Vector Machine (SVM), Logistic Regression
Author countries
United Kingdom