A Controllable 3D Deepfake Generation Framework with Gaussian Splatting

Authors: Wending Liu, Siyun Liang, Huy H. Nguyen, Isao Echizen

Published: 2025-09-15 06:34:17+00:00

AI Summary

This paper introduces a 3D deepfake generation framework that uses 3D Gaussian Splatting and a parametric head model (FLAME) to enable realistic, identity-preserving face swapping and reenactment with full 3D control. It matches state-of-the-art 2D methods in identity preservation and outperforms them in multi-view consistency and 3D realism.

Abstract

We propose a novel 3D deepfake generation framework based on 3D Gaussian Splatting that enables realistic, identity-preserving face swapping and reenactment in a fully controllable 3D space. Compared to conventional 2D deepfake approaches that suffer from geometric inconsistencies and limited generalization to novel views, our method combines a parametric head model with dynamic Gaussian representations to support multi-view consistent rendering, precise expression control, and seamless background integration. To address editing challenges in point-based representations, we explicitly separate the head and background Gaussians and use pre-trained 2D guidance to optimize the facial region across views. We further introduce a repair module to enhance visual consistency under extreme poses and expressions. Experiments on NeRSemble and additional evaluation videos demonstrate that our method achieves comparable performance to state-of-the-art 2D approaches in identity preservation, as well as pose and expression consistency, while significantly outperforming them in multi-view rendering quality and 3D consistency. Our approach bridges the gap between 3D modeling and deepfake synthesis, enabling new directions for scene-aware, controllable, and immersive visual forgeries. It also reveals the threat that the emerging 3D Gaussian Splatting technique could be used for manipulation attacks.


Key findings
The proposed method matches state-of-the-art 2D approaches in identity preservation while significantly outperforming them in multi-view rendering quality and 3D consistency. It also runs at higher frame rates, which enables real-time face reenactment and raises significant ethical concerns.
Approach
The framework combines FLAME for animatable head modeling with 3D Gaussian Splatting for efficient rendering. It uses a pre-trained 2D face swapping model as supervision for 3D Gaussian attribute optimization, incorporating a refinement module to handle challenging poses and expressions. A 3D background reconstruction and alignment strategy ensures scene coherence.
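The idea of supervising only the head Gaussians with the output of a 2D face swapping model can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the fixed blending matrix `weights` stands in for a real differentiable Gaussian Splatting rasterizer, `target` stands in for a swapped frame (random data here, not SimSwap output), and the function names, learning rate, and step count are all hypothetical. Only per-Gaussian colors are optimized, and background Gaussians are frozen through a mask.

```python
import numpy as np


def photometric_loss(weights, colors, target):
    """Mean squared error between the toy render and the target image."""
    rendered = weights @ colors  # (P, 3): each pixel blends Gaussian colors
    return float(np.mean(np.sum((rendered - target) ** 2, axis=1)))


def optimize_head_colors(weights, colors, is_head, target, lr=0.5, steps=200):
    """Gradient descent on the colors of head Gaussians only.

    weights: (P, G) fixed per-pixel blending weights, rows sum to 1
             (assumption: a toy stand-in for splatting)
    colors:  (G, 3) initial per-Gaussian RGB values
    is_head: (G,) boolean mask, True for head Gaussians
    target:  (P, 3) pseudo ground truth, e.g. a 2D face-swapped frame
    """
    colors = colors.copy()  # leave the caller's array untouched
    mask = is_head[:, None].astype(colors.dtype)
    num_pixels = target.shape[0]
    for _ in range(steps):
        residual = weights @ colors - target
        # Gradient of the mean squared error with respect to the colors.
        grad = 2.0 * weights.T @ residual / num_pixels
        # The mask zeroes the update for background Gaussians.
        colors -= lr * grad * mask
    return colors


# Toy scene: 64 pixels, 10 Gaussians, the first 6 belong to the head.
rng = np.random.default_rng(0)
weights = rng.random((64, 10))
weights /= weights.sum(axis=1, keepdims=True)
colors_init = rng.random((10, 3))
is_head = np.arange(10) < 6
target = rng.random((64, 3))

colors_opt = optimize_head_colors(weights, colors_init, is_head, target)
print("loss before:", photometric_loss(weights, colors_init, target))
print("loss after: ", photometric_loss(weights, colors_opt, target))
```

In a full pipeline the same masking idea would apply to every optimized Gaussian attribute and to renders from multiple camera views, with the loss computed against the 2D model's output for each view.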
Datasets
NeRSemble dataset, additional evaluation videos with real-world scenes and camera settings.
Model(s)
FLAME (parametric head model), 3D Gaussian Splatting, pre-trained 2D face swapping model (SimSwap), CodeFormer (for refinement), COLMAP (for background reconstruction).
Author countries
Japan, Vietnam, Germany