LAKAN: Landmark-assisted Adaptive Kolmogorov-Arnold Network for Face Forgery Detection


Authors: Jiayao Jiang, Siran Peng, Bin Liu, Qi Chu, Nenghai Yu

Published: 2025-10-01 08:10:38+00:00

AI Summary

The paper introduces LAKAN, a novel approach for face forgery detection based on the Kolmogorov-Arnold Network (KAN). LAKAN utilizes facial landmarks as a structural prior to dynamically generate KAN parameters, allowing the network to use learnable splines to model complex, non-linear forgery artifacts. This mechanism guides the image encoder toward critical facial areas, leading to enhanced generalization capabilities.

Abstract

The rapid development of deepfake generation techniques necessitates robust face forgery detection algorithms. While methods based on Convolutional Neural Networks (CNNs) and Transformers are effective, there is still room for improvement in modeling the highly complex and non-linear nature of forgery artifacts. To address this issue, we propose a novel detection method based on the Kolmogorov-Arnold Network (KAN). By replacing fixed activation functions with learnable splines, our KAN-based approach is better suited to this challenge. Furthermore, to guide the network's focus towards critical facial areas, we introduce a Landmark-assisted Adaptive Kolmogorov-Arnold Network (LAKAN) module. This module uses facial landmarks as a structural prior to dynamically generate the internal parameters of the KAN, creating an instance-specific signal that steers a general-purpose image encoder towards the facial regions whose artifacts are most informative. This core innovation tightly couples geometric priors with the network's learning process. Extensive experiments on multiple public datasets show that our proposed method achieves superior performance.
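To make "replacing fixed activation functions with learnable splines" concrete, here is a minimal sketch of a single KAN edge function. It assumes the common KAN formulation, a SiLU base term plus a scaled spline, and for brevity uses degree-1 (piecewise-linear) B-splines on a uniform grid. The names `hat_basis`, `kan_edge`, `w_base`, and `w_spline` are illustrative and do not come from the paper.

```python
import math

def hat_basis(x, grid):
    """Degree-1 B-spline (hat) basis values at x on a uniform knot grid.

    Each basis function peaks at one knot and falls linearly to zero at the
    neighbouring knots; outside the grid all values are zero.
    """
    h = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]

def silu(x):
    """Fixed base activation x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def kan_edge(x, coeffs, grid, w_base=1.0, w_spline=1.0):
    """Learnable univariate function on one KAN edge.

    phi(x) = w_base * silu(x) + w_spline * sum_i coeffs[i] * B_i(x)
    In training, coeffs, w_base and w_spline would all be learned.
    """
    spline = sum(c * b for c, b in zip(coeffs, hat_basis(x, grid)))
    return w_base * silu(x) + w_spline * spline

# Toy check: with the base term switched off, the edge function simply
# interpolates the coefficients linearly between knots.
grid = [-1.0, 0.0, 1.0]
coeffs = [0.0, 1.0, 0.0]
at_knot = kan_edge(0.0, coeffs, grid, w_base=0.0)    # value at the middle knot
between = kan_edge(0.5, coeffs, grid, w_base=0.0)    # halfway to the next knot
```

Changing `coeffs` reshapes the activation itself, which is the property the abstract contrasts with fixed activations such as ReLU.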

Key findings

LAKAN outperforms state-of-the-art baselines in both cross-dataset and cross-manipulation evaluations, with AUC scores above 96% on CDF2. Ablation studies confirm that the LAKAN module yields significant and consistent gains when integrated into various backbone architectures, including ConvNeXt, EfficientNet, and Swin Transformer. The approach is also robust to unseen manipulation types, reaching near-perfect detection accuracy on several FF++ manipulation subsets.
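For readers unfamiliar with the AUC metric quoted above, the sketch below computes ROC AUC as the probability that a randomly chosen fake sample scores higher than a randomly chosen real one, with ties counted as one half. The labels and scores are invented toy values, not results from the paper, and `roc_auc` is an illustrative helper rather than the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for fake (positive), 0 for real (negative).
    scores: predicted probability that the sample is fake.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (made up): two fakes, two reals.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples, so it is only suitable for illustration. On real benchmarks a sorting-based implementation is the usual choice.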

Approach

The method integrates the plug-and-play LAKAN module into a standard image encoder, such as ConvNeXt. LAKAN uses detected facial landmarks to generate a guidance vector via positional embedding and an MLP, which then dynamically configures the spline weights and scalers of the internal KAN layer. This configuration produces an instance-specific gating signal used to modulate the feature map, forcing the model to focus on forgery clues in structural facial regions.
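The data flow described above can be sketched in plain Python as follows. This is a simplified illustration under several assumptions that the summary does not specify: the positional embedding is sinusoidal, the KAN acts per channel on a globally average-pooled descriptor, splines are piecewise linear, the gate is a sigmoid, and all weights are random stand-ins for learned parameters. Every function and parameter name here is hypothetical.

```python
import math
import random

def positional_embedding(landmarks, num_freqs=2):
    """Sinusoidal embedding of (x, y) landmarks normalized to [0, 1]."""
    emb = []
    for x, y in landmarks:
        for k in range(num_freqs):
            f = (2.0 ** k) * math.pi
            emb += [math.sin(f * x), math.cos(f * x),
                    math.sin(f * y), math.cos(f * y)]
    return emb

def init_linear(rng, n_in, n_out):
    """Random weights and zero biases (stand-in for trained parameters)."""
    s = 1.0 / math.sqrt(n_in)
    weight = [[rng.uniform(-s, s) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out

def linear(vec, layer):
    weight, bias = layer
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]

def hat_basis(x, grid):
    """Degree-1 B-spline basis on a uniform grid (zero outside the grid)."""
    h = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]

def silu(x):
    return x / (1.0 + math.exp(-x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lakan_modulate(feature_map, landmarks, params, grid):
    """Gate a C x H x W feature map (nested lists) using landmark guidance."""
    n_grid = len(grid)
    # 1) Landmarks -> positional embedding -> MLP -> guidance vector.
    hidden = [math.tanh(h)
              for h in linear(positional_embedding(landmarks), params["mlp1"])]
    guidance = linear(hidden, params["mlp2"])
    # 2) Guidance vector -> per-instance spline weights and scalers.
    coeffs = linear(guidance, params["to_coeffs"])    # C * n_grid values
    scalers = linear(guidance, params["to_scalers"])  # C values
    # 3) Per-channel KAN on the pooled descriptor -> gate in (0, 1).
    gate = []
    for c, channel in enumerate(feature_map):
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)
        basis = hat_basis(pooled, grid)
        spline = sum(coeffs[c * n_grid + i] * b for i, b in enumerate(basis))
        gate.append(sigmoid(silu(pooled) + scalers[c] * spline))
    # 4) Modulate every spatial position of each channel by its gate.
    gated = [[[v * gate[c] for v in row] for row in channel]
             for c, channel in enumerate(feature_map)]
    return gated, gate

# Toy usage: 2 channels of 2x2 features, 3 made-up landmarks per face.
rng = random.Random(0)
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
lm_a = [(0.3, 0.4), (0.7, 0.4), (0.5, 0.6)]
lm_b = [(0.2, 0.5), (0.8, 0.5), (0.5, 0.8)]
emb_dim, hidden_dim, channels = len(positional_embedding(lm_a)), 8, 2
params = {
    "mlp1": init_linear(rng, emb_dim, hidden_dim),
    "mlp2": init_linear(rng, hidden_dim, hidden_dim),
    "to_coeffs": init_linear(rng, hidden_dim, channels * len(grid)),
    "to_scalers": init_linear(rng, hidden_dim, channels),
}
fmap = [[[0.2, 0.3], [0.1, 0.4]], [[-0.3, 0.1], [0.2, -0.4]]]
out_a, gate_a = lakan_modulate(fmap, lm_a, params, grid)
out_b, gate_b = lakan_modulate(fmap, lm_b, params, grid)
```

The same feature map receives a different gate for `lm_a` and `lm_b`, because the spline coefficients are generated from the landmarks rather than stored as fixed weights. That is the sense in which the gating signal is instance-specific.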

Datasets

FaceForensics++ (FF++), Celeb-DeepFake-v2 (CDF2), DeepFake Detection Challenge (DFDC), DFDCP, FFIW-10K (FFIW)

Model(s)

Landmark-assisted Adaptive Kolmogorov-Arnold Network (LAKAN), ConvNeXt-Base, EfficientNet, Swin Transformer

Author countries

China
