Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level Dropin & Neuroplasticity Mechanisms

Authors: Yupei Li, Shuaijie Shao, Manuel Milling, Björn Schuller

Published: 2026-03-25 14:22:32+00:00

Comment: Accepted at IJCNN 2026

AI Summary

This paper introduces novel neuron-level dropin and neuroplasticity algorithms for enhancing efficiency and performance in audio deepfake detection, drawing inspiration from mammalian brain function. These algorithms dynamically adjust the number of neurons in deep learning models to flexibly modulate parameter counts. Evaluated on the ASVspoof2019 LA, PA, and FakeorReal datasets, the methods demonstrate consistent improvements in computational efficiency and significant reductions in Equal Error Rate, achieving state-of-the-art performance on ASVspoof2019 LA.

Abstract

Current audio deepfake detection has achieved remarkable performance using diverse deep learning architectures such as ResNet, and has seen further improvements with the introduction of large models (LMs) like Wav2Vec. The success of large language models (LLMs) further demonstrates the benefits of scaling model parameters, but also highlights a bottleneck: performance gains are constrained by parameter counts. Simply stacking additional layers, as done in current LLMs, is computationally expensive and requires full retraining. Furthermore, existing low-rank adaptation methods are primarily applied to attention-based architectures, which limits their scope. Inspired by the neuronal plasticity observed in mammalian brains, we propose novel algorithms, dropin and, building on it, plasticity, that dynamically adjust the number of neurons in certain layers to flexibly modulate model parameters. We evaluate these algorithms on multiple architectures, including ResNet, Gated Recurrent Neural Networks, and Wav2Vec. Experimental results on the widely recognised ASVspoof2019 LA, PA, and FakeorReal datasets demonstrate consistent improvements in computational efficiency with the dropin approach, as well as maximum relative reductions in Equal Error Rate of around 39% with dropin and 66% with plasticity across these datasets. The code and supplementary material are available at Github link.


Key findings
The proposed dropin and plasticity algorithms consistently improved audio deepfake detection performance, achieving up to a 66% relative reduction in Equal Error Rate. The dropin approach also significantly enhanced computational efficiency, showing reduced backward time per step. Plasticity offered a superior trade-off between accuracy and model size, performing particularly well with larger models like Wav2Vec 2.0 and outperforming LoRA.
Approach
The authors propose 'dropin' and 'plasticity' algorithms. Dropin involves selectively adding neurons to specific layers and training only these new connections while freezing the rest of the network, enhancing efficiency. The plasticity algorithm builds on this by initially adding neurons, retraining the entire model, and then pruning the added neurons to maintain the original model size while preserving performance gains from the expansion phase.
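The dropin and plasticity steps described above can be sketched at the level of a single linear layer. This is a minimal NumPy illustration, not the authors' implementation: the function names `dropin` and `prune`, the layer sizes, and the gradient-masking scheme are assumptions made for the example. It shows the core idea of widening a weight matrix with new neurons, freezing the original rows during updates, and (for plasticity) pruning back to the original width.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropin(W, b, k):
    """Hypothetical dropin step: widen a linear layer by k new output
    neurons. Returns the expanded weights/bias and a mask marking the
    new (trainable) rows; the original rows are meant to stay frozen."""
    n_out, n_in = W.shape
    W_new = np.vstack([W, 0.01 * rng.standard_normal((k, n_in))])
    b_new = np.concatenate([b, np.zeros(k)])
    mask = np.concatenate([np.zeros(n_out), np.ones(k)])  # 1 = trainable
    return W_new, b_new, mask

def prune(W, b, k):
    """Hypothetical final plasticity step: remove the k added neurons,
    restoring the layer to its original width."""
    return W[:-k], b[:-k]

# Toy layer: 4 output neurons, 8 inputs; drop in 2 new neurons.
W = rng.standard_normal((4, 8))
b = np.zeros(4)
W2, b2, mask = dropin(W, b, 2)

# A (fake, all-ones) gradient update touches only rows with mask == 1,
# mimicking dropin's freeze-the-rest training.
grad = np.ones_like(W2)
W2 -= 0.1 * grad * mask[:, None]
assert np.allclose(W2[:4], W)  # original neurons stayed frozen

# Plasticity would retrain the whole model here, then prune back.
W3, b3 = prune(W2, b2, 2)
assert W3.shape == W.shape
```

In a real network the mask would typically be realised via per-parameter `requires_grad` flags or optimizer parameter groups rather than explicit gradient masking, but the effect is the same: only the dropped-in neurons learn while the rest of the layer is frozen.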
Datasets
ASVspoof2019 LA, ASVspoof2019 PA, FakeorReal (FoR)
Model(s)
ResNet18, Gated Recurrent Neural Networks (GRNN), Wav2Vec 2.0
Author countries
United Kingdom, Germany