Diffuse or Confuse: A Diffusion Deepfake Speech Dataset
Authors: Anton Firc, Kamil Malinka, Petr Hanáček
Published: 2024-10-09 11:51:08+00:00
Comment: Presented at International Conference of the Biometrics Special Interest Group (BIOSIG 2024)
Journal Ref: 2024 International Conference of the Biometrics Special Interest Group (BIOSIG)
AI Summary
This paper introduces a novel deepfake speech dataset generated using diffusion models to evaluate their impact on current deepfake detection systems. The study compares diffusion-generated deepfakes with non-diffusion ones, assessing their quality and detectability. Findings suggest that diffusion-based deepfakes are generally comparable to non-diffusion deepfakes in terms of detection, with some variability across detector architectures.
Abstract
Advancements in artificial intelligence and machine learning have significantly improved synthetic speech generation. This paper explores diffusion models, a novel method for creating realistic synthetic speech. We create a dataset of diffusion-generated deepfake speech using available tools and pretrained models. Additionally, this study assesses the quality of diffusion-generated deepfakes relative to non-diffusion ones and the threat they pose to current deepfake detection systems. Findings indicate that diffusion-based deepfakes are detected about as well as non-diffusion deepfakes, with some variability depending on detector architecture. Re-vocoding with diffusion vocoders has minimal impact, and the overall speech quality is comparable to that of non-diffusion methods.
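The comparison of detectability described in the abstract can be illustrated with a minimal sketch. The abstract does not name the metric used, so the equal error rate (EER), a metric commonly reported for deepfake speech detectors, is an assumption here. The function name, the score convention (higher means more likely bona fide), and all score values are hypothetical and are not taken from the paper.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER of a detector from its output scores.

    Assumed convention (not from the paper): a higher score means the
    detector considers the utterance more likely to be bona fide.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide utterances scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed utterances scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical detector scores, invented purely for illustration.
bonafide = [0.9, 0.8, 0.3, 0.6]
diffusion_spoof = [0.1, 0.2, 0.7, 0.4]
non_diffusion_spoof = [0.1, 0.2, 0.3, 0.4]

eer_diffusion = equal_error_rate(bonafide, diffusion_spoof)
eer_non_diffusion = equal_error_rate(bonafide, non_diffusion_spoof)
```

Computing this value once per detector and per spoof set would give a table in which diffusion and non-diffusion deepfakes can be compared side by side. With real data, a threshold sweep over many more scores, or an interpolated EER, would give a finer estimate than this coarse version.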