Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems
Authors: Lee Shih Kuang
Published: 2024-03-02 08:19:58+00:00
Comment: https://github.com/shihkuanglee/ADFA
AI Summary
This study introduces three novel signal analysis methods, Arbitrary Analysis (AA), Mel Scale Analysis (MA), and Constant Q Analysis (CQA), for replay speech detection in automatic speaker verification (ASV) systems. Inspired by the Fourier inversion formula, these methods offer new perspectives by using alternative sinusoidal sequence groups. Their efficacy is demonstrated on the ASVspoof 2019 & 2021 PA databases, especially when they are integrated with the Temporal Autocorrelation of Speech (TAC) feature. In addition, CQA is approximately 2.36 times as fast as the conventional constant Q transform (CQT), and MA is more effective than conventional methods.
Abstract
This study proposes novel signal analysis methods for replay speech detection in automatic speaker verification (ASV) systems. The proposed methods, arbitrary analysis (AA), mel scale analysis (MA), and constant Q analysis (CQA), are inspired by the calculation of the Fourier inversion formula. They offer new perspectives on signal analysis for replay speech detection by employing alternative groups of sinusoidal sequences. Their efficacy is examined experimentally on the ASVspoof 2019 & 2021 PA databases and confirmed by the performance of systems that incorporate them; it is further supported by their successful integration with a speech feature that calculates the temporal autocorrelation of speech (TAC) from complex spectra. Moreover, the proposed CQA and MA methods outperform conventional methods in analyzing speech signals: CQA in efficiency (approximately 2.36 times as fast as the conventional constant Q transform (CQT) method) and MA in efficacy. This makes them promising for music and speech processing work.
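To make the idea of "alternative sinusoidal sequence groups" more concrete, the following is a minimal sketch and not the author's implementation (that is available at the repository linked above). It assumes that the analysis amounts to projecting a signal frame onto complex sinusoids whose frequencies need not lie on the uniform DFT grid. All function names, the HTK-style mel formula, and the parameter values are assumptions made for illustration only.

```python
import numpy as np

def sinusoid_bank(freqs_hz, n_samples, sample_rate):
    """One complex sinusoid exp(-j*2*pi*f*n/fs) per row, for each analysis frequency f."""
    n = np.arange(n_samples)
    return np.exp(-2j * np.pi * np.outer(freqs_hz, n) / sample_rate)

def hz_to_mel(f):
    # Common HTK-style mel mapping (assumed here; the paper may use another).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_spaced(f_min, f_max, n_bins):
    """Frequencies equally spaced on the mel scale (hypothetical MA-like grid)."""
    return mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_bins))

def geometric_spaced(f_min, bins_per_octave, n_bins):
    """Geometrically spaced frequencies (hypothetical constant-Q-like grid)."""
    return f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)

def analyze(frame, freqs_hz, sample_rate):
    """Complex spectrum of `frame` at the given, possibly non-uniform, frequencies."""
    frame = np.asarray(frame, dtype=float)
    return sinusoid_bank(freqs_hz, len(frame), sample_rate) @ frame

# Usage: analyze a synthetic tone on a mel-spaced frequency grid.
fs = 16000
n = np.arange(400)
mel_freqs = mel_spaced(100.0, 4000.0, 40)
tone = np.cos(2 * np.pi * mel_freqs[10] * n / fs)
mel_spectrum = analyze(tone, mel_freqs, fs)  # complex values, shape (40,)
```

With uniformly spaced frequencies `k * fs / N`, `analyze` reduces to the ordinary DFT, so the only thing that changes between the variants in this sketch is the choice of frequency grid. Because the output is a complex spectrum, it could in principle feed a feature computed from complex spectra, which is how the abstract describes TAC.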