Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems


Authors: Lee Shih Kuang

Published: 2024-03-02 08:19:58+00:00

AI Summary

This research introduces novel signal analysis methods (arbitrary analysis, mel scale analysis, and constant Q analysis) inspired by the Fourier inversion formula for replay speech detection in automatic speaker verification. These methods improve efficiency and effectiveness in analyzing speech signals, particularly when integrated with temporal autocorrelation of speech features.

Abstract

This study proposes novel signal analysis methods for replay speech detection in automatic speaker verification (ASV) systems. The proposed methods -- arbitrary analysis (AA), mel scale analysis (MA), and constant Q analysis (CQA) -- are inspired by the calculation of the Fourier inversion formula. They offer new perspectives on signal analysis for replay speech detection by employing alternative groups of sinusoidal sequences. Their efficacy is examined in experiments on the ASVspoof 2019 and 2021 PA databases and confirmed by the performance of systems that incorporate them; it is further supported by their successful integration with temporal autocorrelation of speech (TAC), a speech feature computed from complex spectra. Moreover, the proposed CQA and MA methods outperform conventional methods in efficiency (CQA is approximately 2.36 times as fast as the conventional constant Q transform (CQT) method) and efficacy, respectively, in analyzing speech signals, making them promising for use in music and speech processing.
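
For reference, the standard discrete Fourier transform pair behind the term "Fourier inversion formula" can be written as follows. The notation (x[n], X[k], N) is assumed here for illustration, and the paper's own formulation of AA, MA, and CQA is not reproduced.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N},
\qquad
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{\,j 2\pi k n / N}
```

The second expression rebuilds the signal as a weighted sum of N sinusoidal sequences at uniformly spaced frequencies. The abstract's "alternative sinusoidal sequence groups" plausibly refers to replacing that uniform grid with other sets of frequencies, such as mel-spaced or constant-Q-spaced ones.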

Key findings

The proposed CQA method is approximately 2.36 times faster than the conventional CQT method. The MA method shows superior performance in capturing human speech characteristics, excelling in general replay speech detection. The integration of the proposed methods with temporal autocorrelation features significantly improves the detection of replay attacks.

Approach

The paper proposes three new signal analysis methods: arbitrary analysis (AA), mel scale analysis (MA), and constant Q analysis (CQA). Each calculates spectra using an alternative group of sinusoidal sequences, offering improved efficiency and effectiveness compared to conventional methods such as CQT. The methods are integrated with an existing temporal autocorrelation (TAC) speech feature.
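
The idea of calculating a spectrum from an alternative group of sinusoids can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the Hann window, and the common 2595 * log10(1 + f/700) mel formula are all assumptions made for illustration. The sketch projects one frame onto complex sinusoids at a caller-chosen set of frequencies instead of the uniform DFT grid.

```python
import numpy as np


def mel_spaced_frequencies(f_min, f_max, n_bins):
    """Return n_bins frequencies in Hz, equally spaced on the mel scale.

    Assumption: the widely used 2595 * log10(1 + f / 700) mel formula.
    """
    mel_min = 2595.0 * np.log10(1.0 + f_min / 700.0)
    mel_max = 2595.0 * np.log10(1.0 + f_max / 700.0)
    mels = np.linspace(mel_min, mel_max, n_bins)
    return 700.0 * (10.0 ** (mels / 2595.0) - 1.0)


def analyze_frame(frame, freqs_hz, sample_rate):
    """Project one frame onto complex sinusoids at arbitrary frequencies.

    Hypothetical helper: returns one complex coefficient per frequency,
    so both magnitude and phase remain available to later processing.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    # One row per analysis frequency: exp(-j * 2 * pi * f * n / fs).
    basis = np.exp(-2j * np.pi * np.outer(freqs_hz, n) / sample_rate)
    # A Hann window (an assumption here) reduces leakage between bins.
    return basis @ (frame * np.hanning(len(frame)))


if __name__ == "__main__":
    sample_rate = 16000
    t = np.arange(512) / sample_rate
    tone = np.cos(2.0 * np.pi * 1000.0 * t)  # 1 kHz test tone

    freqs = mel_spaced_frequencies(50.0, 8000.0, 64)
    spectrum = analyze_frame(tone, freqs, sample_rate)
    peak_hz = freqs[np.argmax(np.abs(spectrum))]
    print(f"strongest mel-spaced bin is near {peak_hz:.0f} Hz")
```

Passing a geometrically spaced frequency list to `analyze_frame` instead gives a constant-Q-style layout, and any other list corresponds to an arbitrary layout. The direct matrix product is chosen for clarity and says nothing about the speed advantage the paper reports for CQA.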

Datasets

ASVspoof 2019 & 2021 PA databases

Model(s)

Light Convolutional Neural Network (LCNN)

Author countries

Taiwan
