Source tracing involves identifying the origin of synthetic audio samples. For example, given a set of audio deepfakes circulating on social media, a crucial question is whether these were generated using the same text-to-speech (TTS) model. If they were, it could indicate a single source, potentially revealing coordinated misinformation efforts. This capability is essential for holding developers of TTS and instant voice cloning tools accountable for misuse.
This page describes the partitioning of MLAAD for source tracing: the dataset is split into training, development, and evaluation subsets, and protocols are provided for training and evaluating models.
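For orientation, below is a minimal sketch of how such protocol files might be consumed to inspect split sizes and the per-model label distribution. The file names (mlaad_source_tracing_train.csv, etc.), the CSV format, and the model_name column are illustrative assumptions, not the released protocol format; adjust them to match the actual files.

# Minimal sketch for loading hypothetical source-tracing protocol files.
# Assumed (not confirmed by this page): each protocol is a CSV with one row
# per utterance, containing at least the audio path and the generating TTS
# model name. Column names and file names below are placeholders.
import csv
from collections import Counter
from pathlib import Path


def load_protocol(protocol_path: Path) -> list[dict]:
    """Read one protocol file into a list of {column: value} records."""
    with open(protocol_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def label_distribution(records: list[dict], label_column: str = "model_name") -> Counter:
    """Count how many utterances each source (TTS model) contributes."""
    return Counter(row[label_column] for row in records)


if __name__ == "__main__":
    # Hypothetical file names; substitute the actual protocol files.
    for split in ("train", "dev", "eval"):
        protocol = Path(f"mlaad_source_tracing_{split}.csv")
        if protocol.exists():
            records = load_protocol(protocol)
            print(split, len(records), "utterances,",
                  len(label_distribution(records)), "source models")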
Note: Use of the MLAAD dataset for the Interspeech 2025 Special Session is optional. Researchers may use any publicly available datasets or metrics for submissions. Papers introducing new metrics for source tracing are highly encouraged.
You may cite the protocols as follows:
@misc{UsingMLAADforSourceTracing,
  author       = {Nicolas M{\"u}ller},
  organization = {Fraunhofer AISEC},
  title        = {Using MLAAD for Source Tracing of Audio Deepfakes},
  howpublished = {\url{https://deepfake-total.com/sourcetracing}},
  month        = {11},
  year         = {2024},
}