Using MLAAD for Source Tracing of Audio Deepfakes

Source tracing means identifying the origin of synthetic audio samples. For example, given a set of audio deepfakes circulating on social media, a crucial question is whether they were all generated with the same text-to-speech (TTS) model. If they were, this could indicate a single source and potentially reveal a coordinated misinformation effort. This capability is essential for holding developers of TTS and instant voice cloning tools accountable for misuse.
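
As a minimal sketch of the question above, assume some source-tracing classifier has already assigned each circulating sample a predicted model ID. The file names, model IDs, the `dominant_source` helper, and the 0.8 threshold are all hypothetical; they illustrate only the aggregation step and are not part of any MLAAD tooling.

```python
from collections import Counter


def dominant_source(predictions, min_share=0.8):
    """Check whether one predicted source model dominates a set of samples.

    predictions: dict mapping a sample name to its predicted model ID.
    Returns (model_id, share) if the most frequent model accounts for at
    least `min_share` of the samples, otherwise (None, share).
    """
    if not predictions:
        raise ValueError("no predictions given")
    counts = Counter(predictions.values())
    model_id, n = counts.most_common(1)[0]
    share = n / len(predictions)
    return (model_id if share >= min_share else None), share


# Hypothetical classifier output for five circulating clips.
example_predictions = {
    "clip_1.wav": "tts_model_a",
    "clip_2.wav": "tts_model_a",
    "clip_3.wav": "tts_model_a",
    "clip_4.wav": "tts_model_b",
    "clip_5.wav": "tts_model_a",
}

source, share = dominant_source(example_predictions)
print(source, share)
```

A high share for one model ID would only suggest a common source; it depends entirely on how reliable the underlying classifier is.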

This page describes how MLAAD is partitioned for source tracing. The dataset is divided into training, development, and evaluation subsets, and protocols are provided for training and evaluating models.
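
The subsets described above could be loaded roughly as follows. This sketch assumes a hypothetical protocol format, namely CSV rows with `path`, `model_id`, and `split` columns; the actual MLAAD protocol files may be organized differently, and the file names and model IDs shown are placeholders.

```python
import csv
import io
from collections import defaultdict


def load_protocol(csv_text):
    """Group (path, model_id) pairs by their split name.

    Assumes a header row with the columns path, model_id, and split.
    """
    subsets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        subsets[row["split"]].append((row["path"], row["model_id"]))
    return dict(subsets)


# Placeholder protocol content, not taken from MLAAD.
EXAMPLE_PROTOCOL = """path,model_id,split
a.wav,model_a,train
b.wav,model_b,train
c.wav,model_a,dev
d.wav,model_b,eval
"""

subsets = load_protocol(EXAMPLE_PROTOCOL)
for name, items in subsets.items():
    print(name, len(items))
```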

MLAAD for Source Tracing

Downloads

Note: Use of the MLAAD dataset for the Interspeech 2025 Special Session is optional. Researchers may use any publicly available datasets or metrics for submissions. Papers introducing new metrics for source tracing are highly encouraged.

Languages
Model IDs

Dataset Statistics

Fine-Grained Statistics

Citation

You may cite the protocols as follows:

@misc{UsingMLAADforSourceTracing,
    author = {Nicolas M{\"u}ller},
    organization = {Fraunhofer AISEC},
    title = {Using MLAAD for Source Tracing of Audio Deepfakes},
    howpublished = {\url{https://deepfake-total.com/sourcetracing}},
    month = {11},
    year = {2024},
}