A complex-valued neural network. It takes complex CQCC features as input and produces an output score between 0 and 1. Details will be made public soon.
The ComplexNet model predicted a deepfake score of . This means that the soundtrack in this video is a deepfake with a probability of percent (out of 100%).
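As a small illustration of the arithmetic in the sentence above, the following Python sketch converts a score in the range 0 to 1 into a percentage. The function name `score_to_percent` and the rounding to one decimal place are assumptions made for this example; they are not taken from the tool itself.

```python
def score_to_percent(score: float) -> float:
    """Convert a deepfake score in [0, 1] to a percentage in [0, 100].

    Hypothetical helper: the name and the one-decimal rounding are
    assumptions for illustration only.
    """
    # Reject values outside the documented output range (this also rejects NaN).
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie between 0 and 1, got {score!r}")
    return round(score * 100.0, 1)
```

For a hypothetical score of 0.873, `score_to_percent(0.873)` returns 87.3.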
Caution: These results are currently of limited value. Among other reasons, deepfake detection models do not (yet) generalise well, new fakes appear daily, and our models evaluate only the audio track, not the video track. If the voice in the audio comes from a voice actor, i.e. an actor who imitates the speech, it is not a synthetically generated voice and is therefore not recognised as a deepfake.
The LCNN model, presented in https://arxiv.org/pdf/2103.11326.pdf, is a light convolutional neural network introduced in 2020. It takes CQT features as input and produces an output score between 0 and 1. This model has been trained only briefly (a single pass over 1.8 million samples).
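To give a rough idea of what CQT (constant-Q transform) features are, here is a minimal Python sketch of a naive, direct-form constant-Q transform for a single frame. It is not the feature extractor used by the LCNN model, whose settings are not stated here. The function name `naive_cqt_frame`, the Hann window and the defaults `fmin=220.0`, `n_bins=24` and `bins_per_octave=12` are all assumptions chosen for illustration. Production pipelines would normally use an optimised library implementation.

```python
import cmath
import math
from typing import List, Sequence


def naive_cqt_frame(
    signal: Sequence[float],
    sample_rate: float,
    fmin: float = 220.0,        # assumed lowest centre frequency in Hz
    n_bins: int = 24,           # assumed number of frequency bins
    bins_per_octave: int = 12,  # assumed resolution (semitone spacing)
) -> List[float]:
    """Return constant-Q magnitudes for one frame starting at sample 0."""
    # Q is the constant ratio of centre frequency to bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    magnitudes = []
    for k in range(n_bins):
        # Centre frequencies are geometrically spaced.
        freq = fmin * 2.0 ** (k / bins_per_octave)
        if freq >= sample_rate / 2.0:
            raise ValueError("bin frequency exceeds the Nyquist limit")
        # The window length shrinks as frequency rises, keeping Q constant.
        length = math.ceil(q * sample_rate / freq)
        if length > len(signal):
            raise ValueError("signal too short for this bin")
        acc = 0j
        for n in range(length):
            window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            acc += window * signal[n] * cmath.exp(-2j * math.pi * q * n / length)
        magnitudes.append(abs(acc) / length)
    return magnitudes
```

With the assumed defaults, a pure 440 Hz tone sampled at 8000 Hz produces its largest magnitude in bin 12, one octave above `fmin`.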
The LCNN model predicted a deepfake score of . This means that the soundtrack in this video is a deepfake with a probability of percent (out of 100%).
TODO
The Whisper-DF model predicted a deepfake score of . This means that the soundtrack in this video is a deepfake with a probability of percent (out of 100%).
A Wav2Vec2-based model, presented in https://github.com/eurecom-asp/SSL_Anti-spoofing. It uses pre-trained weights to derive an embedding vector for the audio file, which is then classified by a neural network.
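The two-stage idea described above (embedding first, classification second) can be sketched in plain Python. This is only a toy stand-in under stated assumptions: the real model derives its features from pre-trained Wav2Vec2 weights and uses its own classifier network, neither of which is reproduced here. The mean pooling, the single logistic layer, and the names `mean_pool`, `sigmoid` and `classify_embedding` are hypothetical choices made for this example.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one fixed-size embedding."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real number to (0, 1)."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def classify_embedding(
    embedding: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Toy classifier head: one linear layer followed by a sigmoid.

    Assumption: a real system would use trained weights and a deeper network.
    """
    if len(embedding) != len(weights):
        raise ValueError("embedding and weights must have the same length")
    logit = sum(w * x for w, x in zip(weights, embedding)) + bias
    return sigmoid(logit)
```

In this sketch the per-frame features would come from the pre-trained encoder, and the weights and bias from training on labelled real and fake audio. The output is a score between 0 and 1, matching the score range used by the other models.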
The SSL-W2V model predicted a deepfake score of . This means that the soundtrack in this video is a deepfake with a probability of percent (out of 100%).
Tell us if you think this audio is fake or authentic: