Interpreting the results
The SSL-W2V model assigned a deepfake score of
This score is the estimated probability, expressed in
percent (out of 100%), that the soundtrack in the video is a deepfake.
Scores near zero suggest authentic, non-fake human speech,
while scores approaching 100% indicate that the audio is fake.
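The mapping from score to verdict can be sketched as a simple threshold rule. The 30/70 cutoffs below are illustrative assumptions only, not values used by the SSL-W2V model:

```python
def interpret_score(score: float) -> str:
    """Map a deepfake score in [0, 100] to a coarse verdict.

    The 30/70 thresholds are illustrative assumptions; the
    actual decision policy depends on the application.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if score < 30.0:
        return "likely authentic"
    if score > 70.0:
        return "likely fake"
    return "uncertain"
```

In practice, where to draw these lines depends on whether false alarms or missed fakes are more costly in a given use case.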
What has been analyzed?
The analysis was conducted solely on the audio waveform between second
0 and second 30.
No information beyond this duration, nor any metadata such as filename or recording date,
was considered. Additionally, the video component has not been analyzed (yet).
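Restricting the analysis to the first 30 seconds amounts to a simple slice of the waveform. A minimal sketch, assuming a mono signal stored as a NumPy array with a known sample rate:

```python
import numpy as np

def first_30_seconds(waveform: np.ndarray, sample_rate: int) -> np.ndarray:
    """Keep only the samples between second 0 and second 30.

    Clips shorter than 30 seconds pass through unchanged.
    """
    return waveform[: 30 * sample_rate]
```

At a 16 kHz sample rate, for example, this keeps at most 480,000 samples.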
On the 'SSL-W2V' model
SSL-W2V is a Wav2Vec2-based model, as presented in https://github.com/eurecom-asp/SSL_Anti-spoofing. It uses pre-trained weights to derive an embedding vector for the audio file, which is then classified by a neural network.
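The two-stage design (a pre-trained front end producing an embedding, then a neural back end producing a score) can be sketched as below. Everything here is a stand-in chosen purely to illustrate the data flow: a fixed random projection replaces the actual wav2vec 2.0 front end, and a single logistic layer replaces the actual classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 256  # hypothetical embedding size

def embed(waveform: np.ndarray) -> np.ndarray:
    # Stand-in for the pre-trained wav2vec2 front end:
    # a fixed random projection to an EMBED_DIM-dimensional vector.
    proj = rng.standard_normal((EMBED_DIM, waveform.shape[0]))
    return proj @ waveform

def deepfake_score(embedding: np.ndarray) -> float:
    # Stand-in for the neural-network back end: a logistic layer
    # mapping the embedding to a deepfake score in [0, 100].
    w = rng.standard_normal(EMBED_DIM)
    logit = float(w @ embedding)
    return 100.0 / (1.0 + np.exp(-logit))
```

The real system differs in every particular (learned weights, deep classifier head), but the shape of the pipeline, waveform to embedding to score, is the same.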
Please note: the outcomes provided here are still in the experimental stage.
This is partly because current deepfake detection models have limited generalization capabilities,
as they do not (yet) effectively adapt to the rapidly evolving nature of new deepfakes
that emerge daily. For more information, see
our paper on the subject.
Additionally, the model focuses solely on analyzing the audio track
and does not evaluate the video component.
Consequently, if the voice in the audio is that of a voice actor mimicking someone else's speech,
our system does not flag it as a synthetically generated voice.