Authors:
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Hector Delgado
Abstract:
The task involves detecting deepfake speech that has been processed with different lossy codecs typically used for media storage. The best-performing system achieved an EER of 15.64%.
Data Creation Method:
A fake audio detection task comprising bona fide (genuine) and spoofed utterances. Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various text-to-speech (TTS) and voice conversion (VC) algorithms.
Number of Speakers:
- Training and Development: 30 speakers (20 for training, 10 for development)
- Evaluation: 48 speakers (21 male, 27 female)
Total Size:
- Not specified
Number of Real Samples:
- Not specified
Number of Fake Samples:
- Not specified (spoofed utterances were generated with more than 100 different spoofing algorithms)
Description of the Dataset:
- The dataset targets the detection of deepfake speech processed with different lossy codecs used for media storage. It includes genuine utterances and spoofed utterances, the latter generated using a variety of TTS and VC algorithms.
Extra Details:
Systems were evaluated using the Equal Error Rate (EER). The best-performing system achieved an EER of 15.64%.
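For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate (spoofed speech accepted as genuine) equals the false rejection rate (genuine speech rejected). The following is a minimal sketch of one way to estimate it from detection scores, not the official scoring procedure. It assumes that a higher score means "more likely genuine"; the function name `compute_eer` and the toy scores are hypothetical and chosen purely for illustration.

```python
def compute_eer(bona_fide_scores, spoof_scores):
    """Estimate the EER and the threshold where it occurs.

    Assumption: higher score = more likely bona fide.
    """
    # Candidate thresholds: every score observed in either class.
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bona fide utterances scored below the threshold.
        frr = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        # False acceptance rate: spoofed utterances scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, made up for illustration only (not taken from the dataset).
bona_fide = [0.9, 0.8, 0.7, 0.4]
spoofed = [0.6, 0.3, 0.2, 0.1]
eer, threshold = compute_eer(bona_fide, spoofed)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.6
```

On finite score sets the two error rates rarely cross exactly, so this sketch reports their average at the closest threshold; interpolating along the error curve is a common alternative.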
Data Type:
- pcm files
Average Length:
- Not specified
Keywords:
- Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Deepfake detection
When Published:
- 2021
Annotation Process:
Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.
Usage Scenarios:
Speaker verification, detecting and countering deepfake speech, enhancing anti-spoofing technologies.
Miscellaneous Information:
The dataset covers a wide range of spoofing algorithms and is used to evaluate deepfake speech detection systems. The best-performing detection system achieved an EER of 15.64%.
Credits:
Datasets Used:
- Not specified
Speech Synthesis Models Referenced:
- Various TTS and VC algorithms