Authors:
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Hector Delgado
Abstract:
The ASVspoof 2021 dataset advances the field of spoofed and deepfake speech detection by introducing more challenging and realistic conditions, fostering the development of robust countermeasures.
Data Creation Method:
The dataset includes three tasks:
- Logical Access (LA): Involves detecting synthetic and converted speech injected into communication systems without acoustic propagation.
- Physical Access (PA): Involves replay attacks recorded in real physical spaces with various noise and reverberation conditions.
- DeepFake (DF): Involves detecting deepfake speech processed with different lossy codecs typically used for media storage.
The data is derived from the VCTK base corpus and includes both bona fide and spoofed speech samples, processed under various conditions such as encoding, transmission, and physical environment variability.
Number of Speakers:
- Training and Development: 30 speakers (20 for training, 10 for development)
- Evaluation: 48 speakers (21 male, 27 female)
Total Size:
- Not specified
Number of Real Samples:
- Not specified
Number of Fake Samples:
- Not specified (the spoofed data covers more than 100 different spoofing algorithms)
Description of the Dataset:
- The dataset includes bona fide and spoofed speech samples processed under various conditions such as encoding, transmission, and physical environment variability.
Extra Details:
Not specified.
Data Type:
- PCM files
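As a rough illustration of working with PCM audio, the sketch below decodes raw sample bytes into floating-point values in [-1.0, 1.0). This card only states "PCM files"; the 16-bit signed little-endian mono layout and the helper name `decode_pcm16` are assumptions made for the example, not properties confirmed for this dataset.

```python
import struct


def decode_pcm16(raw: bytes) -> list[float]:
    """Decode raw PCM bytes into floats in [-1.0, 1.0).

    Assumption (not stated in the dataset card): samples are 16-bit
    signed integers, little-endian, single channel.
    """
    if len(raw) % 2 != 0:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    count = len(raw) // 2
    # "<" = little-endian, "h" = signed 16-bit integer
    samples = struct.unpack(f"<{count}h", raw)
    # Scale by 2**15 so the most negative sample maps to exactly -1.0
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    # Three hand-made samples standing in for real audio data
    demo = struct.pack("<3h", 0, 16384, -32768)
    print(decode_pcm16(demo))
```

If the files turn out to use a container or a different sample width, the unpack format string would need to change accordingly.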
Average Length:
- Not specified
Keywords:
- Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Deepfake detection
When Published:
- 2021
Annotation Process:
Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.
Usage Scenarios:
Developing and evaluating spoofing detection algorithms.
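To make the evaluation scenario concrete, here is a minimal sketch of scoring a detector with the equal error rate (EER), the operating point where the rate of rejected bona fide samples equals the rate of accepted spoofed samples. This card does not name a metric, so the choice of EER, the function name `equal_error_rate`, and the convention that higher scores mean "more likely bona fide" are all assumptions for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a higher score means the detector considers the sample
    more likely to be bona fide. A sample is accepted when score >= threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: bona fide samples scored below the threshold
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed samples scored at or above it
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest
            best_gap, eer = gap, (frr + far) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, not taken from the dataset
    bonafide = [0.9, 0.8, 0.7, 0.3]
    spoofed = [0.1, 0.2, 0.4, 0.6]
    print(equal_error_rate(bonafide, spoofed))
```

This brute-force sweep is quadratic in the number of scores, which is fine for a toy example; a sorted cumulative count would be a better fit for a large evaluation set.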
Miscellaneous Information:
The dataset provides a comprehensive benchmark for evaluating and improving spoofing detection systems under varied conditions.