Authors:
Zhizheng Wu, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Cemal Hanilçi, Md Sahidullah, Aleksandr Sizov
Abstract:
ASVspoof 2015 provides a standardized dataset to advance research in spoofing detection for automatic speaker verification (ASV), fostering the development of more generalizable countermeasures and enabling fair comparisons across different systems. The dataset includes genuine and spoofed speech, partitioned into training, development, and evaluation sets.
Data Creation Method:
The dataset includes genuine and spoofed speech collected from 106 human speakers. Genuine speech was recorded without modification, while spoofed speech was created using various speech synthesis (SS) and voice conversion (VC) algorithms.
Number of Speakers:
- Training: 25 speakers (10 male, 15 female)
- Development: 35 speakers (15 male, 20 female)
- Evaluation: 46 speakers (20 male, 26 female)
Total Size:
- Training: 16,375 utterances
- Development: 53,372 utterances
- Evaluation: 193,404 utterances
Number of Real Samples:
- Training: 3,750 genuine utterances
- Development: 3,497 genuine utterances
- Evaluation: 9,404 genuine utterances
Number of Fake Samples:
- Training: 12,625 spoofed utterances
- Development: 49,875 spoofed utterances
- Evaluation: 184,000 spoofed utterances
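The partition figures above can be cross-checked with a few lines of Python. This is a minimal sketch that only restates the counts listed in this summary; the dictionary layout and the helper name `spoof_ratio` are illustrative choices, not part of the dataset or its tooling.

```python
# Utterance counts copied from the partition lists in this summary.
partitions = {
    "training": {"genuine": 3750, "spoofed": 12625, "total": 16375},
    "development": {"genuine": 3497, "spoofed": 49875, "total": 53372},
    "evaluation": {"genuine": 9404, "spoofed": 184000, "total": 193404},
}


def spoof_ratio(counts):
    """Number of spoofed utterances per genuine utterance (hypothetical helper)."""
    return counts["spoofed"] / counts["genuine"]


for name, counts in partitions.items():
    # Genuine and spoofed counts should add up to the stated partition size.
    assert counts["genuine"] + counts["spoofed"] == counts["total"], name
    print(f"{name}: {spoof_ratio(counts):.1f} spoofed per genuine utterance")
```

The sums match the stated totals, and the ratio shows that every partition is imbalanced toward spoofed speech, increasingly so from training to evaluation. Anyone training a detector on this data may want to account for that imbalance.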
Description of the Dataset:
- Each utterance is approximately one to two seconds long.
Extra Details:
The dataset is specifically focused on high-technology speech synthesis and voice conversion spoofing algorithms, excluding low-technology replay and impersonation attacks.
Data Type:
- Audio files
Average Length:
- Approximately 1 to 2 seconds per utterance
Keywords:
- Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Spoofing detection
When Published:
- 2015
Annotation Process:
Genuine utterances were recorded from 106 speakers. Spoofed utterances were created using various SS and VC algorithms.
Usage Scenarios:
Developing and evaluating spoofing detection algorithms.
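As an illustration of the evaluation side, the sketch below computes an approximate equal error rate (EER) from detector scores. The choice of EER as the metric, the function name `equal_error_rate`, and the convention that higher scores mean "more likely genuine" are all assumptions made for this example; none of them is stated in this summary.

```python
import numpy as np


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate EER, assuming higher scores mean 'more likely genuine'."""
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Use every observed score as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([genuine, spoof]))

    # False rejection rate: genuine utterances scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed utterances scored at or above it.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Toy scores for illustration only; these are not ASVspoof 2015 results.
print(equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))
```

This threshold sweep is a simple approximation. It does not interpolate between operating points, so results on small score sets are coarse.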
Miscellaneous Information:
The dataset focuses on high-technology spoofing algorithms and does not cover low-technology replay or impersonation attacks.
Credits:
Datasets Used:
- ASVspoof 2015
Speech Synthesis Models Referenced:
- Various speech synthesis and voice conversion algorithms