Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu
The dataset for the challenge consists of training, development, adaptation, and test sets. The audio samples were selected from the publicly available Mandarin corpora AISHELL-1, AISHELL-3, and AISHELL-4, and were processed with various speech synthesis and voice conversion algorithms.
Number of Speakers:
- Training and Development: 40 male and 40 female speakers selected from the AISHELL-3 corpus
Total Size:
- 85 hours
Number of Real Samples:
- Low-Quality Fake Audio Detection (LF): 300
- Partially Fake Audio Detection (PF): 0
Number of Fake Samples:
- Low-Quality Fake Audio Detection (LF): 700
- Partially Fake Audio Detection (PF): 1052
Description of the Dataset:
- The dataset includes genuine and fake utterances with various noises and disturbances. Real utterances were selected from AISHELL-3.
Extra Details:
The dataset supports the development of methods and algorithms that distinguish generated audio from real audio, covering multiple forms of audio deepfakes.
The ADD 2022 challenge comprises training, development, adaptation, and test sets designed to support the detection of audio deepfakes across varied scenarios. These sets are drawn from the AISHELL-1, AISHELL-3, and AISHELL-4 Mandarin speech corpora, providing a diverse range of speakers with no speaker overlap between sets, which preserves the integrity of model evaluation.
Data Type:
- WAV files
Average Length:
- Not specified
Keywords:
- Audio deepfake, Fake detection, Low-quality fake, Partially fake, Audio fake game
When Published:
- 26 Feb 2022
Datasets Used:
- AISHELL-1, AISHELL-3, AISHELL-4
Speech Synthesis Models Referenced:
- Various speech synthesis and voice conversion algorithms
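Since the samples are distributed as WAV files, they can be inspected with Python's standard `wave` module. The sketch below is purely illustrative: the file name `utt_demo.wav` is a synthetic stand-in generated on the spot, not a file from the ADD 2022 dataset, and the 16 kHz mono format is an assumption rather than a documented property of the corpus.

```python
import math
import struct
import wave

def write_tone(path, freq=440.0, seconds=1.0, rate=16000):
    """Write a mono 16-bit PCM sine tone as a stand-in for a dataset utterance."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)))
            for i in range(int(rate * seconds))
        )
        w.writeframes(frames)

def wav_info(path):
    """Return (sample_rate, num_frames, duration_in_seconds) for a WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        n = w.getnframes()
        return rate, n, n / rate

write_tone("utt_demo.wav")
print(wav_info("utt_demo.wav"))  # → (16000, 16000, 1.0)
```

A check like `wav_info` is a common first step before feature extraction (e.g., computing spectrograms for a fake-audio classifier), since it catches corrupt files and unexpected sample rates early.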