ADD 2022

Authors:
Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu

 

Data Creation Method:
The dataset for the challenge consists of training, development, adaptation, and test sets. The audio samples were selected from publicly available Mandarin corpora AISHELL-1, AISHELL-3, and AISHELL-4, and were processed using various speech synthesis and voice conversion algorithms.

 

Number of Speakers:

  • Training and Development: 40 male and 40 female speakers selected from the AISHELL-3 corpus

Total Size:

  • 85 hours

Number of Real Samples:

  • Low-Quality Fake Audio Detection (LF): 300
  • Partially Fake Audio Detection (PF): 0

Number of Fake Samples:

  • Low-Quality Fake Audio Detection (LF): 700
  • Partially Fake Audio Detection (PF): 1052
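
The real and fake counts listed above imply a class balance per track. The following is a minimal sketch that computes it; the counts are copied from this card, while the `COUNTS` dictionary and the helper names are hypothetical and exist only for illustration.

```python
# Counts copied from the "Number of Real Samples" and "Number of Fake Samples"
# lists in this card. The dictionary layout is an illustrative assumption.
COUNTS = {
    "LF": {"real": 300, "fake": 700},
    "PF": {"real": 0, "fake": 1052},
}


def fake_fraction(real: int, fake: int) -> float:
    """Return the share of fake utterances among all listed utterances."""
    total = real + fake
    if total == 0:
        raise ValueError("no samples listed")
    return fake / total


def summarize(counts: dict) -> dict:
    """Map each track to (total samples, fraction of fake samples)."""
    return {
        track: (c["real"] + c["fake"], fake_fraction(c["real"], c["fake"]))
        for track, c in counts.items()
    }


if __name__ == "__main__":
    for track, (total, frac) in summarize(COUNTS).items():
        print(f"{track}: {total} samples, {frac:.1%} fake")
```

By these counts the LF samples are 70% fake, and the PF samples contain no real utterances at all, so the listed PF samples cannot by themselves supply both classes for training or validating a detector.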

Description of the Dataset:

  • The dataset includes genuine and fake utterances with various noises and disturbances. Real utterances were selected from AISHELL-3.

 

Extra Details:
The dataset supports developing methods or algorithms to distinguish generated audio from real audio. It involves detecting various forms of audio deepfakes.

The ADD 2022 challenge data comprise training, development, adaptation, and test sets that cover several audio deepfake detection scenarios. The sets are built from selections of the AISHELL-1, AISHELL-3, and AISHELL-4 Mandarin speech corpora, with a diverse range of speakers and no overlap among the sets, so that model testing is not contaminated by training data.

 

Data Type:

  • WAV files

Average Length:

  • Not specified
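
Because the average length is not specified, it can be measured directly from the audio once the files are available. The following is a minimal sketch, assuming the files are uncompressed PCM WAV (the only kind Python's standard `wave` module reads) and sit under a single directory; the function names and the example path are hypothetical.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path) -> float:
    """Duration of one PCM WAV file: frame count divided by sample rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def average_duration_seconds(directory) -> float:
    """Mean duration over every .wav file found under `directory`."""
    durations = [
        wav_duration_seconds(p) for p in sorted(Path(directory).rglob("*.wav"))
    ]
    if not durations:
        raise ValueError(f"no .wav files found under {directory}")
    return sum(durations) / len(durations)


# Example call (the path is a placeholder, not the dataset's real layout):
# print(average_duration_seconds("path/to/ADD2022/train"))
```

Multiplying the mean by the number of files also gives a quick check against the total size stated above.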

Keywords:

  • Audio deepfake, Fake detection, Low-quality fake, Partially fake, Audio fake game

When Published:

  • 26 Feb 2022

 


Usage Scenarios:
Developing methods or algorithms to distinguish generated audio from real audio.

 

Miscellaneous Information:
The challenge dataset includes diverse audio deepfake scenarios for testing and improving detection systems.

 

Credits:
Datasets Used:

  • AISHELL-1, AISHELL-3, AISHELL-4

Speech Synthesis Models Referenced:

  • Various speech synthesis and voice conversion algorithms

Dataset Link


Main Paper Link


License Link


Last Accessed: 6/24/2024

NSF Award #2346473