ASVspoof 2021 (DF)

Authors:
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Hector Delgado

 

Abstract:
The deepfake (DF) task of ASVspoof 2021 involves detecting deepfake speech processed with different lossy codecs typically used for media storage. The best-performing system achieved an EER of 15.64%.

 

Data Creation Method:
A fake audio detection task comprising bona fide and spoofed utterances generated using TTS and VC algorithms. Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.

 

Number of Speakers:

  • Training and Development: 30 speakers (20 for training, 10 for development)
  • Evaluation: 48 speakers (21 male, 27 female)

Total Size:

  • Not specified

Number of Real Samples:

  • Not specified

Number of Fake Samples:

  • Not specified (spoofed utterances were generated with more than 100 different spoofing algorithms)

Description of the Dataset:

  • The dataset targets the detection of deepfake speech processed with different lossy codecs used for media storage. It includes genuine (bona fide) utterances and spoofed utterances generated using TTS and VC algorithms.

 

Extra Details:
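Submitted systems were evaluated using the Equal Error Rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. The best-performing system achieved an EER of 15.64%.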
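
As an illustration of the EER metric mentioned above, here is a minimal sketch in Python. It is not the official ASVspoof scoring tool. The function name `compute_eer`, the toy scores, and the convention that a higher score means "more likely bona fide" are all assumptions made for this example.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the Equal Error Rate by sweeping a decision threshold.

    Assumption: a higher score means the utterance is more likely bona fide,
    and an utterance is accepted when its score is >= the threshold.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: spoofed utterances scored at or above t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bona fide utterances scored below t.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest and
            # report their average as the EER estimate.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores for illustration only; they are not taken from the dataset.
    bonafide = [0.9, 0.6, 0.4, 0.8]
    spoof = [0.1, 0.5, 0.7, 0.2]
    print(f"EER = {compute_eer(bonafide, spoof):.2%}")
```

With a finite score set the two error rates may never be exactly equal, so this sketch reports the average of the two at the closest threshold. Production scoring scripts typically interpolate along the error curve instead.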

 

Data Type:

  • pcm files

Average Length:

  • Not specified

Keywords:

  • Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Deepfake detection

When Published:

  • 2021

 

Annotation Process:
Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.

 

Usage Scenarios:
Speaker verification, detecting and countering deepfake speech, enhancing anti-spoofing technologies.

 

Miscellaneous Information:
The dataset covers a wide range of spoofing algorithms and is used for evaluating deepfake speech detection systems. The best-performing detection system achieved an EER of 15.64%.

 

Credits:
Datasets Used:

  • Not specified

Speech Synthesis Models Referenced:

  • Various TTS and VC algorithms

Dataset Link


Main Paper Link


License Link


Last Accessed: 7/1/2024

NSF Award #2346473