ASVspoof 2021: Accelerating Progress in Spoofed and Deepfake Speech Detection

Authors:
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Hector Delgado

 

Abstract:
The task is to detect synthetic and converted speech that is injected into communication systems without acoustic propagation.

 

Data Creation Method:
Bona fide utterances, together with spoofed utterances generated using text-to-speech (TTS) and voice conversion (VC) algorithms, are transmitted across telephony and VoIP networks, which introduces various coding and transmission effects.

 

Number of Speakers:

  • Training and Development: 30 speakers (20 for training, 10 for development)
  • Evaluation: 48 speakers (21 male, 27 female)

Total Size:

  • Not specified

Number of Real Samples:

  • Not specified

Number of Fake Samples:

  • Not specified (spoofed utterances come from more than 100 different spoofing algorithms)

Description of the Dataset:

  • The dataset includes bona fide and spoofed utterances communicated across telephony and VoIP networks with various coding and transmission effects.

 

Extra Details:
The best-performing system achieved a minimum tandem detection cost function (min t-DCF) of 0.2177 and an equal error rate (EER) of 1.32%.
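
For readers unfamiliar with the EER figure quoted above, the following is a minimal sketch of how an equal error rate can be estimated from countermeasure scores. It is not the official ASVspoof scoring tool: the function name compute_eer, the toy scores, and the convention that higher scores mean "more likely bona fide" are all assumptions made for illustration. The min t-DCF is not covered here because it also depends on the error rates of the speaker verification system and on cost parameters that this card does not list.

```python
from typing import Sequence, Tuple


def compute_eer(bona_fide_scores: Sequence[float],
                spoof_scores: Sequence[float]) -> Tuple[float, float]:
    """Return (eer, threshold) for a set of detection scores.

    Assumption: a higher score means "more likely bona fide", and a trial
    is accepted as bona fide when its score is >= the threshold.
    """
    if len(bona_fide_scores) == 0 or len(spoof_scores) == 0:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above all
    # scores so that the "reject everything" operating point is included.
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap = float("inf")
    eer = 1.0
    best_threshold = thresholds[0]
    for t in thresholds:
        # Miss rate: fraction of bona fide trials rejected at threshold t.
        miss_rate = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        # False alarm rate: fraction of spoofed trials accepted at threshold t.
        false_alarm_rate = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            # The EER is where the two error rates meet; with finitely many
            # scores we take the closest point and average the two rates.
            best_gap = gap
            eer = (miss_rate + false_alarm_rate) / 2.0
            best_threshold = t
    return eer, best_threshold


if __name__ == "__main__":
    # Made-up scores for illustration only; not taken from the dataset.
    toy_bona_fide = [0.9, 0.8, 0.7, 0.3]
    toy_spoof = [0.1, 0.2, 0.4, 0.6]
    eer, threshold = compute_eer(toy_bona_fide, toy_spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

The loop is quadratic in the number of trials, which is fine for a small illustration; for evaluation sets of realistic size, a sorted cumulative-count approach would be the more practical choice.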

 

Data Type:

  • PCM files

Average Length:

  • Not specified

Keywords:

  • Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Deepfake detection

When Published:

  • 2021

 

Annotation Process:
Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.

 

Usage Scenarios:
Evaluating and improving detection systems for synthetic and converted speech communicated over telephony and VoIP networks.

 

Miscellaneous Information:
The dataset provides a benchmark for detecting spoofed speech in communication systems with various coding and transmission effects.

 

Credits:
Datasets Used:

  • Text-to-speech (TTS) and voice conversion (VC) algorithms

Speech Synthesis Models Referenced:

  • Various speech synthesis and voice conversion algorithms

Dataset Link


Main Paper Link


License Link


Last Accessed: 7/1/2024

NSF Award #2346473