ASVspoof2021 (LA)

Authors:
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Hector Delgado

Abstract:
The ASVspoof 2021 dataset advances the field of spoofed and deepfake speech detection by introducing more challenging and realistic conditions, fostering the development of robust countermeasures.

Data Creation Method:
The dataset includes three tasks:

Logical Access (LA): Involves detecting synthetic and converted speech injected into communication systems without acoustic propagation.
Physical Access (PA): Involves detecting replay attacks recorded in real physical spaces with various noise and reverberation conditions.
DeepFake (DF): Involves detecting deepfake speech processed with different lossy codecs typically used for media storage.

The data is derived from the VCTK base corpus and includes both bona fide and spoofed speech samples, processed under various conditions such as encoding, transmission, and physical environment variability.

Number of Speakers:

Training and Development: 30 speakers (20 for training, 10 for development)
Evaluation: 48 speakers (21 male, 27 female)

Total Size:

Not specified

Number of Real Samples:

Not specified

Number of Fake Samples:

Sample count not specified; the spoofed speech was generated with more than 100 different spoofing algorithms

Description of the Dataset:

The dataset includes bona fide and spoofed speech samples processed under various conditions such as encoding, transmission, and physical environment variability.

Extra Details:
Not specified.

Data Type:

PCM files

Average Length:

Not specified

Keywords:

Speaker verification, Spoofing, Anti-spoofing, Countermeasure, Deepfake detection

When Published:

2021

Annotation Process:
Genuine utterances were recorded from speakers in controlled environments. Spoofed utterances were generated using various speech synthesis and voice conversion algorithms.

Usage Scenarios:
Developing and evaluating spoofing detection algorithms.
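
As an illustration of the evaluation side, the sketch below computes an equal error rate (EER), one of the metrics commonly reported for spoofing countermeasures. It is a minimal example and not the official ASVspoof scoring tool. The function name compute_eer, the toy scores, and the convention that a higher score means "more likely bona fide" are all assumptions made for this example.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for two lists of detection scores.

    Assumption for this sketch: a higher score means the utterance
    is more likely bona fide. A trial is accepted when score >= threshold.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above
    # the maximum so that "reject everything" is also considered.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap = None
    eer = None
    eer_threshold = None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the first threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
            eer_threshold = t
    return eer, eer_threshold


if __name__ == "__main__":
    # Hypothetical countermeasure scores, not taken from the dataset.
    bonafide = [0.9, 0.6, 0.4, 0.8]
    spoof = [0.1, 0.5, 0.7, 0.2]
    eer, threshold = compute_eer(bonafide, spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real data, the two score lists would be obtained by running a countermeasure on the evaluation utterances and splitting its scores by the bona fide and spoof labels.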

Miscellaneous Information:
The dataset provides a comprehensive benchmark for evaluating and improving spoofing detection systems under varied conditions.

Credits:
Datasets Used:

Logical Access (LA) Database: Link
Physical Access (PA) Database: Link
DeepFake (DF) Database: Link

Speech Synthesis Models Referenced:

Various speech synthesis and voice conversion algorithms

DF Database Link

Main Paper Link

License Link

Last Accessed: 6/25/2024

NSF Award #2346473
