WAV Files
The dataset for the challenge consists of training, development, adaptation, and test sets.
Ar-DAD: Arabic Diversified Audio Dataset
This dataset contains 15,810 audio clips of 30 popular reciters cantillating verses from the Holy Quran (chapters 78-114).
A dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books.
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus
Speech data for the acquisition of acoustic-phonetic knowledge and the development and evaluation of automatic speech recognition systems.
The M-AILABS Speech Dataset provides extensive audio and text data for training speech recognition and synthesis models. The dataset consists of WAV files.
MP3 Files
Baidu Silicon Valley AI Lab cloned audio (Neural Voice Cloning with a Few Samples)
Cloned audio samples from a neural voice cloning system that learns to synthesize a person’s voice from only a few audio samples.
FoR: Fake or Real Dataset for Synthetic Speech Detection
The dataset includes both real and synthetic speech samples for the purpose of detecting synthetic speech using machine learning and deep learning models.
FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset
FakeAVCeleb is a multimodal dataset featuring synchronized fake audio and video, created to enhance the development of deepfake detection systems capable of identifying both visual and audio manipulations.
PCM Files
Involves detecting deepfake speech processed with different lossy codecs typically used for media storage. The best-performing system achieved an equal error rate (EER) of 15.64%.
Involves replay attacks recorded in real physical spaces with various noise and reverberation conditions.
ASVspoof 2021: Accelerating Progress in Spoofed and Deepfake Speech Detection
Involves detecting synthetic and converted speech injected into communication systems without acoustic propagation.
The ASVspoof 2021 dataset advances the field of spoofed and deepfake speech detection by introducing more challenging and realistic conditions, fostering the development of robust countermeasures.
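For readers unfamiliar with the EER figure quoted for the deepfake detection task, the following is a minimal sketch of how an equal error rate can be estimated from detector scores. It is not the official ASVspoof scoring tool. The function name `equal_error_rate` and the toy scores are made up for illustration, and the sketch assumes that a higher score means the detector considers the clip more likely to be genuine.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means "more likely bona fide".
    A clip is accepted as bona fide when its score is >= the threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False acceptance rate: spoofed clips wrongly accepted as bona fide.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bona fide clips wrongly rejected.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # The EER is read off where the two error rates are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(bonafide, spoof))  # 0.25
```

In this toy case one of four genuine clips is rejected and one of four spoofed clips is accepted at the balancing threshold, giving an EER of 25%. A lower EER indicates a better detector.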
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
The dataset addresses the threat of audio deepfakes by providing generated audio samples from 6 different network architectures across 2 languages.
Other Files
ASVspoof 2015 (The First Automatic Speaker Verification Spoofing and Countermeasures Challenge)
The dataset includes genuine and spoofed speech, partitioned into training, development, and evaluation sets.
ASVspoof 2019 (A large-scale public database of synthesized, converted and replayed speech)
The dataset includes various state-of-the-art spoofing techniques to provide a challenging test bed for anti-spoofing research.
AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
It includes various combinations of real and fake audio-visual segments, providing a comprehensive benchmark for state-of-the-art deepfake detection and localization methods.
H-Voice: Histograms of Original and Fake Voice Recordings
The dataset consists of 6,672 histograms of voice recordings, both original and fake. It is organized into six directories for training, validation, and external testing.
Contains audio-visual recordings from over 6,000 speakers, extracted from YouTube videos.
The dataset comprises voice recordings and associated clinical information aimed at enabling the development, benchmarking, and validation of machine-learning models for diagnosing a wide range of health conditions.
Infographics by Pragya Pandit
Website Design by Lavanya Neelakandan