Multiple Languages

CMU Wilderness Multilingual Speech Dataset

Alan W Black (Language Technologies Institute, Carnegie Mellon University)

2019

This paper describes the CMU Wilderness Multilingual Speech Dataset, which covers over 700 different languages and provides audio, aligned text, and word pronunciations.

Common Voice: A Massively-Multilingual Speech Corpus

Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler, Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, Gregor Weber

2020

Dataset, Multiple Languages

The Common Voice corpus is a massively-multilingual collection of transcribed speech intended for speech technology research and development. Common Voice is designed for Automatic Speech Recognition purposes but can be useful in other domains (e.g. language identification).

Faked Speech Detection with Zero Prior Knowledge

Sahar Al Ajmi, Khizar Hayat, Alaa M. Al Obaidi, Naresh Kumar, Munaf Najmuldeen

2024

Detection, Dataset, English, Arabic, Multiple Languages

This work introduces a neural network classifier that blindly labels input audio as real or mimicked; here ‘blindly’ means detecting mimicked audio without references or real sources.

MLAAD: The Multi-Language Audio Anti-Spoofing Dataset

Nicolas M. Müller, Piotr Kawa, Wei Herng Choong, Edresson Casanova, Eren Gölge, Thorsten Müller, Piotr Syga, Philip Sperl, Konstantin Böttinger

2024

Dataset, Multiple Languages

This paper presents the Multi-Language Audio Anti-Spoofing Dataset (MLAAD), created using 82 TTS models spanning 33 different architectures to generate 378.0 hours of synthetic voice in 38 different languages.

One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech

Tomáš Nekvinda, Ondřej Dušek

2020

Generation, Multiple Languages

This paper introduces an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous approaches.

OpenSLR

Jan “Yenda” Trmal

N/A

Dataset, Multiple Languages

OpenSLR is a site devoted to hosting speech and language resources, such as training corpora for speech recognition, and software related to speech recognition.

NSF Award #2346473
