Bridge2AI-Voice

Authors:
Alistair Johnson, Jean-Christophe Bélisle-Pipon, David Dorr, Satrajit Ghosh, Philip Payne, Maria Powell, Anaïs Rameau, Vardit Ravitsky, Alexandros Sigaras, Olivier Elemento, Yael Bensoussan

 

Abstract:
The dataset comprises voice recordings and associated clinical information aimed at enabling the development, benchmarking, and validation of machine-learning models for diagnosing a wide range of health conditions. The initial release includes spectrograms, acoustic features, phonetic and prosodic features, and transcriptions derived from the raw audio data.

 

Data Creation Method:
Voice recordings and corresponding clinical data were collected from participants across five North American sites. Participants were selected based on conditions known to affect voice, including voice disorders, neurological disorders, mood disorders, and respiratory disorders. Data collection followed standardized protocols covering demographic information, health questionnaires, and specific voice recording tasks.

 

Number of Speakers:

  • 306 participants

Total Size:

  • 12,523 recordings

Number of Real Samples:

  • 12,523 recordings from actual participants

Number of Fake Samples:

  • None reported

 

Extra Details:
The dataset focuses on five disease categories where voice changes are associated with specific conditions: Vocal Pathologies, Neurological and Neurodegenerative Disorders, Mood and Psychiatric Disorders, Respiratory Disorders, and Pediatric Diseases.

 

Data Type:

  • Derived data including spectrograms, acoustic features, phonetic and prosodic features, and transcriptions.
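
To illustrate what a spectrogram-style derived feature is, the following minimal sketch computes short-time Fourier magnitudes for a synthetic tone using only the Python standard library. The function name `stft_magnitude`, the frame length, the hop size, and the sample rate are all assumptions chosen for illustration; the parameters actually used to produce the dataset's spectrograms are not stated here.

```python
import cmath
import math

def stft_magnitude(samples, frame_len=64, hop=32):
    """Naive short-time Fourier transform: returns a list of frames,
    each a list of magnitudes for bins 0..frame_len/2."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(abs(acc))
        frames.append(bins)
    return frames

# Synthetic 1 kHz tone sampled at 8 kHz (illustrative values, not dataset settings).
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = stft_magnitude(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` is one time frame and each column one frequency bin, which is the layout a spectrogram image visualizes. A production pipeline would normally use an FFT library rather than this direct summation.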

Average Length:

  • Not specified in the available information

Keywords:

  • Voice, Biomarker, Health, AI, Machine Learning, Speech Analysis, Clinical Diagnosis

When Published:

  • November 27, 2024

 

Annotation Process:
Derived features were extracted using tools such as openSMILE, Praat, Parselmouth, and OpenAI’s Whisper Large model. To protect privacy, the data were de-identified by removing HIPAA Safe Harbor identifiers and by removing the transcripts of free speech audio.
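
The de-identification step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`name`, `task`, `transcript`, and so on), the task label `free_speech`, and the helper `deidentify` are hypothetical, and the identifier set is a small subset of the HIPAA Safe Harbor categories, not the dataset's actual schema or pipeline.

```python
# Hypothetical field names: a small illustrative subset of Safe Harbor
# identifier types, not the dataset's real schema.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "street_address", "birth_date"}

# Hypothetical label for tasks whose transcripts could reveal identity.
FREE_SPEECH_TASKS = {"free_speech"}

def deidentify(record):
    """Return a copy of the record without direct identifiers and,
    for free speech tasks, without the transcript."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if cleaned.get("task") in FREE_SPEECH_TASKS:
        cleaned.pop("transcript", None)
    return cleaned

# Made-up example records for demonstration only.
records = [
    {"participant_id": "p001", "name": "Jane Doe",
     "task": "free_speech", "transcript": "Last week I visited my sister."},
    {"participant_id": "p001", "name": "Jane Doe",
     "task": "sustained_vowel", "transcript": "ahh"},
]

cleaned = [deidentify(r) for r in records]
```

In this sketch the transcript is kept for scripted tasks such as a sustained vowel, since only free speech is assumed to carry identifying content.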

 

Usage Scenarios:
The dataset is intended for research into using voice as a biomarker for health, facilitating the development of AI models for screening, diagnosis, and treatment of various diseases.

 

 

Data Accessibility:
Access is restricted to credentialed users who sign the Data Use Agreement and complete required training. The dataset is available through the Health Data Nexus platform.
https://docs.b2ai-voice.org/

Dataset Link


Main Paper Link


License Link


Last Accessed: 7/18/2024

NSF Award #2346473