Authors:
Alistair Johnson, Jean-Christophe Bélisle-Pipon, David Dorr, Satrajit Ghosh, Philip Payne, Maria Powell, Anaïs Rameau, Vardit Ravitsky, Alexandros Sigaras, Olivier Elemento, Yael Bensoussan
Abstract:
The dataset comprises voice recordings and associated clinical information aimed at enabling the development, benchmarking, and validation of machine-learning models for diagnosing a wide range of health conditions. The initial release includes spectrograms, acoustic features, phonetic and prosodic features, and transcriptions derived from the raw audio data.
Data Creation Method:
Voice recordings and corresponding clinical data were collected from participants across five North American sites. Participants were selected based on conditions known to affect voice, including voice disorders, neurological disorders, mood disorders, and respiratory disorders. Data collection involved standardized protocols, including demographic information, health questionnaires, and specific voice recording tasks.
Number of Speakers:
- 306 participants
Total Size:
- 12,523 recordings
Number of Real Samples:
- 12,523 recordings from actual participants
Number of Fake Samples:
- None reported
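As a quick sanity check on the counts above, the average number of recordings per participant can be worked out directly. This is only a sketch: the two counts come from this card, while treating "None reported" as zero fake samples is an assumption, and the function name is illustrative.

```python
# Counts taken from the fields above.
NUM_PARTICIPANTS = 306
NUM_RECORDINGS = 12_523
NUM_FAKE_SAMPLES = 0  # assumption: "None reported" is treated as zero


def mean_recordings_per_participant(recordings: int, participants: int) -> float:
    """Return the average number of recordings contributed per participant."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return recordings / participants


mean_per_participant = mean_recordings_per_participant(NUM_RECORDINGS, NUM_PARTICIPANTS)
real_fraction = (NUM_RECORDINGS - NUM_FAKE_SAMPLES) / NUM_RECORDINGS

print(f"Mean recordings per participant: {mean_per_participant:.1f}")
print(f"Fraction of real samples: {real_fraction:.2f}")
```

The mean says nothing about how recordings are distributed across participants or tasks; that information is not given in this card.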
Description of the Dataset:
- The dataset comprises voice recordings and associated clinical information aimed at enabling the development, benchmarking, and validation of machine-learning models for diagnosing a wide range of health conditions. The initial release includes spectrograms, acoustic features, phonetic and prosodic features, and transcriptions derived from the raw audio data.
Extra Details:
The dataset focuses on five disease categories where voice changes are associated with specific conditions: Vocal Pathologies, Neurological and Neurodegenerative Disorders, Mood and Psychiatric Disorders, Respiratory Disorders, and Pediatric Diseases.
Data Type:
- Derived data including spectrograms, acoustic features, phonetic and prosodic features, and transcriptions.
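To make the term "spectrogram" concrete, here is a minimal sketch of a magnitude spectrogram computed with a naive short-time Fourier transform in pure Python. It is not the pipeline used to produce the released spectrograms; the frame length, hop size, Hann window, and synthetic test tone are all assumptions chosen for illustration.

```python
import cmath
import math
from typing import List


def magnitude_spectrogram(samples: List[float], frame_len: int = 64, hop: int = 32) -> List[List[float]]:
    """Return one list of DFT magnitudes (bins 0..frame_len/2) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k; an FFT library would be used in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic 1 kHz tone standing in for a voice recording (illustrative only).
sample_rate = 8000
tone_hz = 1000.0
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = magnitude_spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(f"{len(spec)} frames x {len(spec[0])} bins, peak near {peak_hz:.0f} Hz")
```

Each row is one time frame and each column one frequency bin, which is the layout a model would typically consume as a 2-D input.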
Average Length:
- Not specified in the available information
Keywords:
- Voice, Biomarker, Health, AI, Machine Learning, Speech Analysis, Clinical Diagnosis
When Published:
- November 27, 2024
Annotation Process:
Derived features were extracted using tools such as openSMILE, Praat, Parselmouth, and OpenAI's Whisper Large model. To protect participant privacy, the data were de-identified: HIPAA Safe Harbor identifiers were removed, as were the transcripts of free speech audio.
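The transcript-removal step described above can be sketched as a simple filter over recording metadata. This is a hedged illustration, not the authors' de-identification code: the record fields (`recording_id`, `task`, `transcript`) and the task label `free_speech` are hypothetical, since this card does not specify the actual schema or task names.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical task label; the real task names are not given in this card.
FREE_SPEECH_TASKS = {"free_speech"}


def drop_free_speech_transcripts(records: List[Record]) -> List[Record]:
    """Return copies of the records with free-speech transcripts blanked out."""
    cleaned = []
    for record in records:
        out = dict(record)  # copy so the input list is left untouched
        if out.get("task") in FREE_SPEECH_TASKS:
            # Free speech may contain names or other identifying details.
            out["transcript"] = None
        cleaned.append(out)
    return cleaned


# Made-up example records for illustration.
records = [
    {"recording_id": "r1", "task": "reading_passage", "transcript": "fixed prompt text"},
    {"recording_id": "r2", "task": "free_speech", "transcript": "open-ended response"},
]

cleaned = drop_free_speech_transcripts(records)
for record in cleaned:
    print(record["recording_id"], record["task"], record["transcript"])
```

Transcripts of scripted tasks are kept in this sketch because their content is fixed by the prompt, whereas open-ended speech can reveal identifying information.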
Usage Scenarios:
The dataset is intended for research into using voice as a biomarker for health, facilitating the development of AI models for screening, diagnosis, and treatment of various diseases.
License Link:
Bridge2AI Voice Registered Access License
Data Accessibility:
Access is restricted to credentialed users who sign the Data Use Agreement and complete the required training. The dataset is available through the Health Data Nexus platform. https://docs.b2ai-voice.org/