- Description: Create an ethically sourced flagship dataset to enable future research in artificial intelligence and support critical insights into the use of voice as a biomarker of health.
Datasheet for Dataset - Human Readable Format
Why was the dataset created?
What do the instances represent?
| Description |
|---|
| Voice disorders (benign and malignant lesions affecting vocal folds) |
| Neurological and neurodegenerative disorders (including Parkinson's, ALS) |
| Mood and psychiatric disorders (depression, anxiety) |
| Respiratory disorders (cough, breathing sounds) |
How was the data acquired?
| Description | ID | Name |
|---|---|---|
| Time-frequency power spectrograms (513 × N per recording) computed using a short-time FFT with a 25 ms window, 10 ms hop length, and 512-point FFT. | voice:spectrograms | spectrograms.parquet |
| 60 Mel-frequency cepstral coefficients (MFCCs) derived from the spectrograms (60 × N per recording). | voice:mfcc | mfcc.parquet |
| Participant demographics, validated questionnaire responses, and acoustic confounders with one row per unique participant. | voice:phenotype | phenotype.tsv |
| Acoustic features from openSMILE, Praat, parselmouth, and torchaudio with one row per unique recording. | voice:static-features | static_features.tsv |
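
The spectrogram parameters in the table above (25 ms window, 10 ms hop, FFT size) determine the shape of each array. The sketch below shows how those durations map to sample counts and array dimensions. It is an illustration only, not the dataset's processing code: the function name `stft_shape`, the 16 kHz sampling rate, and the centered-padding frame count are assumptions made for the example. Note that a one-sided FFT of size `n_fft` yields `n_fft // 2 + 1` frequency bins, so a 512-point FFT gives 257 bins while 513 bins correspond to a 1024-point FFT; the FFT size actually used is worth confirming against the released files.

```python
def stft_shape(num_samples, sample_rate, win_ms=25.0, hop_ms=10.0, n_fft=512):
    """Return (win_length, hop_length, n_freq_bins, n_frames) for a one-sided STFT.

    Hypothetical helper for illustration; the sampling rate and padding
    convention are assumptions, not values taken from the datasheet.
    """
    # Convert window and hop durations from milliseconds to samples.
    win_length = round(sample_rate * win_ms / 1000)
    hop_length = round(sample_rate * hop_ms / 1000)
    if win_length > n_fft:
        raise ValueError("window length must not exceed the FFT size")

    # A one-sided (real-input) FFT keeps n_fft // 2 + 1 frequency bins.
    n_freq_bins = n_fft // 2 + 1

    # Frame count N, assuming the signal is padded so frames are centered
    # (the default convention in librosa and torchaudio).
    n_frames = 1 + num_samples // hop_length
    return win_length, hop_length, n_freq_bins, n_frames


# One second of audio at an assumed 16 kHz sampling rate.
shape_512 = stft_shape(num_samples=16000, sample_rate=16000, n_fft=512)
shape_1024 = stft_shape(num_samples=16000, sample_rate=16000, n_fft=1024)
print(shape_512)   # (400, 160, 257, 101)
print(shape_1024)  # (400, 160, 513, 101)
```

The MFCC arrays share the same frame count N as the spectrograms they are derived from, so only the first dimension (60 coefficients) differs.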
What (other) tasks could the dataset be used for?
How will the dataset be distributed?
How will the dataset be maintained?
Does the dataset relate to people?