
Bridge2AI Data Manifest

This page provides a comprehensive manifest of all data subsets, standards, substrates, topics, and relevant anatomy used across the Bridge2AI consortium.

Each data type is listed with its associated metadata and standards.

Table Features

  • Links are provided to standards, substrates, topics, and anatomy ontologies
  • Icons indicate different types of metadata: 📋 Standards, 💾 Substrates, 🏷️ Topics, 🧬 Anatomy
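Because each identifier in the tables carries its own prefix, an entry can be assigned to its column from the text alone. The following is a minimal sketch of that idea, not an official Bridge2AI tool: the function name `classify_identifiers`, the column names, and the prefix-to-column mapping are assumptions inferred from the identifiers that appear on this page.

```python
import re
from collections import defaultdict

# Assumed mapping from identifier prefix to manifest column,
# inferred from the identifiers shown on this page.
COLUMN_BY_PREFIX = {
    "B2AI_STANDARD": "standards",
    "B2AI_SUBSTRATE": "substrates",
    "B2AI_TOPIC": "topics",
    "UBERON": "anatomy",
    "CLO": "anatomy",
}

# Matches entries such as "B2AI_STANDARD:372 (RO-CRATE)":
# a prefix, a numeric local ID, and a parenthesised label.
ENTRY_PATTERN = re.compile(r"\b([A-Z][A-Z0-9_]*):(\d+) \(([^)]*)\)")


def classify_identifiers(row_text):
    """Group the identifiers found in one manifest row by column."""
    columns = defaultdict(list)
    for prefix, local_id, label in ENTRY_PATTERN.findall(row_text):
        # Prefixes outside the assumed mapping are kept under "unknown".
        column = COLUMN_BY_PREFIX.get(prefix, "unknown")
        columns[column].append((f"{prefix}:{local_id}", label))
    return dict(columns)


# Identifier cells of the "AP-MS - Level 1" row, joined by spaces.
row = (
    "B2AI_STANDARD:372 (RO-CRATE) "
    "B2AI_SUBSTRATE:58 (Mass Spectrometry Data) "
    "B2AI_TOPIC:2 (Cell), B2AI_TOPIC:28 (Proteome) "
    "CLO:0000031 (cell line)"
)
for column, entries in classify_identifiers(row).items():
    print(column, entries)
```

Columns with no entries for a row are simply absent from the returned dictionary, which mirrors the empty cells in the tables.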

Functional Genomics Grand Challenge

Data Type | Description | 📋 Standards & Tools | 💾 Substrates | 🏷️ Topics | 🧬 Anatomy
Cell maps | Cell maps of cellular systems. | B2AI_STANDARD:372 (RO-CRATE) | | B2AI_TOPIC:2 (Cell) | CLO:0000031 (cell line)
Cell line metadata | Cell line metadata. | B2AI_STANDARD:372 (RO-CRATE) | | B2AI_TOPIC:2 (Cell) | CLO:0000031 (cell line)
AP-MS - Level 1 | AP-MS raw signal and metadata. | B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:58 (Mass Spectrometry Data) | B2AI_TOPIC:2 (Cell), B2AI_TOPIC:28 (Proteome) | CLO:0000031 (cell line)
AP-MS - Level 3 | AP-MS peptide counts. | B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:58 (Mass Spectrometry Data) | B2AI_TOPIC:2 (Cell), B2AI_TOPIC:28 (Proteome) | CLO:0000031 (cell line)
Confocal IF - Level 1 | Confocal immunofluorescence cell imaging raw signal (pixel intensity) and metadata. | B2AI_STANDARD:344 (JPEG), B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:56 (Immunofluorescence Image) | B2AI_TOPIC:2 (Cell), B2AI_TOPIC:19 (Microscale Imaging) | CLO:0000031 (cell line)
CROP-seq - Level 1 | CROP-seq raw signal (unaligned reads) and metadata. | B2AI_STANDARD:111 (FASTQ), B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:64 (Perturb-seq Data) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
CROP-seq - Level 2 | CROP-seq aligned reads and metadata. | B2AI_STANDARD:21 (BAM/CRAM), B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:64 (Perturb-seq Data) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
CROP-seq - Level 3 | CROP-seq gene counts and metadata. | B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:64 (Perturb-seq Data), B2AI_SUBSTRATE:43 (Text) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
CROP-seq protocol | CROP-seq assay attributes. | B2AI_STANDARD:372 (RO-CRATE) | | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
CROP-seq sample metadata | CROP-seq sample attributes. | B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:64 (Perturb-seq Data), B2AI_SUBSTRATE:6 (Comma-separated values) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
CROP-seq QC | CROP-seq quality control metrics. | B2AI_STANDARD:372 (RO-CRATE) | B2AI_SUBSTRATE:64 (Perturb-seq Data), B2AI_SUBSTRATE:43 (Text) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)
Evidence | Evidence supporting predictions, modeled as provenance graphs. | B2AI_STANDARD:372 (RO-CRATE), B2AI_STANDARD:444 (EVI) | B2AI_SUBSTRATE:20 (JSON) | B2AI_TOPIC:34 (Transcriptome) | CLO:0000031 (cell line)

Salutogenesis Grand Challenge

Data Type | Description | 📋 Standards & Tools | 💾 Substrates | 🏷️ Topics | 🧬 Anatomy
WGS - Level 1 | Whole genome sequencing raw reads and metadata. | B2AI_STANDARD:111 (FASTQ) | B2AI_SUBSTRATE:61 (DNA Sequence Data) | B2AI_TOPIC:13 (Genome) |
OCT | Optical coherence tomography (retinal layer thickness measurements) and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:67 (Optical coherence tomography data) | B2AI_TOPIC:24 (Ophthalmic Imaging) | UBERON:0003951 (ocular fundus)
Near-IR ophthalmic imaging | Near-infrared ophthalmic imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:65 (Retinal Image) | B2AI_TOPIC:24 (Ophthalmic Imaging) | UBERON:0003951 (ocular fundus)
Retinal fundus imaging | Retinal fundus imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:65 (Retinal Image) | B2AI_TOPIC:24 (Ophthalmic Imaging) | UBERON:0003951 (ocular fundus)
OCTA | Optical coherence tomography angiography and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:68 (Optical coherence tomography angiography data) | B2AI_TOPIC:24 (Ophthalmic Imaging) | UBERON:0003951 (ocular fundus)
FLIO | Fluorescence lifetime imaging ophthalmoscopy and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:66 (Fluorescence Lifetime Imaging Ophthalmoscopy data) | B2AI_TOPIC:24 (Ophthalmic Imaging) | UBERON:0003951 (ocular fundus)
Clinical visit | Data produced during clinical visits, including results of vision assessments. | B2AI_STANDARD:243 (OMOP CDM) | | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:44 (Eye Diseases) | UBERON:0000468 (multicellular organism)
Clinical labs | Clinical laboratory measurements. | B2AI_STANDARD:243 (OMOP CDM) | | B2AI_TOPIC:9 (EHR) | UBERON:0000468 (multicellular organism)
Survey data | Data from patient surveys, including social determinants of health (SDoH), diet, lifestyle, family history, and MoCA cognitive scores. | B2AI_STANDARD:243 (OMOP CDM), B2AI_STANDARD:821 (REDCap) | B2AI_SUBSTRATE:80 (Questionnaire response data) | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:29 (SDoH), B2AI_TOPIC:40 (Governance) |
Glucose levels | Patient glucose data from continuous monitoring over 10 days. | B2AI_STANDARD:246 (Open mHealth) | B2AI_SUBSTRATE:78 (Glucose monitoring data) | B2AI_TOPIC:38 (Glucose Monitoring) | UBERON:0000178 (blood)
Activity levels | Patient activity data from continuous monitoring over 10 days. | B2AI_STANDARD:246 (Open mHealth) | B2AI_SUBSTRATE:73 (Physical activity data), B2AI_SUBSTRATE:74 (Caloric burn data) | B2AI_TOPIC:39 (Activity Monitoring) | UBERON:0000468 (multicellular organism)
Heart rate | Patient heart rate data from continuous monitoring over 10 days. | B2AI_STANDARD:246 (Open mHealth) | B2AI_SUBSTRATE:71 (Heart rate) | B2AI_TOPIC:39 (Activity Monitoring) | UBERON:0000948 (heart)
SpO2 levels | Patient blood oxygen saturation levels from continuous monitoring over 10 days. | B2AI_STANDARD:246 (Open mHealth) | B2AI_SUBSTRATE:72 (Oxygen saturation) | B2AI_TOPIC:39 (Activity Monitoring), B2AI_TOPIC:46 (Respiration) |
ECG | Patient cardiac 12-lead electrocardiogram (waveform) data and metadata. | B2AI_STANDARD:202 (WFDB Format) | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:37 (Waveform) | UBERON:0000948 (heart)
Environmental data | Patient environment and air quality data. Includes temperature, humidity, oxygen, particle, and other measures of the environment. | | | B2AI_TOPIC:11 (Environment) | UBERON:0000468 (multicellular organism)

Precision Public Health Grand Challenge

Data Type | Description | 📋 Standards & Tools | 💾 Substrates | 🏷️ Topics | 🧬 Anatomy
WGS - Level 1 | Whole genome sequencing raw reads and metadata. Unaligned reads are preserved and shared in the same file as aligned reads via CRAM. | B2AI_STANDARD:21 (BAM/CRAM) | | B2AI_TOPIC:13 (Genome) |
WGS - Level 2 | Whole genome sequencing aligned reads and metadata. Unaligned reads are preserved and shared in the same file as aligned reads via CRAM. | B2AI_STANDARD:21 (BAM/CRAM) | | B2AI_TOPIC:13 (Genome) |
WGS - Level 3 | Whole genome sequencing variant calls (SNV/INDEL/CNV/SV) and metadata. Unaligned reads are preserved and shared in the same file as aligned reads via CRAM. | B2AI_STANDARD:299 (VCF) | | B2AI_TOPIC:13 (Genome) |
WGS protocol | Whole genome sequencing protocol and metadata. | | | B2AI_TOPIC:13 (Genome) |
WGS sample metadata | Whole genome sequencing sample attributes. | | | B2AI_TOPIC:13 (Genome) |
WGS QC | Whole genome sequencing quality control metrics. | | | B2AI_TOPIC:13 (Genome) |
CT | Computed tomography (CT) scans and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | B2AI_TOPIC:22 (Neurologic Imaging) |
MR | Magnetic resonance (MR) imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | B2AI_TOPIC:22 (Neurologic Imaging) |
X-ray | X-ray imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | |
Voice | Voice recordings (spectrographs) and metadata. | B2AI_STANDARD:202 (WFDB Format) | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:36 (Voice) | UBERON:0000468 (multicellular organism)
Laryngoscopy | Laryngoscopy videos and metadata. | B2AI_STANDARD:98 (DICOM), B2AI_STANDARD:352 (MPEG-4) | B2AI_SUBSTRATE:19 (Image) | | UBERON:0001737 (larynx)
Speech test assessment | Speech test assessment results. | | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:36 (Voice), B2AI_TOPIC:45 (Voice Disorders) | UBERON:0000468 (multicellular organism)
Spontaneous speech assessment | Spontaneous speech assessment results. | | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:36 (Voice), B2AI_TOPIC:45 (Voice Disorders) | UBERON:0000468 (multicellular organism)
Forced cough assessment | Forced cough assessment results. | | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:36 (Voice), B2AI_TOPIC:45 (Voice Disorders) | UBERON:0000468 (multicellular organism)
Informed consent | Informed consent form responses. | B2AI_STANDARD:821 (REDCap) | | |
Questionnaire | Questionnaire responses, including MoCA, GAD-7, VHI-10, PANAS, and DI. | B2AI_STANDARD:109 (FHIR) | | B2AI_TOPIC:31 (Survey) |
Demographics | Participant demographics information. | B2AI_STANDARD:109 (FHIR) | | B2AI_TOPIC:29 (SDoH), B2AI_TOPIC:31 (Survey) |
Treatment | Participant treatment history information. | B2AI_STANDARD:243 (OMOP CDM) | | B2AI_TOPIC:4 (Clinical Observations) | UBERON:0000468 (multicellular organism)
Diagnosis | Participant diagnosis history information and results of functional assessment. | B2AI_STANDARD:174 (ICD-10-CM) | | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:45 (Voice Disorders) | UBERON:0000468 (multicellular organism)
Vital signs | Participant vital signs information. | B2AI_STANDARD:109 (FHIR) | | B2AI_TOPIC:4 (Clinical Observations) |
Social history | Participant social history information (e.g., smoking and alcohol use). | | | B2AI_TOPIC:29 (SDoH) |

AI/ML for Clinical Care Grand Challenge

Data Type | Description | 📋 Standards & Tools | 💾 Substrates | 🏷️ Topics | 🧬 Anatomy
Clinical labs | Clinical laboratory measurements. | | | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:9 (EHR) |
Clinical treatments | Clinical treatment information, including details of medications administered. | | | B2AI_TOPIC:4 (Clinical Observations) |
Physiologic telemetry | Physiologic telemetry data. | | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:9 (EHR) |
Physiologic EEG | Physiologic electroencephalogram data. | | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:4 (Clinical Observations), B2AI_TOPIC:9 (EHR) | UBERON:0000955 (brain)
ECG | Patient cardiac 5-lead electrocardiogram (waveform) data and metadata. | B2AI_STANDARD:202 (WFDB Format) | B2AI_SUBSTRATE:49 (Waveform Data) | B2AI_TOPIC:37 (Waveform) | UBERON:0000948 (heart)
Heart rate | Patient heart rate data. | | B2AI_SUBSTRATE:71 (Heart rate) | B2AI_TOPIC:39 (Activity Monitoring) | UBERON:0000948 (heart)
SpO2 levels | Patient blood oxygen saturation levels. | | B2AI_SUBSTRATE:72 (Oxygen saturation) | B2AI_TOPIC:46 (Respiration) |
X-ray | X-ray imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | |
CT | Computed tomography (CT) scans and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | B2AI_TOPIC:22 (Neurologic Imaging) |
MR | Magnetic resonance (MR) imaging and metadata. | B2AI_STANDARD:98 (DICOM) | B2AI_SUBSTRATE:11 (DICOM) | B2AI_TOPIC:22 (Neurologic Imaging) |
Social determinants of health | Social determinants of health data based on Area Deprivation Index (ADI). | | | B2AI_TOPIC:29 (SDoH) |
Practice metadata | Metadata about clinical practice. | | | |