D4D-Core Schema
The D4D-Core schema is the curated, interop-focused subset of D4D — the recommended starting point for new datasheets and the canonical surface for systems that exchange datasheets with RO-Crate, FAIRSCAPE, schema.org, DCAT, or Croissant RAI consumers. Every slot in d4d-core is paired with a SKOS-aligned external term in the Semantic Exchange Layer.
The full schema (data_sheets_schema.yaml, ~284 attributes) remains the extended reservoir; d4d-core (~95 fields) is the cross-system interop layer.
Schema artifacts
| Artifact | Path | Description |
|---|---|---|
| Source schema | data_sheets_schema_core.yaml | Core schema entry point (imports D4D_Core.yaml) |
| Core module | D4D_Core.yaml | CoreDataset, CoreDatasetCollection, CoreDistribution and their slots |
| Merged form | data_sheets_schema_core_all.yaml | Single-file merged schema (auto-generated by make gen-core-schema) |
| Base import | D4D_Base_import.yaml | Shared base classes / slots / enums |
Build & validate
make gen-core-schema # produce merged data_sheets_schema_core_all.yaml
make validate-core # linkml-validate on the core schema
make lint-core # linkml-lint on the core module
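The three targets above can also be chained from a script. The following is a minimal sketch, assuming the targets are meant to run in the order listed and that a non-zero exit code should stop the chain; `run_core_checks` and `dry_run` are hypothetical helper names, not part of the D4D repository.

```python
import subprocess
from types import SimpleNamespace

# Target names are taken from the commands listed above; the ordering is an assumption.
CORE_TARGETS = ["gen-core-schema", "validate-core", "lint-core"]


def run_core_checks(runner=subprocess.run):
    """Run each make target in order and stop at the first failure.

    Returns the list of targets that completed with exit code 0.
    """
    completed = []
    for target in CORE_TARGETS:
        result = runner(["make", target])
        if result.returncode != 0:
            break
        completed.append(target)
    return completed


def dry_run(cmd):
    """Hypothetical stand-in for subprocess.run that only prints the command."""
    print("would run:", " ".join(cmd))
    return SimpleNamespace(returncode=0)


# Dry run: nothing is executed, the commands are only printed.
completed = run_core_checks(runner=dry_run)
print("completed:", completed)
```

Calling `run_core_checks()` with no arguments would invoke the real targets through `subprocess.run`, so it only makes sense from the repository root where the Makefile lives.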
Curated example datasheets
Each Bridge2AI generating center has a curated d4d-core-aligned datasheet:
- AI-READI — Retinal imaging and diabetes
- CHORUS — Health data for underrepresented populations
- CM4AI — Cell maps for AI
- VOICE — Voice biomarker
Core classes
| Class | Maps to | Notes |
|---|---|---|
| CoreDataset | schema:Dataset | The primary dataset metadata record (~79 induced slots) |
| CoreDatasetCollection | schema:Dataset (RO-Crate root) + dcat:Catalog | tree_root: true; renders as @id: "./" with @type: ["Dataset", "https://w3id.org/EVI#ROCrate"] |
| CoreDistribution | dcat:Distribution | Concrete download/distribution surface |
| Person, Creator | schema:Person | People referenced in creator, author, contributor, maintainer |
| Organization | schema:Organization | Institutional affiliations and publishers |
| Grant, FundingMechanism | schema:Grant | Funding records linked via schema:funder |
The full crosswalk lives in the Semantic Exchange Layer.
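To make the CoreDatasetCollection rendering concrete, here is a minimal sketch of the JSON-LD node described in the table. Only the `@id` and `@type` values come from this page; the `name` and `hasPart` properties, the `collection_root` helper, and the example identifiers are assumptions chosen for illustration and may not match what the actual exporter emits.

```python
import json

# Type IRI quoted from the Core classes table.
EVI_ROCRATE = "https://w3id.org/EVI#ROCrate"


def collection_root(name, dataset_ids):
    """Build an illustrative RO-Crate root node for a CoreDatasetCollection.

    'name' and 'hasPart' are assumed properties, not confirmed by the schema docs.
    """
    return {
        "@id": "./",
        "@type": ["Dataset", EVI_ROCRATE],
        "name": name,
        "hasPart": [{"@id": dataset_id} for dataset_id in dataset_ids],
    }


# Hypothetical collection with a single member dataset.
root = collection_root("Example collection", ["#dataset-1"])
print(json.dumps(root, indent=2))
```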
Why a "core" subset?
- FAIR interop: every core slot has a documented SKOS mapping to one of schema.org / RO-Crate / FAIRSCAPE EVI / DCAT / Croissant RAI.
- Smaller surface area: ~95 fields is tractable for hand-authoring and AI-assisted authoring; the full schema (~284 attributes) is for full-coverage research datasheets.
- Validation-friendly: make validate-core runs in seconds against typical Bridge2AI inputs.
- RO-Crate round-trip: core ↔ RO-Crate JSON-LD is the supported lossless conversion path; full-schema ↔ RO-Crate may require attribute drops or extension contexts.
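The "every core slot has a documented SKOS mapping" property lends itself to an automated check. The sketch below assumes the mappings are recorded on the slots themselves using LinkML's SKOS-style mapping metaslots (`exact_mappings`, `close_mappings`, and so on); if D4D keeps them in separate Semantic Exchange Layer artifacts instead, the check would need to read those files. The `unmapped_slots` helper and the example slot names are hypothetical.

```python
# LinkML metaslots that correspond to SKOS mapping relations.
MAPPING_KEYS = (
    "exact_mappings",
    "close_mappings",
    "related_mappings",
    "narrow_mappings",
    "broad_mappings",
)


def unmapped_slots(schema):
    """Return the sorted names of slots that carry no SKOS-style mapping."""
    slots = schema.get("slots") or {}
    return sorted(
        name
        for name, spec in slots.items()
        if not any((spec or {}).get(key) for key in MAPPING_KEYS)
    )


# Hypothetical schema fragment, shaped like a parsed LinkML YAML file.
example_schema = {
    "slots": {
        "title": {"exact_mappings": ["schema:name"]},
        "license": {"close_mappings": ["dcat:license"]},
        "internal_note": {"description": "no mapping recorded yet"},
    }
}

missing = unmapped_slots(example_schema)
print(missing)
```

An empty result would indicate that every slot in the inspected schema has at least one mapping, which is the condition the core subset is described as meeting.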
See Semantic Exchange for the mapping artifacts and the /d4d-add-mapping workflow.