Class: FileCollection
A collection of files with shared characteristics (format, purpose, structure). Represents a logical grouping of related files within a dataset, such as all training data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities via schema:hasPart relationships.
URI: dcat:Dataset
classDiagram
class FileCollection
click FileCollection href "../FileCollection/"
Information <|-- FileCollection
click Information href "../Information/"
FileCollection : collection_type
FileCollection --> "*" FileCollectionTypeEnum : collection_type
click FileCollectionTypeEnum href "../FileCollectionTypeEnum/"
FileCollection : compression
FileCollection --> "0..1" CompressionEnum : compression
click CompressionEnum href "../CompressionEnum/"
FileCollection : conforms_to
FileCollection : conforms_to_class
FileCollection : conforms_to_schema
FileCollection : created_by
FileCollection : created_on
FileCollection : description
FileCollection : doi
FileCollection : download_url
FileCollection : external_resources
FileCollection --> "*" ExternalResource : external_resources
click ExternalResource href "../ExternalResource/"
FileCollection : file_count
FileCollection : id
FileCollection : issued
FileCollection : keywords
FileCollection : language
FileCollection : last_updated_on
FileCollection : license
FileCollection : modified_by
FileCollection : name
FileCollection : page
FileCollection : path
FileCollection : publisher
FileCollection : resources
FileCollection --> "*" Dataset : resources
click Dataset href "../Dataset/"
FileCollection : status
FileCollection : title
FileCollection : total_bytes
FileCollection : version
FileCollection : was_derived_from
Inheritance
- NamedThing
  - Information
    - FileCollection
Slots
| Name | Cardinality and Range | Description | Inheritance |
|---|---|---|---|
| path | 0..1 String | Path or URL to the FileCollection | direct |
| compression | 0..1 CompressionEnum | Compression format if the collection is packaged as a compressed archive (e.g., gzip, zip, bzip2) | direct |
| external_resources | * ExternalResource | External files or URLs referenced by this file collection | direct |
| resources | * Dataset or File or FileCollection | Individual files or nested file collections within this collection | direct |
| collection_type | * FileCollectionTypeEnum | Type(s) of content in this file collection | direct |
| file_count | 0..1 Integer | Number of files in this collection | direct |
| total_bytes | 0..1 Integer | Total size of all files in this collection, in bytes (integer) | direct |
| conforms_to | 0..1 String | An established standard, specification, or schema to which the resource confo... | Information |
| conforms_to_class | 0..1 String | The specific class or type within a schema to which the resource conforms | Information |
| conforms_to_schema | 0..1 String | The schema or data model to which the resource conforms | Information |
| created_by | 0..1 String | The person or organization primarily responsible for creating the resource | Information |
| created_on | 0..1 Datetime | The date and time when the resource was created | Information |
| doi | 0..1 String | Digital Object Identifier (DOI) in format 10.xxxx/xxxxx | Information |
| download_url | 0..1 Uri | URL from which the data can be downloaded | Information |
| issued | 0..1 Datetime | Date of formal issuance or publication of the resource | Information |
| keywords | * String | Keywords or tags describing the resource for discovery and classification | Information |
| language | 0..1 String | Language in which the information is expressed | Information |
| last_updated_on | 0..1 Datetime | The date and time when the resource was most recently modified or updated | Information |
| license | 0..1 String | The legal license under which the resource is made available (e.g., "MIT", "CC-BY-4.0") | Information |
| modified_by | 0..1 String | A person or organization that contributed to modifying or updating the resour... | Information |
| page | 0..1 String | A landing page or web page providing access to or information about the resou... | Information |
| publisher | 0..1 Uriorcurie | The organization or entity responsible for making the resource available | Information |
| status | 0..1 String | The status of the resource (e.g., draft, published, deprecated) | Information |
| title | 0..1 String | The official title of the element | Information |
| version | 0..1 String | The version identifier of the resource (e.g., "1.0", "2.3.1") | Information |
| was_derived_from | 0..1 String | A resource from which this resource was derived, in whole or in part | Information |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human-readable name for a thing | NamedThing |
| description | 0..1 String | A human-readable description for a thing | NamedThing |
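
The cardinalities in the slot table can be read as a record type: `id` is the only required slot, `0..1` slots are optional, and `*` slots are lists. The following is a minimal Python sketch under that reading, not the schema's generated data model. The class name `FileCollectionRecord`, the example `id` and `path` values, and the non-negative check on the integer slots are assumptions made for illustration; only a subset of slots is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileCollectionRecord:
    """Hypothetical in-memory stand-in for a FileCollection instance."""

    id: str                               # cardinality 1 (required)
    name: Optional[str] = None            # 0..1
    path: Optional[str] = None            # 0..1
    compression: Optional[str] = None     # 0..1, a CompressionEnum value
    file_count: Optional[int] = None      # 0..1
    total_bytes: Optional[int] = None     # 0..1
    collection_type: List[str] = field(default_factory=list)  # * (multivalued)
    keywords: List[str] = field(default_factory=list)         # * (multivalued)

    def __post_init__(self) -> None:
        # id is the only slot the table marks as required.
        if not self.id:
            raise ValueError("id is required (cardinality 1)")
        # Assumption: counts and sizes are never negative; the schema
        # only states that both slots are integers.
        for slot in ("file_count", "total_bytes"):
            value = getattr(self, slot)
            if value is not None and value < 0:
                raise ValueError(f"{slot} must be a non-negative integer")


# Example instance; the id and path are made-up placeholder values.
images = FileCollectionRecord(
    id="https://example.org/collections/images",
    name="Image files",
    path="data/images.zip",
    compression="zip",
    file_count=47,
    total_bytes=1073741824,
    collection_type=["raw_data"],
)
```

Omitting `compression` (leaving it as `None`) corresponds to an uncompressed collection or a purely logical grouping, as the slot description states.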
Usages
| used by | used in | type | used |
|---|---|---|---|
| Dataset | file_collections | range | FileCollection |
| DataSubset | file_collections | range | FileCollection |
| FileCollection | resources | any_of[range] | FileCollection |
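
Because `resources` may hold nested FileCollection objects as well as File objects, values such as `file_count` and `total_bytes` can be derived by walking the hierarchy. The sketch below shows one way to do that in Python. `FileEntry`, `CollectionNode`, `size_bytes`, and `summarize` are hypothetical names, and counting files in nested collections toward the parent's totals is an assumption; the schema does not say whether the totals are recursive.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class FileEntry:
    path: str
    size_bytes: int  # hypothetical field holding the file size in bytes


@dataclass
class CollectionNode:
    name: str
    # Mirrors the any_of range on resources: File or FileCollection.
    resources: List[Union["FileEntry", "CollectionNode"]] = field(default_factory=list)


def summarize(node: CollectionNode) -> Tuple[int, int]:
    """Return (file_count, total_bytes), descending into nested collections."""
    count, size = 0, 0
    for item in node.resources:
        if isinstance(item, CollectionNode):
            sub_count, sub_size = summarize(item)
            count += sub_count
            size += sub_size
        else:
            count += 1
            size += item.size_bytes
    return count, size


raw = CollectionNode("raw", [FileEntry("raw/a.csv", 100), FileEntry("raw/b.csv", 250)])
root = CollectionNode("all", [raw, FileEntry("README.md", 50)])
print(summarize(root))  # (3, 400)
```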
Aliases
- file collection
- data files
- file group
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/bridge2ai/data-sheets-schema
Mappings
| Mapping Type | Mapped Value |
|---|---|
| self | dcat:Dataset |
| native | data_sheets_schema:FileCollection |
| exact | schema:Dataset |
| close | dcat:Distribution |
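
The exact mapping to `schema:Dataset`, together with the `schema:hasPart` URI on `resources`, suggests how a collection could be rendered as a Dataset entity in JSON-LD. The Python sketch below illustrates the idea only. The function `to_dataset_entity`, its parameters, and the use of bare `{"@id": ...}` references for parts are assumptions based on common JSON-LD practice, not a serialization defined by this schema.

```python
import json
from typing import Dict, List, Optional


def to_dataset_entity(
    entity_id: str,
    name: Optional[str] = None,
    description: Optional[str] = None,
    part_ids: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Build a schema.org Dataset-style JSON-LD node for one file collection."""
    entity: Dict[str, object] = {"@id": entity_id, "@type": "Dataset"}
    if name is not None:
        entity["name"] = name                # slot_uri schema:name
    if description is not None:
        entity["description"] = description  # slot_uri schema:description
    if part_ids:
        # resources has slot_uri schema:hasPart; parts are referenced by @id.
        entity["hasPart"] = [{"@id": part_id} for part_id in part_ids]
    return entity


# Placeholder identifiers, for illustration only.
entity = to_dataset_entity(
    "data/images/",
    name="Image files",
    part_ids=["data/images/a.png", "data/images/b.png"],
)
print(json.dumps(entity, indent=2))
```

Slots that are unset are left out of the node rather than emitted as null, which keeps optional (`0..1`) slots from appearing as empty properties.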
LinkML Source
Direct
name: FileCollection
description: A collection of files with shared characteristics (format, purpose, structure).
  Represents a logical grouping of related files within a dataset, such as all training
  data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities
  via schema:hasPart relationships.
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
aliases:
- file collection
- data files
- file group
exact_mappings:
- schema:Dataset
close_mappings:
- dcat:Distribution
is_a: Information
slots:
- path
- compression
- external_resources
- resources
slot_usage:
  path:
    name: path
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
  compression:
    name: compression
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
attributes:
  collection_type:
    name: collection_type
    description: Type(s) of content in this file collection. A collection may have
      multiple types, for example a collection containing both raw_data and documentation
      files would have both types listed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:collectionType
    domain_of:
    - FileCollection
    range: FileCollectionTypeEnum
    multivalued: true
  file_count:
    name: file_count
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '47'
    description: Number of files in this collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:fileCount
    domain_of:
    - FileCollection
    range: integer
  total_bytes:
    name: total_bytes
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 1073741824 (1 GiB = 1024³ bytes)
    description: Total size of all files in this collection, in bytes (integer). Maps
      to dcat:byteSize.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    exact_mappings:
    - dcat:byteSize
    rank: 1000
    slot_uri: d4d:total_bytes
    domain_of:
    - FileCollection
    range: integer
class_uri: dcat:Dataset
Induced
name: FileCollection
description: A collection of files with shared characteristics (format, purpose, structure).
  Represents a logical grouping of related files within a dataset, such as all training
  data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities
  via schema:hasPart relationships.
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
aliases:
- file collection
- data files
- file group
exact_mappings:
- schema:Dataset
close_mappings:
- dcat:Distribution
is_a: Information
slot_usage:
  path:
    name: path
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
  compression:
    name: compression
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
attributes:
  collection_type:
    name: collection_type
    description: Type(s) of content in this file collection. A collection may have
      multiple types, for example a collection containing both raw_data and documentation
      files would have both types listed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:collectionType
    alias: collection_type
    owner: FileCollection
    domain_of:
    - FileCollection
    range: FileCollectionTypeEnum
    multivalued: true
  file_count:
    name: file_count
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '47'
    description: Number of files in this collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:fileCount
    alias: file_count
    owner: FileCollection
    domain_of:
    - FileCollection
    range: integer
  total_bytes:
    name: total_bytes
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 1073741824 (1 GiB = 1024³ bytes)
    description: Total size of all files in this collection, in bytes (integer). Maps
      to dcat:byteSize.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    exact_mappings:
    - dcat:byteSize
    rank: 1000
    slot_uri: d4d:total_bytes
    alias: total_bytes
    owner: FileCollection
    domain_of:
    - FileCollection
    range: integer
  path:
    name: path
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: data/ai_readi/participants.csv
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:contentUrl
    alias: path
    owner: FileCollection
    domain_of:
    - File
    - FileCollection
    range: string
  compression:
    name: compression
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: zip
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:compressFormat
    alias: compression
    owner: FileCollection
    domain_of:
    - Information
    - File
    - FileCollection
    range: CompressionEnum
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:references
    alias: external_resources
    owner: FileCollection
    domain_of:
    - Dataset
    - ExternalResource
    - FileCollection
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:hasPart
    alias: resources
    owner: FileCollection
    domain_of:
    - DatasetCollection
    - Dataset
    - FileCollection
    range: Dataset
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
  conforms_to:
    name: conforms_to
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://www.w3.org/TR/vocab-dcat-3/
    description: An established standard, specification, or schema to which the resource
      conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:conformsTo
    alias: conforms_to
    owner: FileCollection
    domain_of:
    - Information
    range: string
  conforms_to_class:
    name: conforms_to_class
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: Dataset
    description: The specific class or type within a schema to which the resource
      conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    broad_mappings:
    - dcterms:conformsTo
    rank: 1000
    slot_uri: d4d:conformsToClass
    alias: conforms_to_class
    owner: FileCollection
    domain_of:
    - Information
    range: string
  conforms_to_schema:
    name: conforms_to_schema
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://w3id.org/bridge2ai/data-sheets-schema
    description: The schema or data model to which the resource conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    broad_mappings:
    - dcterms:conformsTo
    rank: 1000
    slot_uri: d4d:conformsToSchema
    alias: conforms_to_schema
    owner: FileCollection
    domain_of:
    - Information
    range: string
  created_by:
    name: created_by
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: orcid:0000-0002-1234-5678
    description: The person or organization primarily responsible for creating the
      resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:creator
    alias: created_by
    owner: FileCollection
    domain_of:
    - Information
    range: string
  created_on:
    name: created_on
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2023-07-18T00:00:00'
    description: The date and time when the resource was created.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:created
    alias: created_on
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  doi:
    name: doi
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 10.5281/zenodo.10642459
    description: Digital Object Identifier (DOI) in format 10.xxxx/xxxxx providing
      persistent identification (e.g., '10.1038/s41586-020-2649-2', '10.5281/zenodo.1234567').
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:identifier
    broad_mappings:
    - dcterms:identifier
    rank: 1000
    slot_uri: d4d:doiIdentifier
    alias: doi
    owner: FileCollection
    domain_of:
    - Information
    range: string
    pattern: 10\.\d{4,}\/.+
  download_url:
    name: download_url
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2/download
    description: URL from which the data can be downloaded. This is not the same as
      the landing page, which is a page that describes the dataset. Rather, this URL
      points directly to the data itself.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:url
    rank: 1000
    slot_uri: dcat:downloadURL
    alias: download_url
    owner: FileCollection
    domain_of:
    - Information
    range: uri
  issued:
    name: issued
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2024-11-15T00:00:00'
    description: Date of formal issuance or publication of the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:issued
    alias: issued
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  keywords:
    name: keywords
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: diabetes, retinal imaging, multimodal, clinical data
    description: Keywords or tags describing the resource for discovery and classification.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:keyword
    alias: keywords
    owner: FileCollection
    domain_of:
    - Information
    range: string
    multivalued: true
  language:
    name: language
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: en
    description: Language in which the information is expressed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:inLanguage
    rank: 1000
    slot_uri: dcterms:language
    alias: language
    owner: FileCollection
    domain_of:
    - Information
    range: string
  last_updated_on:
    name: last_updated_on
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2024-11-15T00:00:00'
    description: The date and time when the resource was most recently modified or
      updated.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:modified
    alias: last_updated_on
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  license:
    name: license
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: CC-BY-NC-4.0
    description: The legal license under which the resource is made available (e.g.,
      "MIT", "CC-BY-4.0").
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:license
    alias: license
    owner: FileCollection
    domain_of:
    - Software
    - Information
    range: string
  modified_by:
    name: modified_by
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: orcid:0000-0002-9876-5432
    description: A person or organization that contributed to modifying or updating
      the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:contributor
    alias: modified_by
    owner: FileCollection
    domain_of:
    - Information
    range: string
  page:
    name: page
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2
    description: A landing page or web page providing access to or information about
      the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:landingPage
    alias: page
    owner: FileCollection
    domain_of:
    - Information
    range: string
  publisher:
    name: publisher
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 'ror:04t3en479 # use a ROR ID, DOI, or URL — not a plain name'
    description: The organization or entity responsible for making the resource available.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:publisher
    alias: publisher
    owner: FileCollection
    domain_of:
    - Information
    range: uriorcurie
  status:
    name: status
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: published
    description: The status of the resource (e.g., draft, published, deprecated).
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: d4d:publicationStatus
    alias: status
    owner: FileCollection
    domain_of:
    - Information
    range: string
  title:
    name: title
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 'AI-READI: Salutogenesis Study of Type 2 Diabetes'
    description: The official title of the element.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:title
    alias: title
    owner: FileCollection
    domain_of:
    - Information
    range: string
  version:
    name: version
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 2.0.0
    description: The version identifier of the resource (e.g., "1.0", "2.3.1").
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:version
    alias: version
    owner: FileCollection
    domain_of:
    - Software
    - Information
    range: string
  was_derived_from:
    name: was_derived_from
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2/versions/1
    description: A resource from which this resource was derived, in whole or in part.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - dcterms:source
    rank: 1000
    slot_uri: prov:wasDerivedFrom
    alias: was_derived_from
    owner: FileCollection
    domain_of:
    - Information
    range: string
  id:
    name: id
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://example.org/dataset/my-dataset-001
    description: A unique identifier for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:identifier
    identifier: true
    alias: id
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    range: uriorcurie
    required: true
  name:
    name: name
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: AI-READI Dataset
    description: A human-readable name for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:name
    alias: name
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    range: string
  description:
    name: description
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: A multimodal dataset of 4,000 participants with Type 2 Diabetes.
    description: A human-readable description for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:description
    alias: description
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    - DatasetRelationship
    range: string
class_uri: dcat:Dataset
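
The `doi` slot in the induced source declares the pattern `10\.\d{4,}\/.+`. The Python sketch below applies that pattern to a few strings to show what it accepts. The helper name `is_valid_doi` is hypothetical, and using `re.fullmatch` (anchoring both ends of the string) is an assumption made here; a validator that applies the pattern unanchored would accept more inputs.

```python
import re

# Pattern copied verbatim from the doi slot definition.
DOI_PATTERN = re.compile(r"10\.\d{4,}\/.+")


def is_valid_doi(value: str) -> bool:
    """Check a bare DOI string against the slot's pattern.

    Assumption: the whole string must match, so prefixes such as
    'doi:' or 'https://doi.org/' are rejected.
    """
    return DOI_PATTERN.fullmatch(value) is not None


for candidate in (
    "10.5281/zenodo.10642459",     # example value from the doi slot
    "10.1038/s41586-020-2649-2",   # example from the slot description
    "doi:10.5281/zenodo.1234567",  # prefixed form
    "10.12/too-short-registrant",  # fewer than four digits after '10.'
):
    print(candidate, is_valid_doi(candidate))
```

The first two candidates match and the last two do not: the prefixed form fails only because of the full-string anchoring, while the last fails the `\d{4,}` requirement under any anchoring at the start of the string.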