Class: FileCollection

A collection of files with shared characteristics (format, purpose, structure). Represents a logical grouping of related files within a dataset, such as all training data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities via schema:hasPart relationships.

URI: dcat:Dataset

classDiagram
    class FileCollection
    click FileCollection href "../FileCollection/"
    Information <|-- FileCollection
    click Information href "../Information/"
    FileCollection : collection_type
    FileCollection --> "*" FileCollectionTypeEnum : collection_type
    click FileCollectionTypeEnum href "../FileCollectionTypeEnum/"
    FileCollection : compression
    FileCollection --> "0..1" CompressionEnum : compression
    click CompressionEnum href "../CompressionEnum/"
    FileCollection : conforms_to
    FileCollection : conforms_to_class
    FileCollection : conforms_to_schema
    FileCollection : created_by
    FileCollection : created_on
    FileCollection : description
    FileCollection : doi
    FileCollection : download_url
    FileCollection : external_resources
    FileCollection --> "*" ExternalResource : external_resources
    click ExternalResource href "../ExternalResource/"
    FileCollection : file_count
    FileCollection : id
    FileCollection : issued
    FileCollection : keywords
    FileCollection : language
    FileCollection : last_updated_on
    FileCollection : license
    FileCollection : modified_by
    FileCollection : name
    FileCollection : page
    FileCollection : path
    FileCollection : publisher
    FileCollection : resources
    FileCollection --> "*" Dataset : resources
    click Dataset href "../Dataset/"
    FileCollection : status
    FileCollection : title
    FileCollection : total_bytes
    FileCollection : version
    FileCollection : was_derived_from

Inheritance

  • Information
      • FileCollection

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| path | 0..1 String | Path or URL to the FileCollection | direct |
| compression | 0..1 CompressionEnum | Compression format if the collection is packaged as a compressed archive (e.g., gzip, zip, bzip2) | direct |
| external_resources | * ExternalResource | External files or URLs referenced by this file collection | direct |
| resources | * Dataset or File or FileCollection | Individual files or nested file collections within this collection | direct |
| collection_type | * FileCollectionTypeEnum | Type(s) of content in this file collection | direct |
| file_count | 0..1 Integer | Number of files in this collection | direct |
| total_bytes | 0..1 Integer | Total size of all files in this collection, in bytes (integer) | direct |
| conforms_to | 0..1 String | An established standard, specification, or schema to which the resource conforms | Information |
| conforms_to_class | 0..1 String | The specific class or type within a schema to which the resource conforms | Information |
| conforms_to_schema | 0..1 String | The schema or data model to which the resource conforms | Information |
| created_by | 0..1 String | The person or organization primarily responsible for creating the resource | Information |
| created_on | 0..1 Datetime | The date and time when the resource was created | Information |
| doi | 0..1 String | Digital Object Identifier (DOI) in format 10.xxxx/xxxxx providing persistent identification | Information |
| download_url | 0..1 Uri | URL from which the data can be downloaded | Information |
| issued | 0..1 Datetime | Date of formal issuance or publication of the resource | Information |
| keywords | * String | Keywords or tags describing the resource for discovery and classification | Information |
| language | 0..1 String | Language in which the information is expressed | Information |
| last_updated_on | 0..1 Datetime | The date and time when the resource was most recently modified or updated | Information |
| license | 0..1 String | The legal license under which the resource is made available (e.g., "MIT", "CC-BY-4.0") | Information |
| modified_by | 0..1 String | A person or organization that contributed to modifying or updating the resource | Information |
| page | 0..1 String | A landing page or web page providing access to or information about the resource | Information |
| publisher | 0..1 Uriorcurie | The organization or entity responsible for making the resource available | Information |
| status | 0..1 String | The status of the resource (e.g., draft, published, deprecated) | Information |
| title | 0..1 String | The official title of the element | Information |
| version | 0..1 String | The version identifier of the resource (e.g., "1.0", "2.3.1") | Information |
| was_derived_from | 0..1 String | A resource from which this resource was derived, in whole or in part | Information |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human-readable name for a thing | NamedThing |
| description | 0..1 String | A human-readable description for a thing | NamedThing |
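
As a rough illustration of how the slots listed above might appear together, here is a minimal sketch of a single FileCollection instance. All identifiers, names, and paths are hypothetical placeholders. The enum values `zip`, `raw_data`, and `documentation` are taken from the examples in the slot descriptions and are assumed, not confirmed, to be permitted values of CompressionEnum and FileCollectionTypeEnum.

```yaml
# Hypothetical FileCollection instance; every identifier and path is a placeholder.
id: https://example.org/collections/raw-data        # required (cardinality 1)
name: Raw data with documentation
description: Raw data files packaged together with their documentation.
path: data/raw_data.zip                              # directory, archive, or download URL
compression: zip             # assumed CompressionEnum value; omit for uncompressed collections
collection_type:             # multivalued; values assumed to exist in FileCollectionTypeEnum
  - raw_data
  - documentation
file_count: 47
total_bytes: 1073741824      # 1 GiB = 1024^3 bytes
license: CC-BY-4.0
version: "1.0"               # quoted so YAML keeps it as a string
```

Only `id` is required; the other slots may be left out when they do not apply, for example `compression` for a purely logical grouping.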

Usages

| used by | used in | type | used |
| --- | --- | --- | --- |
| Dataset | file_collections | range | FileCollection |
| DataSubset | file_collections | range | FileCollection |
| FileCollection | resources | any_of[range] | FileCollection |
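
The usages above could combine as in the following sketch: a Dataset lists collections under `file_collections`, and a collection nests further collections or files under `resources`. This is an illustration under assumptions, not a validated instance. It assumes that Dataset accepts `id` and `name`, that `file_collections` is inlined as a list, and that File carries an `id`. All identifiers and paths are hypothetical.

```yaml
# Hypothetical Dataset instance; only the slot names file_collections, resources,
# path, and file_count come from this page. Everything else is a placeholder.
id: https://example.org/dataset/my-dataset-001
name: Example dataset
file_collections:
  - id: https://example.org/collections/raw
    name: Raw data
    path: data/raw/
    resources:
      # A nested FileCollection (resources accepts File or FileCollection)
      - id: https://example.org/collections/raw/images
        name: Raw images
        path: data/raw/images/
        file_count: 2
        resources:
          # File entries; the `id` slot on File is an assumption
          - id: https://example.org/files/scan-001
            path: data/raw/images/scan-001.png
          - id: https://example.org/files/scan-002
            path: data/raw/images/scan-002.png
```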

Aliases

  • file collection
  • data files
  • file group

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/bridge2ai/data-sheets-schema

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | dcat:Dataset |
| native | data_sheets_schema:FileCollection |
| exact | schema:Dataset |
| close | dcat:Distribution |

LinkML Source

Direct

name: FileCollection
description: A collection of files with shared characteristics (format, purpose, structure).
  Represents a logical grouping of related files within a dataset, such as all training
  data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities
  via schema:hasPart relationships.
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
aliases:
- file collection
- data files
- file group
exact_mappings:
- schema:Dataset
close_mappings:
- dcat:Distribution
is_a: Information
slots:
- path
- compression
- external_resources
- resources
slot_usage:
  path:
    name: path
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
  compression:
    name: compression
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
attributes:
  collection_type:
    name: collection_type
    description: Type(s) of content in this file collection. A collection may have
      multiple types, for example a collection containing both raw_data and documentation
      files would have both types listed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:collectionType
    domain_of:
    - FileCollection
    range: FileCollectionTypeEnum
    multivalued: true
  file_count:
    name: file_count
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '47'
    description: Number of files in this collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:fileCount
    domain_of:
    - FileCollection
    range: integer
  total_bytes:
    name: total_bytes
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 1073741824 (1 GiB = 1024³ bytes)
    description: Total size of all files in this collection, in bytes (integer). Maps
      to dcat:byteSize.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    exact_mappings:
    - dcat:byteSize
    rank: 1000
    slot_uri: d4d:total_bytes
    domain_of:
    - FileCollection
    range: integer
class_uri: dcat:Dataset

Induced

name: FileCollection
description: A collection of files with shared characteristics (format, purpose, structure).
  Represents a logical grouping of related files within a dataset, such as all training
  data files, all image files, or all raw data files. Maps to RO-Crate Dataset entities
  via schema:hasPart relationships.
from_schema: https://w3id.org/bridge2ai/data-sheets-schema
aliases:
- file collection
- data files
- file group
exact_mappings:
- schema:Dataset
close_mappings:
- dcat:Distribution
is_a: Information
slot_usage:
  path:
    name: path
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
  compression:
    name: compression
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
attributes:
  collection_type:
    name: collection_type
    description: Type(s) of content in this file collection. A collection may have
      multiple types, for example a collection containing both raw_data and documentation
      files would have both types listed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:collectionType
    alias: collection_type
    owner: FileCollection
    domain_of:
    - FileCollection
    range: FileCollectionTypeEnum
    multivalued: true
  file_count:
    name: file_count
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '47'
    description: Number of files in this collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    rank: 1000
    slot_uri: d4d:fileCount
    alias: file_count
    owner: FileCollection
    domain_of:
    - FileCollection
    range: integer
  total_bytes:
    name: total_bytes
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 1073741824 (1 GiB = 1024³ bytes)
    description: Total size of all files in this collection, in bytes (integer). Maps
      to dcat:byteSize.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/file-collection
    exact_mappings:
    - dcat:byteSize
    rank: 1000
    slot_uri: d4d:total_bytes
    alias: total_bytes
    owner: FileCollection
    domain_of:
    - FileCollection
    range: integer
  path:
    name: path
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: data/ai_readi/participants.csv
    description: Path or URL to the FileCollection. May be a directory path, archive
      file path, or download URL depending on how the collection is distributed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:contentUrl
    alias: path
    owner: FileCollection
    domain_of:
    - File
    - FileCollection
    range: string
  compression:
    name: compression
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: zip
    description: Compression format if the collection is packaged as a compressed
      archive (e.g., gzip, zip, bzip2). Omit this field for uncompressed collections
      or purely logical groupings.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:compressFormat
    alias: compression
    owner: FileCollection
    domain_of:
    - Information
    - File
    - FileCollection
    range: CompressionEnum
  external_resources:
    name: external_resources
    description: External files or URLs referenced by this file collection.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:references
    alias: external_resources
    owner: FileCollection
    domain_of:
    - Dataset
    - ExternalResource
    - FileCollection
    range: ExternalResource
    multivalued: true
    inlined_as_list: true
  resources:
    name: resources
    description: Individual files or nested file collections within this collection.
      Allows hierarchical file organization with both File objects and nested FileCollection
      objects.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:hasPart
    alias: resources
    owner: FileCollection
    domain_of:
    - DatasetCollection
    - Dataset
    - FileCollection
    range: Dataset
    multivalued: true
    inlined_as_list: true
    any_of:
    - range: File
    - range: FileCollection
  conforms_to:
    name: conforms_to
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://www.w3.org/TR/vocab-dcat-3/
    description: An established standard, specification, or schema to which the resource
      conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:conformsTo
    alias: conforms_to
    owner: FileCollection
    domain_of:
    - Information
    range: string
  conforms_to_class:
    name: conforms_to_class
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: Dataset
    description: The specific class or type within a schema to which the resource
      conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    broad_mappings:
    - dcterms:conformsTo
    rank: 1000
    slot_uri: d4d:conformsToClass
    alias: conforms_to_class
    owner: FileCollection
    domain_of:
    - Information
    range: string
  conforms_to_schema:
    name: conforms_to_schema
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://w3id.org/bridge2ai/data-sheets-schema
    description: The schema or data model to which the resource conforms.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    broad_mappings:
    - dcterms:conformsTo
    rank: 1000
    slot_uri: d4d:conformsToSchema
    alias: conforms_to_schema
    owner: FileCollection
    domain_of:
    - Information
    range: string
  created_by:
    name: created_by
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: orcid:0000-0002-1234-5678
    description: The person or organization primarily responsible for creating the
      resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:creator
    alias: created_by
    owner: FileCollection
    domain_of:
    - Information
    range: string
  created_on:
    name: created_on
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2023-07-18T00:00:00'
    description: The date and time when the resource was created.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:created
    alias: created_on
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  doi:
    name: doi
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 10.5281/zenodo.10642459
    description: Digital Object Identifier (DOI) in format 10.xxxx/xxxxx providing
      persistent identification (e.g., '10.1038/s41586-020-2649-2', '10.5281/zenodo.1234567').
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:identifier
    broad_mappings:
    - dcterms:identifier
    rank: 1000
    slot_uri: d4d:doiIdentifier
    alias: doi
    owner: FileCollection
    domain_of:
    - Information
    range: string
    pattern: 10\.\d{4,}\/.+
  download_url:
    name: download_url
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2/download
    description: URL from which the data can be downloaded. This is not the same as
      the landing page, which is a page that describes the dataset. Rather, this URL
      points directly to the data itself.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:url
    rank: 1000
    slot_uri: dcat:downloadURL
    alias: download_url
    owner: FileCollection
    domain_of:
    - Information
    range: uri
  issued:
    name: issued
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2024-11-15T00:00:00'
    description: Date of formal issuance or publication of the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:issued
    alias: issued
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  keywords:
    name: keywords
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: diabetes, retinal imaging, multimodal, clinical data
    description: Keywords or tags describing the resource for discovery and classification.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:keyword
    alias: keywords
    owner: FileCollection
    domain_of:
    - Information
    range: string
    multivalued: true
  language:
    name: language
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: en
    description: Language in which the information is expressed.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - schema:inLanguage
    rank: 1000
    slot_uri: dcterms:language
    alias: language
    owner: FileCollection
    domain_of:
    - Information
    range: string
  last_updated_on:
    name: last_updated_on
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: '2024-11-15T00:00:00'
    description: The date and time when the resource was most recently modified or
      updated.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:modified
    alias: last_updated_on
    owner: FileCollection
    domain_of:
    - Information
    range: datetime
  license:
    name: license
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: CC-BY-NC-4.0
    description: The legal license under which the resource is made available (e.g.,
      "MIT", "CC-BY-4.0").
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:license
    alias: license
    owner: FileCollection
    domain_of:
    - Software
    - Information
    range: string
  modified_by:
    name: modified_by
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: orcid:0000-0002-9876-5432
    description: A person or organization that contributed to modifying or updating
      the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:contributor
    alias: modified_by
    owner: FileCollection
    domain_of:
    - Information
    range: string
  page:
    name: page
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2
    description: A landing page or web page providing access to or information about
      the resource.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcat:landingPage
    alias: page
    owner: FileCollection
    domain_of:
    - Information
    range: string
  publisher:
    name: publisher
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 'ror:04t3en479  # use a ROR ID, DOI, or URL — not a plain name'
    description: The organization or entity responsible for making the resource available.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:publisher
    alias: publisher
    owner: FileCollection
    domain_of:
    - Information
    range: uriorcurie
  status:
    name: status
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: published
    description: The status of the resource (e.g., draft, published, deprecated).
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: d4d:publicationStatus
    alias: status
    owner: FileCollection
    domain_of:
    - Information
    range: string
  title:
    name: title
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 'AI-READI: Salutogenesis Study of Type 2 Diabetes'
    description: The official title of the element.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: dcterms:title
    alias: title
    owner: FileCollection
    domain_of:
    - Information
    range: string
  version:
    name: version
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: 2.0.0
    description: The version identifier of the resource (e.g., "1.0", "2.3.1").
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    rank: 1000
    slot_uri: schema:version
    alias: version
    owner: FileCollection
    domain_of:
    - Software
    - Information
    range: string
  was_derived_from:
    name: was_derived_from
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://fairhub.io/datasets/2/versions/1
    description: A resource from which this resource was derived, in whole or in part.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema
    exact_mappings:
    - dcterms:source
    rank: 1000
    slot_uri: prov:wasDerivedFrom
    alias: was_derived_from
    owner: FileCollection
    domain_of:
    - Information
    range: string
  id:
    name: id
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: https://example.org/dataset/my-dataset-001
    description: A unique identifier for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:identifier
    identifier: true
    alias: id
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    range: uriorcurie
    required: true
  name:
    name: name
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: AI-READI Dataset
    description: A human-readable name for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:name
    alias: name
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    range: string
  description:
    name: description
    annotations:
      d4d:docExample:
        tag: d4d:docExample
        value: A multimodal dataset of 4,000 participants with Type 2 Diabetes.
    description: A human-readable description for a thing.
    from_schema: https://w3id.org/bridge2ai/data-sheets-schema/base
    rank: 1000
    slot_uri: schema:description
    alias: description
    owner: FileCollection
    domain_of:
    - NamedThing
    - DatasetProperty
    - DatasetRelationship
    range: string
class_uri: dcat:Dataset