| access_details | Information on how to access or retrieve the raw source data |
| access_url | URL or access point for the raw data |
| access_urls | Details of the distribution channel(s) or format(s) |
| acquisition_details | Details on how data was acquired for each instance |
| acquisition_methods | |
| addressing_gaps | |
| affected_subsets | Specific subsets or features of the dataset affected by this bias |
| affiliation | The organization(s) to which the person belongs in the context of this datase... |
| affiliations | Organizations with which the creator or team is affiliated |
| agreement_metric | Type of agreement metric used (Cohen's kappa, Fleiss' kappa, Krippendorff's a... |
| analysis_method | Methodology used to assess annotation quality and resolve disagreements |
| annotation_analyses | Analysis of annotation quality and inter-annotator agreement |
| annotation_quality_details | Additional details on annotation quality assessment and findings |
| annotations_per_item | Number of annotations collected per data item |
| annotator_demographics | Demographic information about annotators, if available and relevant (e.g. ...) |
| anomalies | |
| anomaly_details | Details on errors, noise sources, or redundancies in the dataset |
| anonymization_method | What methods were used to anonymize or de-identify participant data? Include ... |
| archival | Indication whether official archival versions of external resources are inclu... |
| assent_procedures | For research involving minors, what assent procedures were used? How was deve... |
| bias_description | Detailed description of how this bias manifests in the dataset, including aff... |
| bias_type | The type of bias identified, using standardized categories from the Artificia... |
| bytes | Size of the data in bytes |
| categories | The permitted categories or values for a categorical variable |
| citation | Recommended citation for this dataset in DataCite or BibTeX format |
| cleaning_details | Details on data cleaning procedures applied |
| cleaning_strategies | |
| collection_details | Details on direct vs. ... |
| collection_mechanisms | |
| collection_timeframes | |
| collector_details | Details on who collected the data and their compensation |
| comment_prefix | |
| compensation_amount | What was the amount or value of compensation provided? Include currency or eq... |
| compensation_provided | Were participants compensated for their participation? |
| compensation_rationale | What was the rationale for the compensation structure? How was the amount det... |
| compensation_type | What type of compensation was provided (e.g. ...) |
| compression | Compression format used, if any |
| confidential_elements | |
| confidential_elements_present | Indicates whether any confidential data elements are present |
| confidentiality_details | Details on confidential data elements and handling procedures |
| confidentiality_level | Confidentiality classification of the dataset indicating level of access rest... |
| conforms_to | |
| conforms_to_class | |
| conforms_to_schema | |
| consent_details | Details on how consent was requested, provided, and documented |
| consent_documentation | How is consent documented? Include references to consent forms or procedures ... |
| consent_obtained | Was informed consent obtained from all participants? |
| consent_scope | What specific uses did participants consent to? Are there limitations on data... |
| consent_type | What type of consent was obtained (e.g. ...) |
| contact_person | Contact person for questions about ethical review |
| content_warnings | |
| content_warnings_present | Indicates whether any content warnings are needed |
| contribution_url | URL for contribution guidelines or process |
| counts | How many instances are there in total (of each type, if appropriate)? |
| created_by | |
| created_on | |
| creators | |
| credit_roles | Contributor roles using the CRediT (Contributor Roles Taxonomy) for the princ... |
| data_annotation_platform | Platform or tool used for annotation (e.g. ...) |
| data_annotation_protocol | Annotation methodology, tasks, and protocols followed during labeling |
| data_collectors | |
| data_linkage | Can this dataset be linked to other datasets in ways that might compromise pa... |
| data_protection_impacts | |
| data_substrate | Type of data (e.g. ...) |
| data_topic | General topic of each instance (e.g. ...) |
| data_type | The data type of the variable (e.g. ...) |
| data_use_permission | Structured data use permissions using the Data Use Ontology (DUO) |
| deidentification_details | Details on de-identification procedures and residual risks |
| delimiter | |
| derivation | Description of how this variable was derived or calculated from other variabl... |
| description | A human-readable description for a thing |
| dialect | |
| disagreement_patterns | Systematic patterns in annotator disagreements (e.g. ...) |
| discouraged_uses | |
| discouragement_details | Details on tasks for which the dataset should not be used |
| distribution | |
| distribution_dates | |
| distribution_formats | |
| doi | Digital object identifier |
| double_quote | |
| download_url | URL from which the data can be downloaded |
| email | The email address of the person |
| encoding | The character encoding of the data |
| end_date | End date of data collection |
| errata | |
| erratum_details | Details on any errata or corrections to the dataset |
| erratum_url | URL or access point for the erratum |
| ethical_reviews | |
| ethics_review_board | What ethics review board(s) reviewed this research? Include institution names... |
| examples | List of examples of known/previous uses of the dataset |
| existing_uses | |
| extension_details | Details on extension mechanisms, contribution validation, and communication |
| extension_mechanism | |
| external_resources | Links or identifiers for external resources |
| format | The file format, physical medium, or dimensions of a resource |
| frequency | How often updates are planned (e.g. ...) |
| funders | |
| future_guarantees | Explanation of any commitments that external resources will remain available ... |
| future_use_impacts | |
| governance_committee_contact | Contact person for data governance committee |
| grant_number | The alphanumeric identifier for the grant |
| grantor | Name/identifier of the organization providing monetary or resource support |
| grants | Grant mechanisms supporting dataset creation |
| guardian_consent | For participants unable to provide their own consent, how was guardian or sur... |
| handling_strategy | Strategy used to handle missing data (e.g. ...) |
| hash | Hash of the data |
| header | |
| hipaa_compliant | Indicates compliance with the Health Insurance Portability and Accountability... |
| human_subject_research | Information about whether dataset involves human subjects research, including... |
| id | A unique identifier for a thing |
| identifiable_elements_present | Indicates whether data subjects can be identified |
| identification | |
| identifiers_removed | List of identifier types removed during de-identification |
| impact_details | Details on potential impacts, risks, and mitigation strategies |
| imputation_method | Specific imputation technique used (mean, median, mode, forward fill, backwar... |
| imputation_protocols | Data imputation methodology and techniques |
| imputation_rationale | Justification for the imputation approach chosen, including assumptions made ... |
| imputation_validation | Methods used to validate imputation quality (if any) |
| imputed_fields | Fields or columns where imputation was applied |
| informed_consent | Details about informed consent procedures, including consent type, documentat... |
| instance_type | Multiple types of instances? (e.g. ...) |
| instances | |
| intended_uses | Explicit intended and recommended uses for this dataset |
| inter_annotator_agreement | Measure of agreement between annotators (e.g. ...) |
| inter_annotator_agreement_score | Measured agreement between annotators (e.g. ...) |
| involves_human_subjects | Does this dataset involve human subjects research? |
| ip_restrictions | |
| irb_approval | Was Institutional Review Board (IRB) approval obtained? Include approval numb... |
| is_data_split | Is this subset a split of the larger dataset, e.g. ... |
| is_deidentified | |
| is_direct | Whether collection was direct from individuals |
| is_identifier | Indicates whether this variable serves as a unique identifier or key for reco... |
| is_random | Indicates whether the sample is random |
| is_representative | Indicates whether the sample is representative of the larger set |
| is_sample | Indicates whether it is a sample of a larger set |
| is_sensitive | Indicates whether this variable contains sensitive information (e.g. ...) |
| is_shared | Boolean indicating whether the dataset is distributed to parties external to ... |
| is_subpopulation | Is this subset a subpopulation of the larger dataset, e.g. ... |
| is_tabular | |
| issued | |
| keywords | |
| known_biases | Known biases present in the dataset that may affect fairness, representativen... |
| known_limitations | Known limitations of the dataset that may affect its use or interpretation |
| label | Is there a label or target associated with each instance? |
| label_description | If labeled, what pattern or format do labels follow? |
| labeling_details | Details on labeling/annotation procedures and quality metrics |
| labeling_strategies | |
| language | Language in which the information is expressed |
| last_updated_on | |
| latest_version_doi | DOI or URL of the latest dataset version |
| license | |
| license_and_use_terms | |
| license_terms | Description of the dataset's license and terms of use (including links, costs... |
| limitation_description | Detailed description of the limitation and its implications |
| limitation_type | Category of limitation (e.g. ...) |
| machine_annotation_tools | Automated annotation tools used in dataset creation |
| maintainer_details | Details on who will support, host, or maintain the dataset |
| maintainers | |
| maximum_value | The maximum value that the variable can take |
| md5 | MD5 hash of the data |
| measurement_technique | The technique or method used to measure this variable |
| mechanism_details | Details on mechanisms or procedures used to collect the data |
| media_type | The media type of the data |
| method | Method used for de-identification (e.g. ...) |
| minimum_value | The minimum value that the variable can take |
| missing | Description of the missing data fields or elements |
| missing_data_causes | Known or suspected causes of missing data (e.g. ...) |
| missing_data_documentation | Documentation of missing data patterns and handling strategies |
| missing_data_patterns | Description of patterns in missing data (e.g. ...) |
| missing_information | References to one or more MissingInfo objects describing missing data |
| missing_value_code | Code(s) used to represent missing values for this variable |
| mitigation_strategy | Steps taken or recommended to mitigate this bias |
| modified_by | |
| name | A human-readable name for a thing |
| notification_details | Details on how individuals were notified about data collection |
| orcid | ORCID (Open Researcher and Contributor ID) - a persistent digital identifier ... |
| other_compliance | Other regulatory compliance frameworks applicable to this dataset (e.g. ...) |
| other_tasks | |
| page | |
| parent_datasets | Parent datasets that this dataset is part of or derived from |
| participant_compensation | Compensation or incentives provided to human research participants |
| participant_privacy | Privacy protections and anonymization procedures for human research participa... |
| path | |
| precision | The precision or number of decimal places for numeric variables |
| preprocessing_details | Details on preprocessing steps applied to the data |
| preprocessing_strategies | |
| principal_investigator | A key individual (Principal Investigator) responsible for or overseeing datas... |
| privacy_techniques | What privacy-preserving techniques were applied (e.g. ...) |
| prohibited_uses | Explicitly prohibited or forbidden uses for this dataset |
| prohibition_reason | Reason why this use is prohibited (e.g. ...) |
| publisher | |
| purposes | |
| quality_notes | Notes about data quality, reliability, or known issues specific to this varia... |
| quote_char | |
| raw_data_details | Details on raw data availability and access procedures |
| raw_data_format | Format of the raw data before any preprocessing |
| raw_data_sources | Description of raw data sources before preprocessing |
| raw_sources | |
| recommended_mitigation | Recommended approaches for users to address this limitation |
| regulatory_compliance | What regulatory frameworks govern this human subjects research (e.g. ...) |
| regulatory_restrictions | |
| reidentification_risk | What is the assessed risk of re-identification? What measures were taken to m... |
| related_datasets | Related datasets with typed relationships (e.g. ...) |
| relationship_details | Details on relationships between instances (e.g. ...) |
| relationship_type | The type of relationship (e.g. ...) |
| release_dates | Dates or timeframe for dataset release |
| repository_details | Details on the repository of known dataset uses |
| repository_url | URL to a repository of known dataset uses |
| representative_verification | Explanation of how representativeness was validated or verified |
| resources | Sub-resources or component datasets |
| response | Short explanation describing the primary purpose of creating the dataset |
| restrictions | Description of any restrictions or fees associated with external resources |
| retention_details | Details on data retention limits and enforcement procedures |
| retention_limit | |
| retention_period | Time period for data retention |
| review_details | Details on ethical review processes, outcomes, and supporting documentation |
| reviewing_organization | Organization that conducted the ethical review (e.g. ...) |
| revocation_details | Details on consent revocation mechanisms and procedures |
| role | Role of the data collector (e.g. ...) |
| same_as | URL of a reference web resource that is the same as this dataset |
| sampling_strategies | |
| scope_impact | How this limitation affects the scope or applicability of the dataset |
| sensitive_elements | |
| sensitive_elements_present | Indicates whether sensitive data elements are present |
| sensitivity_details | Details on sensitive data elements present and handling procedures |
| sha256 | SHA-256 hash of the data |
| source_data | Description of the larger set from which the sample was drawn, if any |
| source_description | Detailed description of where raw data comes from (e.g. ...) |
| source_type | Type of raw source (sensor, database, user input, web scraping, etc.) |
| special_populations | Does the research involve any special populations that require additional pro... |
| special_protections | What additional protections were implemented for vulnerable populations? Incl... |
| split_details | Details on recommended data splits and their rationale |
| start_date | Start date of data collection |
| status | |
| strategies | Description of the sampling strategy (deterministic, probabilistic, etc.) |
| subpopulation_elements_present | Indicates whether any subpopulations are explicitly identified |
| subpopulations | |
| subsets | |
| target_dataset | The dataset that this relationship points to |
| task_details | Details on other potential tasks the dataset could be used for |
| tasks | |
| themes | Themes associated with the data |
| timeframe_details | Details on the collection timeframe and relationship to data creation dates |
| title | The official title of the element |
| tool_accuracy | Known accuracy or performance metrics for the automated tools (if available) |
| tool_descriptions | Descriptions of what each tool does in the annotation process and what types ... |
| tools | List of automated annotation tools with their versions |
| unit | The unit of measurement for the variable, preferably using QUDT units (http:/... |
| update_details | Details on update plans, responsible parties, and communication methods |
| updates | |
| url | |
| usage_notes | Notes or caveats about using the dataset for intended purposes |
| use_category | Category of intended use (e.g. ...) |
| use_repository | |
| used_software | What software was used as part of this dataset property? |
| variable_name | The name or identifier of the variable as it appears in the data files |
| variables | Metadata describing individual variables, fields, or columns in the dataset |
| version | |
| version_access | |
| version_details | Details on version support policies and obsolescence communication |
| versions_available | List of available versions with metadata |
| vulnerable_groups_included | Are any vulnerable populations included (e.g. ...) |
| vulnerable_populations | Information about protections for vulnerable populations (e.g. ...) |
| warnings | |
| was_derived_from | |
| was_directly_observed | Whether the data was directly observed |
| was_inferred_derived | Whether the data was inferred or derived from other data |
| was_reported_by_subjects | Whether the data was reported directly by the subjects themselves |
| was_validated_verified | Whether the data was validated or verified in any way |
| why_missing | Explanation of why each piece of data is missing |
| why_not_representative | Explanation of why the sample is not representative, if applicable |
| withdrawal_mechanism | How can participants withdraw their consent? What procedures are in place for... |
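
As an illustration of how the `bytes`, `md5`, and `sha256` entries in the table above might be populated for a single file, here is a minimal Python sketch. The function name `describe_file`, the chunk size, and the choice of lowercase hexadecimal digests are assumptions made for this example; the table itself does not specify how the hash values should be encoded or computed.

```python
import hashlib
import tempfile
from pathlib import Path

# Read in 1 MiB chunks so large files are never loaded into memory at once.
CHUNK_SIZE = 1 << 20


def describe_file(path):
    """Return a dict keyed by the `bytes`, `md5`, and `sha256` slot names.

    Hypothetical helper: digests are emitted as lowercase hex strings,
    which is an assumption, not something the table prescribes.
    """
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    size = 0
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            md5.update(chunk)
            sha256.update(chunk)
            size += len(chunk)
    return {"bytes": size, "md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


if __name__ == "__main__":
    # Demonstrate on a throwaway file so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_bytes(b"abc")
        print(describe_file(sample))
```

Both digests are computed in one pass over the file, so each file is read only once regardless of how many checksum fields are recorded.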