Subset: Collection
The questions in this section are designed to elicit information that may help researchers and practitioners to create alternative datasets with similar characteristics.
URI: Collection
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/bridge2ai/data-sheets-schema
Classes in subset
Class | Description |
---|---|
CollectionConsent | Did the individuals in question consent to the collection and use of their da... |
CollectionMechanism | What mechanisms or procedures were used to collect the data (e.g., hardware apparatuses or sensors, manual human curation, software programs, software APIs)? |
CollectionNotification | Were the individuals in question notified about the data collection? If so, p... |
CollectionTimeframe | Over what timeframe was the data collected? Does this timeframe match the cre... |
ConsentRevocation | If consent was obtained, were the consenting individuals provided with a mech... |
DataCollector | Who was involved in the data collection process (e.g., students, crowdworkers, contractors) and how were they compensated... |
DataProtectionImpact | Has an analysis of the potential impact of the dataset and its use on data su... |
DirectCollection | Did you collect the data from the individuals in question directly, or obtain... |
EthicalReview | Were any ethical review processes conducted (e.g., by an institutional review board)? |
InstanceAcquisition | How was the data associated with each instance acquired? Was the data directl... |
SamplingStrategy | Does the dataset contain all possible instances or is it a sample (not necess... |
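The classes listed above can be treated as a checklist when drafting a datasheet. The following is a minimal sketch, not part of the schema or its tooling: the `unanswered` helper and the plain-dictionary representation of answers are assumptions made for illustration, and only the class names come from the table.

```python
# Class names copied from the "Classes in subset" table.
COLLECTION_CLASSES = [
    "CollectionConsent",
    "CollectionMechanism",
    "CollectionNotification",
    "CollectionTimeframe",
    "ConsentRevocation",
    "DataCollector",
    "DataProtectionImpact",
    "DirectCollection",
    "EthicalReview",
    "InstanceAcquisition",
    "SamplingStrategy",
]


def unanswered(answers: dict[str, str]) -> list[str]:
    """Hypothetical helper: list Collection classes lacking a non-empty answer.

    `answers` maps a class name to free-text; this flat mapping is an
    assumption, not the schema's actual data model.
    """
    return [
        name
        for name in COLLECTION_CLASSES
        if not answers.get(name, "").strip()
    ]


# Example draft: one real answer and one whitespace-only answer.
draft = {
    "CollectionMechanism": "Exported through a software API.",
    "DirectCollection": "   ",
}
missing = unanswered(draft)
```

Here `missing` contains every class except `CollectionMechanism`, because the whitespace-only entry for `DirectCollection` is counted as unanswered.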
CollectionConsent
Did the individuals in question consent to the collection and use of their data? If so, please describe (or show with screenshots or other information) how consent was requested and provided, and provide a link or other access point to, or otherwise reproduce, the exact language to which the individuals consented.
CollectionMechanism
What mechanisms or procedures were used to collect the data (e.g., hardware apparatuses or sensors, manual human curation, software programs, software APIs)? How were these mechanisms or procedures validated?
CollectionNotification
Were the individuals in question notified about the data collection? If so, please describe (or show with screenshots or other information) how notice was provided, and provide a link or other access point to, or otherwise reproduce, the exact language of the notification itself.
CollectionTimeframe
Over what timeframe was the data collected? Does this timeframe match the creation timeframe of the data associated with the instances (e.g., recent crawl of old news articles)? If not, please describe the timeframe in which the data associated with the instances was created.
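One way to check the timeframe question mechanically is to compare two date ranges. The sketch below assumes that "match" means the creation period falls entirely inside the collection period; that reading, the `timeframes_match` name, and the example dates are all illustrative assumptions rather than anything defined by the schema.

```python
from datetime import date

# A timeframe is modeled as an inclusive (start, end) pair of dates.
Timeframe = tuple[date, date]


def timeframes_match(collection: Timeframe, creation: Timeframe) -> bool:
    """Hypothetical check: is the creation period inside the collection period?"""
    collected_from, collected_to = collection
    created_from, created_to = creation
    return collected_from <= created_from and created_to <= collected_to


# A recent crawl of old news articles: collected in one month,
# but the articles themselves were written years earlier.
crawl = (date(2023, 1, 1), date(2023, 1, 31))
articles_written = (date(1995, 1, 1), date(2005, 12, 31))
needs_creation_timeframe = not timeframes_match(crawl, articles_written)
```

When `needs_creation_timeframe` is true, the datasheet should also describe the timeframe in which the underlying data was created.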
ConsentRevocation
If consent was obtained, were the consenting individuals provided with a mechanism to revoke their consent in the future or for certain uses? If so, please provide a description, as well as a link or other access point to the mechanism (if appropriate).
DataCollector
Who was involved in the data collection process (e.g., students, crowdworkers, contractors) and how were they compensated (e.g., how much were crowdworkers paid)?
DataProtectionImpact
Has an analysis of the potential impact of the dataset and its use on data subjects (e.g., a data protection impact analysis) been conducted? If so, please provide a description of this analysis, including the outcomes, as well as a link or other access point to any supporting documentation.
DirectCollection
Did you collect the data from the individuals in question directly, or obtain it via third parties or other sources (e.g., websites)?
EthicalReview
Were any ethical review processes conducted (e.g., by an institutional review board)? If so, please provide a description of these review processes, including the outcomes, as well as a link or other access point to any supporting documentation.
InstanceAcquisition
How was the data associated with each instance acquired? Was the data directly observable (e.g., raw text, movie ratings), reported by subjects (e.g., survey responses), or indirectly inferred/derived from other data (e.g., part-of-speech tags, model-based guesses for age or language)? If the data was reported by subjects or indirectly inferred/derived from other data, was the data validated/verified?
SamplingStrategy
Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set? If the dataset is a sample, then what is the larger set? Is the sample representative of the larger set (e.g., geographic coverage)? If so, please describe how this representativeness was validated/verified. If it is not representative of the larger set, please describe why not (e.g., to cover a more diverse range of instances, because instances were withheld or unavailable).