What is FAIR Data?
The FAIR Data Principles were published in 2016 by a consortium of organisations and researchers who wanted to enhance the reusability not only of datasets, but also of related facets such as tools, workflows and algorithms. The principles address four key aspects of data management: making data Findable, Accessible, Interoperable and Reusable (FAIR).
- Findable – Data and metadata should be easily discoverable by both humans and machines through the use of standard identification mechanisms.
- Accessible – Once found, data should be easy to download and use either locally or in a trusted digital research environment. The hosting repository should also have plans in place to keep metadata accessible even in the event of the data itself no longer being available.
- Interoperable – Data should use standard vocabularies and ontologies so that it can be easily mapped and combined with other datasets. This, along with the ability to transform data into standardised formats like FHIR, enables sharing between various scientific disciplines and organisations.
- Reusable – Data and metadata should be richly described with the least restrictive licenses, allowing it to be easily reused in future research. Integration with other data sources should be easy, facilitated by proper citations and descriptions. The standard identification of items improves data provenance and allows researchers to not only re-use data, but to identify how aspects of it may be reproduced in their own research.
The specific principles and criteria are as follows:
To be Findable:
- F1. (meta)data are assigned a globally unique and eternally persistent identifier.
- F2. data are described with rich metadata.
- F3. (meta)data are registered or indexed in a searchable resource.
- F4. metadata specify the data identifier.
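As a rough illustration of the Findable criteria above, the following Python sketch checks a metadata record for obvious gaps. The field names (`identifier`, `indexed_in`, `data_identifier` and so on), the simplified DOI pattern and the example record are all assumptions made for this example; they are not part of the FAIR principles, which do not prescribe a specific schema.

```python
import re

# Simplified DOI-style pattern (an assumption; real DOI syntax is more permissive).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def findability_gaps(record: dict) -> list[str]:
    """Return the Findable criteria (F1 to F4) that a record appears to miss."""
    gaps = []
    # F1: the record carries a persistent identifier (here: something DOI-shaped).
    if not DOI_PATTERN.match(str(record.get("identifier") or "")):
        gaps.append("F1")
    # F2: a crude proxy for "rich metadata" is that key descriptive fields exist.
    if not all(record.get(field) for field in ("title", "description", "keywords")):
        gaps.append("F2")
    # F3: the record is registered in at least one searchable resource.
    if not record.get("indexed_in"):
        gaps.append("F3")
    # F4: the metadata names the identifier of the data it describes.
    if not record.get("data_identifier"):
        gaps.append("F4")
    return gaps


# Hypothetical record; the identifiers and catalogue name are made up.
example_record = {
    "identifier": "10.1234/example.metadata.1",
    "title": "Example cohort dataset",
    "description": "Synthetic record used only to illustrate the checks.",
    "keywords": ["example", "FAIR"],
    "indexed_in": ["example-catalogue"],
    "data_identifier": "10.1234/example.data.1",
}

print(findability_gaps(example_record))
```

A check like this only detects missing fields. Whether an identifier is truly persistent, or metadata truly rich, still needs human or repository-level assurance.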
To be Accessible:
- A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
- A1.1. the protocol is open, free, and universally implementable.
- A1.2. the protocol allows for an authentication and authorization procedure, where necessary.
- A2. metadata are accessible, even when the data are no longer available.
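One common way to meet the Accessible criteria above is to resolve a DOI over HTTPS and ask for machine-readable metadata through content negotiation. The sketch below only builds such a request and does not send it. The function name `build_metadata_request`, the example DOI and the bearer-token scheme are assumptions for illustration; a given repository may use a different authentication mechanism.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_metadata_request(doi: str, token: Optional[str] = None) -> Request:
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    # A1 / A1.1: retrieval by identifier over an open, standard protocol (HTTPS).
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    # A1.2: credentials are attached only when the resource requires them.
    # The bearer-token scheme here is an assumption, not a FAIR requirement.
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return Request(f"https://doi.org/{quote(doi, safe='/')}", headers=headers)


# Hypothetical DOI used purely as a placeholder.
request = build_metadata_request("10.1234/example")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual retrieval. Because the request targets the identifier rather than a file location, the metadata can stay reachable even if the data itself is withdrawn (A2).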
To be Interoperable:
- I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
- I2. (meta)data use vocabularies that follow FAIR principles.
- I3. (meta)data include qualified references to other (meta)data.
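The Interoperable criteria above can be illustrated with a small JSON-LD record. This sketch assumes schema.org is an acceptable shared vocabulary for the dataset in question and uses its `isBasedOn` property as the qualified reference; the function name `dataset_jsonld` and both DOIs are hypothetical.

```python
import json


def dataset_jsonld(doi: str, name: str, derived_from_doi: str) -> dict:
    """Describe a dataset as JSON-LD using the schema.org vocabulary."""
    return {
        # I1 / I2: a formal, shared representation language and vocabulary.
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": f"https://doi.org/{doi}",
        "name": name,
        # I3: a qualified reference, stating how this dataset relates to another.
        "isBasedOn": {"@id": f"https://doi.org/{derived_from_doi}"},
    }


# Hypothetical identifiers used purely as placeholders.
record = dataset_jsonld("10.1234/example.data.2", "Derived example dataset", "10.1234/example.data.1")
print(json.dumps(record, indent=2))
```

The reference is "qualified" because the property name states the nature of the relationship, rather than simply listing a related identifier.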
To be Reusable:
- R1. (meta)data have a plurality of accurate and relevant attributes.
- R1.1. (meta)data are released with a clear and accessible data usage license.
- R1.2. (meta)data are associated with their provenance.
- R1.3. (meta)data meet domain-relevant community standards.
Why FAIR Data?
There is a great need for improved data management and stewardship across all disciplines, particularly as data quality and volume have increased significantly in recent years. The FAIR Data Principles provide guidelines on how to achieve this, and adopting them brings specific benefits to both researchers and organisations.
Benefits to Researchers
- Researchers can focus on adding value by interpreting the data rather than searching for, collecting or re-creating existing data. Data scientists have reported that these data-gathering tasks account for up to 80% of their working time.
- By knowing what data already exists, researchers can build upon existing work to further innovation rather than spending effort duplicating it.
- Data that is typically held in remote silos can be unlocked to help further research.
- Increased return in terms of credit and citations.
- Easier cross-fertilisation between disciplines.
Benefits to Organisations
- More efficient workforce with reduced time to impact.
- Maximises the value of investments and reduces double-funding. Approximately €26.2 billion per year is wasted in the European Union because researchers and innovators have not yet adopted the FAIR data principles (source).
- Rigorous management and stewardship of digital resources, helping researchers adhere to the expectations and requirements of their funding agencies.