FAIR data

FAIR data is data which meets the FAIR principles of findability, accessibility, interoperability, and reusability (FAIR).[1][2] The acronym and principles were defined in a March 2016 paper in the journal Scientific Data by a consortium of scientists and organizations.[1]
The FAIR principles emphasize machine-actionability (i.e., the capacity of computational systems to find, access, interoperate, and reuse data with no or minimal human intervention) because humans increasingly rely on computational support to deal with data as its volume, complexity, and rate of production increase.[3]
The abbreviation FAIR/O data is sometimes used to indicate that the dataset or database in question complies with the FAIR principles and also carries an explicit data‑capable open license.
FAIR principles published by GO FAIR
Findable
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
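As an illustration only (not part of the GO FAIR text), the Findable principles can be sketched as a small data structure. The class names, field names, and the `doi:10.1234/...` identifiers below are hypothetical, and the in-memory index merely stands in for a real searchable resource such as a repository catalogue.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    record_id: str    # F1: identifier of the metadata record itself
    data_id: str      # F3: identifier of the data the record describes
    title: str        # F2: descriptive metadata
    description: str
    keywords: list[str] = field(default_factory=list)


class SearchIndex:
    """Toy in-memory stand-in for a searchable resource (F4)."""

    def __init__(self) -> None:
        self._records: dict[str, MetadataRecord] = {}

    def register(self, record: MetadataRecord) -> None:
        # Refuse to reuse an identifier, since F1 asks for uniqueness.
        if record.record_id in self._records:
            raise ValueError(f"identifier already registered: {record.record_id}")
        self._records[record.record_id] = record

    def search(self, term: str) -> list[MetadataRecord]:
        # Case-insensitive substring match over the descriptive fields.
        needle = term.lower()
        return [
            r for r in self._records.values()
            if needle in r.title.lower()
            or needle in r.description.lower()
            or any(needle in k.lower() for k in r.keywords)
        ]


index = SearchIndex()
record = MetadataRecord(
    record_id="doi:10.1234/example.meta.1",   # made-up identifier
    data_id="doi:10.1234/example.data.1",     # made-up identifier
    title="Soil moisture measurements",
    description="Daily readings from a hypothetical field station",
    keywords=["hydrology", "soil"],
)
index.register(record)
matches = index.search("hydrology")
```

A real deployment would rely on an identifier registry to guarantee global uniqueness and persistence; the duplicate check here only enforces uniqueness within one index.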
Accessible
Once the user finds the required data, they need to know how they can be accessed, possibly including authentication and authorisation.
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
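As an illustration only (not part of the GO FAIR text), the sketch below shows one way to read A1 and A2. It assumes HTTPS as the open protocol and the `https://doi.org/` resolver with a CSL JSON `Accept` header for content negotiation; whether a given DOI returns metadata in that format depends on its registration agency. The `Repository` class is hypothetical, and no network request is actually sent.

```python
import urllib.request
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # assumption: DOIs are the identifier scheme


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Prepare (but do not send) an HTTPS request for a DOI's metadata (A1)."""
    url = DOI_RESOLVER + quote(doi, safe="/")
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    return urllib.request.Request(url, headers=headers)


class Repository:
    """Hypothetical repository that keeps metadata after data removal (A2)."""

    def __init__(self) -> None:
        self._metadata: dict = {}
        self._data: dict = {}

    def deposit(self, identifier: str, metadata: dict, data: bytes) -> None:
        self._metadata[identifier] = dict(metadata, status="available")
        self._data[identifier] = data

    def withdraw_data(self, identifier: str) -> None:
        # Remove the data but leave a metadata "tombstone" behind.
        del self._data[identifier]
        self._metadata[identifier]["status"] = "data withdrawn"

    def get_metadata(self, identifier: str) -> dict:
        return self._metadata[identifier]

    def get_data(self, identifier: str):
        # Returns None once the data have been withdrawn.
        return self._data.get(identifier)


request = build_metadata_request("10.1038/SDATA.2016.18")

repo = Repository()
repo.deposit("example-id-1", {"title": "Example dataset"}, b"raw bytes")
repo.withdraw_data("example-id-1")
```

After `withdraw_data`, `get_data` returns `None` while `get_metadata` still returns the record, now marked as withdrawn. Authentication and authorisation (A1.2) are left out of this sketch.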
Interoperable
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
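As an illustration only (not part of the GO FAIR text), one common way to approach I1 to I3 is to express metadata as JSON-LD using a shared vocabulary. The sketch below assumes the schema.org vocabulary and its `Dataset` type; the identifiers are made up, and the helper names are hypothetical. A reference is "qualified" here in the sense that the link names its relation (`isBasedOn`) rather than being a bare URL.

```python
import json

# schema.org properties treated as qualified references in this sketch.
RELATION_PROPERTIES = ("isBasedOn", "isPartOf", "citation")


def build_dataset_description(identifier: str, name: str, based_on: str) -> dict:
    """Return a JSON-LD description of a dataset using schema.org terms."""
    return {
        "@context": "https://schema.org",    # I1/I2: shared vocabulary
        "@type": "Dataset",
        "@id": identifier,
        "name": name,
        "isBasedOn": {"@id": based_on},      # I3: reference with a named relation
    }


def qualified_references(doc: dict) -> list:
    """List (relation, target) pairs found in a description."""
    refs = []
    for prop in RELATION_PROPERTIES:
        target = doc.get(prop)
        if isinstance(target, dict) and "@id" in target:
            refs.append((prop, target["@id"]))
    return refs


doc = build_dataset_description(
    identifier="https://doi.org/10.1234/example.derived",  # made-up identifier
    name="Derived example dataset",
    based_on="https://doi.org/10.1234/example.source",     # made-up identifier
)
serialized = json.dumps(doc, indent=2)
references = qualified_references(doc)
```

Other vocabularies could serve equally well; schema.org is used here only because it is widely known.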
Reusable
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
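As an illustration only (not part of the GO FAIR text), a minimal check for the R1 sub-principles could test whether a metadata record carries a licence, provenance, and a reference to a community standard. The field names `license`, `provenance`, and `conforms_to` are assumptions made for this sketch. The presence of a field says nothing about its quality, so this is a rough screen rather than an assessment of reusability.

```python
# Hypothetical mapping from R1 sub-principles to metadata field names.
REQUIRED_FIELDS = {
    "R1.1": "license",
    "R1.2": "provenance",
    "R1.3": "conforms_to",
}


def missing_reuse_fields(metadata: dict) -> list:
    """Return the R1 sub-principles whose field is absent or empty."""
    return [
        principle
        for principle, field_name in REQUIRED_FIELDS.items()
        if not metadata.get(field_name)
    ]


example = {
    "title": "Example dataset",
    "license": "CC-BY-4.0",
    "provenance": "Collected by a hypothetical field station in 2020",
}
gaps = missing_reuse_fields(example)
```

In this example `gaps` contains only `"R1.3"`, because the record names no community standard.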
The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).
— GO FAIR Foundation, FAIR Principles, https://www.gofair.foundation/
Acceptance and implementation
Before the FAIR principles were formulated, a 2007 paper was the earliest to discuss similar ideas about data accessibility.[4]
At the 2016 G20 Hangzhou summit, the G20 leaders issued a statement endorsing the application of FAIR principles to research.[5][6] Also in 2016, a group of Australian organisations developed a Statement on FAIR Access to Australia's Research Outputs, which aimed to extend the principles to research outputs more generally.[7] In 2017, Germany, the Netherlands, and France agreed to establish[8] an international office to support the FAIR initiative, the GO FAIR International Support and Coordination Office.[9]

Other international organisations active in the research data ecosystem, such as CODATA and the Research Data Alliance (RDA), also support FAIR implementations by their communities. The FAIR Data Maturity Model Working Group of RDA is exploring how to assess implementation of the FAIR principles.[10] CODATA's strategic Decadal Programme "Data for Planet: Making data work for cross-domain challenges"[11] mentions the FAIR data principles as a fundamental enabler of data-driven science. The Association of European Research Libraries recommends the use of FAIR principles.[12]
A 2017 paper by advocates of FAIR data reported that awareness of the FAIR concept was increasing among researchers and institutes, but that understanding of the concept was becoming confused as different people applied their own perspectives to it.[13]
Guides on implementing FAIR data practices state that the cost of a data management plan in compliance with FAIR data practices should be 5% of the total research budget.[14]
In 2019 the Global Indigenous Data Alliance (GIDA) released the CARE Principles for Indigenous Data Governance as a complementary guide.[15] The CARE principles extend principles outlined in FAIR data to include Collective benefit, Authority to control, Responsibility, and Ethics to ensure data guidelines address historical contexts and power differentials. The CARE Principles for Indigenous Data Governance were drafted at the International Data Week and Research Data Alliance Plenary co-hosted event, "Indigenous Data Sovereignty Principles for the Governance of Indigenous Data Workshop", held 8 November 2018, in Gaborone, Botswana.[16]
The lack of information on how to implement the guidelines has led to inconsistent interpretations of them.[2]
In January 2020, representatives of nine groups of universities around the world produced the Sorbonne declaration on research data rights,[17] which included a commitment to FAIR data, and called on governments to provide support to enable it.[18] In 2021, researchers identified the FAIR principles as a conceptual component of data catalog software tools, with the other components being metadata management, business context and data responsibility roles.[19] In April 2022, Matthias Scheffler and colleagues argued in Nature that FAIR principles are "a must" so that data mining and artificial intelligence can extract useful scientific information from the data.[20]
However, making data (and research outcomes) FAIR is a challenging task, and assessing their FAIRness is difficult as well.[21]
See also
- Data management
- Open access
- Open data – datasets and databases carrying an explicit data‑capable open license
- Open science
- Remix culture
References
- ^ a b Mark D. Wilkinson; Michel Dumontier; IJsbrand Jan Aalbersberg; et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship". Scientific Data. 3 (1): 160018. doi:10.1038/SDATA.2016.18. ISSN 2052-4463. PMC 4792175. PMID 26978244. Wikidata Q27942822.
- ^ a b Annika Jacobsen; Ricardo de Miranda Azevedo; Nick Juty; et al. (31 January 2020). "FAIR Principles: Interpretations and Implementation Considerations". Data Intelligence. 2 (1–2): 10–29. doi:10.1162/DINT_R_00024. ISSN 2641-435X. Wikidata Q76394974.
- ^ "FAIR Principles". GO FAIR. Retrieved 2020-02-16. Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.
- ^ Sandra Collins; Françoise Genova; Natalie Harrower; Simon Hodson; Sarah Jones; Leif Laaksonen; Daniel Mietchen; Rūta Petrauskaité; Peter Wittenburg (7 June 2018), "Turning FAIR data into reality: interim report from the European Commission Expert Group on FAIR data", Zenodo, doi:10.5281/ZENODO.1285272
- ^ G20 leaders (5 September 2016). "G20 Leaders' Communique Hangzhou Summit". europa.eu. European Commission.
- ^ "European Commission embraces the FAIR principles – Dutch Techcentre for Life Sciences". Dutch Techcentre for Life Sciences. 20 April 2016.
- ^ "Australian FAIR Access Working Group". www.fair-access.net.au. Retrieved 2020-04-03.
- ^ "Progress towards the European Open Science Cloud – GO FAIR". Government.nl. Ministry of Education, Culture and Science. 2017-12-01. Archived from the original on Feb 21, 2020. Retrieved 2020-02-15.
- ^ "GO FAIR Offices". GO FAIR. Retrieved 2023-12-05.
- ^ "FAIR Data Maturity Model WG". RDA. 2018-09-23. Retrieved 2020-02-16.
- ^ "Decadal Programme – CODATA". www.codata.org. Retrieved 2020-02-16.
- ^ Association of European Research Libraries (13 July 2018). "Open Consultation on FAIR Data Action Plan – LIBER". LIBER.
- ^ Barend Mons; Cameron Neylon; Jan Velterop; Michel Dumontier; Luiz Olavo Bonino da Silva Santos; Mark D. Wilkinson (7 March 2017). "Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud". Information Services & Use. 37 (1): 49–56. doi:10.3233/ISU-170824. ISSN 0167-5265. Wikidata Q29051495.
- ^ Science Europe (May 2016). "Funding research data management and related infrastructures" (PDF).
- ^ "CARE Principles of Indigenous Data Governance". Global Indigenous Data Alliance. Retrieved 2019-09-30.
- ^ O'Donnell, Dan (2021-12-16). "Thinking about the CARE Principles in the Digital Humanities". DARIAH-Campus.
- ^ Sorbonne Declaration on Research Data Rights, Jan 27 2020
- ^ Open data 'tougher' than open access and needs 'mindset change', Times Higher Education, January 31 2020
- ^ Ehrlinger, Lisa; Schrott, Johannes; Melichar, Martin; Kirchmayr, Nicolas; Wöß, Wolfram (2021), Kotsis, Gabriele; Tjoa, A Min; Khalil, Ismail; Moser, Bernhard (eds.), "Data Catalogs: A Systematic Literature Review and Guidelines to Implementation", Database and Expert Systems Applications - DEXA 2021 Workshops, Communications in Computer and Information Science, vol. 1479, Cham: Springer International Publishing, pp. 148–158, doi:10.1007/978-3-030-87101-7_15, ISBN 978-3-030-87100-0, S2CID 237621026, retrieved 2022-06-26
- ^ Scheffler, Matthias; Aeschlimann, Martin; Albrecht, Martin; Bereau, Tristan; Bungartz, Hans-Joachim; Felser, Claudia; Greiner, Mark; Groß, Axel; Koch, Christoph T.; Kremer, Kurt; Nagel, Wolfgang E. (2022-04-28). "FAIR data enabling new horizons for materials research". Nature. 604 (7907): 635–642. arXiv:2204.13240. Bibcode:2022Natur.604..635S. doi:10.1038/s41586-022-04501-x. ISSN 0028-0836. PMID 35478233. S2CID 248415511.
- ^ Candela, Leonardo; Mangione, Dario; Pavone, Gina (2024-05-27). "The FAIR Assessment Conundrum: Reflections on Tools and Metrics". Data Science Journal. 23: 33. doi:10.5334/dsj-2024-033.
External links
- FAIR Data and Semantic Publishing, a statement from the lab of the first author of the original paper
- Guide to FAIR Data from Dutch Techcentre for Life Sciences
- GO FAIR initiative website
- FAIR Principles with detailed description of each of the guiding principles by the GO FAIR initiative
- A FAIRy tale explaining the FAIR principles, published by the FAIR project