Strong data management and monitoring are essential when it comes to reusing scholarly digital assets. Without a rigorous and stable framework in place, the opportunity for driving discovery and innovation becomes limited.
That is why the FAIR Data Principles were published in 2016 – to enhance the reusability of data, with a particular emphasis on enabling machines to find and use data automatically so that it can be reused when it is needed in the future.
The theme for ConTech 2021 was, ‘What does digital transformation mean in the context of FAIR Data?’ With data quality and management growing in significance in scholarly publishing and research communities, we want to revisit the FAIR Data Principles. In this post, I'll explore the four elements of the FAIR Data Principles and look at some key related points in data management, including open/FAIR data.
The FAIR Data Principles
In 2016, open access journal Scientific Data published The FAIR Guiding Principles for scientific data management and stewardship. The authors outlined a framework for improving the Findability, Accessibility, Interoperability, and Reuse (FAIR) of digital assets.
We summarize each of the four FAIR elements:
Findable: An essential component for reusing data is the ability to find it. The process for locating data and metadata should be simple for both humans and computers. This is especially important for the latter, as the proper indexing of data and metadata enables automatic discovery.
Accessible: The next step of FAIR data is ensuring that it is accessible to users. Once found, the data must be retrievable, and retrieval may involve authentication and authorization where necessary.
Interoperable: The data needs to be based on standardized language to allow for seamless integration into, and operation within, other datasets, workflows and systems.
Reusable: Reusability is at the core of FAIR data and can be achieved by having the data and metadata richly described. Data that is reusable can be applied in future research.
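To make the four elements a little more concrete, here is a minimal Python sketch of a machine-readable metadata record and a check for gaps in it. The field names, the grouping of fields under each FAIR element, the identifier and the URL are all illustrative assumptions, not a formal metadata schema or anything prescribed by the FAIR Guiding Principles.

```python
import json

# Hypothetical metadata record for a dataset. Every value is a placeholder.
record = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "title": "Example survey dataset",
    "keywords": ["survey", "example"],
    "access_url": "https://example.org/datasets/example",  # placeholder URL
    "format": "text/csv",
    "license": "CC-BY-4.0",
    "description": "Responses collected for an example study.",
}

# Assumed mapping of fields to FAIR elements, for illustration only.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["format"],
    "reusable": ["license", "description"],
}


def missing_fields(metadata):
    """Return, for each FAIR element, the fields that are absent or empty."""
    return {
        element: [name for name in fields if not metadata.get(name)]
        for element, fields in FAIR_FIELDS.items()
    }


# Print the gaps as JSON so that a person or a script can read them.
print(json.dumps(missing_fields(record), indent=2))
```

A record that passes a check like this is not automatically FAIR, but an empty or missing field is an early sign that a person or a machine will struggle to find, retrieve or reuse the data.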
Why are data quality and management so significant?
‘Content creation and consumption is changing faster than ever before which is impacting business models, patterns of behaviour and virtually every aspect of the role of scientific information.’
SOURCE: ConTech 2021
The research landscape is continually evolving and growing. If this vast bank of scientific knowledge is to be maintained and usable in the future, there must be standards that can be applied to the data surrounding the research. The FAIR Data Principles are one way to approach developing data standards.
Ulrich’s Web shows records for over 48,000 active, scholarly peer-reviewed journals in all languages.
In 2020, Scopus data available via SCImago showed 4.2 million records under the category ‘citable documents’, which includes articles, reviews, conference proceedings and short surveys.
By ensuring data is stored and managed to a high standard, researchers, institutions, funders and publishers can be safe in the knowledge that data can be referred back to and built upon in future research.
How can researchers protect their own scholarly data?
Big changes are needed in order to improve data quality and management as a whole, and these changes will be driven by funders and publishers. That said, there are many actions individuals can take to future-proof their own research data:
1. Create a research data management plan
2. Find out about institution data policies
3. Keep clearly labelled versions of data
4. Use open file formats
5. Store data with a reputable cloud storage service as well as locally
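For researchers who work with scripts, steps 3 and 4 above can be sketched in a few lines of Python. This is only one possible approach under stated assumptions: the naming pattern (stem, version number, date), the choice of CSV as the open format, and the helper names `versioned_name` and `save_version` are all hypothetical.

```python
import csv
import hashlib
import tempfile
from datetime import date
from pathlib import Path


def versioned_name(stem, version, day=None):
    """Build a clearly labelled file name such as survey_v01_2022-01-31.csv."""
    day = day or date.today()
    return f"{stem}_v{version:02d}_{day.isoformat()}.csv"


def save_version(rows, header, folder, stem, version, day=None):
    """Write rows to a new CSV file and return its path and SHA-256 checksum."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / versioned_name(stem, version, day)
    if path.exists():
        # Refuse to overwrite: an existing version should stay untouched.
        raise FileExistsError(f"{path} already exists; use a new version number")
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    # The checksum lets you confirm later that a copy is identical.
    checksum = hashlib.sha256(path.read_bytes()).hexdigest()
    return path, checksum


# Usage example: write version 1 into a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path, checksum = save_version(
        [["a1", 3.2], ["a2", 4.8]], ["sample_id", "value"], folder, "survey", 1
    )
    print(path.name, checksum[:12])
```

Recording the checksum alongside each version makes it easy to check that the local copy and the cloud copy from step 5 still match.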
Explore more data management tips with Editage, and remember, it is important to choose practices that are sustainable and don’t rely on excessive admin time:
‘To begin with we suggest keeping it simple, use a few tools, and focus on fundamentals – be smart with the time you have available.’
The new alchemy: Online networking, data sharing and research activity distribution tools for scientists, Antony J. Williams, Lou Peck, Sean Ekins, 2017
The case for open/FAIR data
The open/FAIR data movement is growing fast. Researchers know how valuable their data is, and want to share it without barriers to create opportunities for feedback, collaboration and follow-up research.
There are widespread concerns that the quality and longevity of research data are inconsistent. Building on the original blog post from 2015, a group of professionals, including representatives of Crossref, California Digital Library, DataCite, eLife and Dryad, have come together to create the Principles of Open Scholarly Infrastructure (POSI).
‘…the POSI principles on governance, sustainability, and insurance of open infrastructures are topics that IOI hopes to research and address as part of its new strategic plan. The very reason that IOI and related initiatives such as SCOSS exist, is an acknowledgement of the current vulnerability of open infrastructures and the inadequacy of the funding mechanisms available for such infrastructures.’
Now is the time to work together toward open infrastructures for scholarly metadata, LSE Impact Blog, 2021
Research data quality and management, especially for open/FAIR data, are only going to increase in importance for scholarly communications and research. We'll soon be publishing the findings from our Research Data Management (RDM) project, sponsored by EBSCO, and the results from our RDM workshop at Researcher to Reader 2022.