This guide provides an introduction to engaging with research data management (RDM) processes.
The guide is of interest to university researchers and research data management professional support staff. This includes any staff who give advice to researchers on the storage, management, publication and archiving of their research data.
Research data is any information that has been collected, observed, generated or created to validate original research findings. Although usually digital, research data also includes non-digital formats such as laboratory notebooks and sketchbooks.
What is research data management?
'Research data management' is simply the effective handling of information that is created in the course of research.
Managing research data is usually an integral part of the research process, so you probably already do it. Most of the activities should be familiar: naming files so you can find them quickly; keeping track of different versions, and deleting those not needed; backing up valuable data and outputs; and controlling who has access to your data.
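As a minimal sketch of two of these habits (consistent file naming with version numbers, and checking that a backup copy is intact), the Python below may help. The naming convention and the function names `build_filename`, `sha256_of` and `backup_matches` are assumptions made for illustration, not a standard; your institution or discipline may recommend something different.

```python
import hashlib
import re
from datetime import date
from pathlib import Path


def build_filename(project: str, description: str, version: int,
                   created: date, extension: str) -> str:
    """Return a sortable, descriptive file name (hypothetical convention)."""
    def slug(text: str) -> str:
        # Lowercase the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{version:02d}.{extension.lstrip('.')}")


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """Return True when the backup has the same checksum as the original."""
    return sha256_of(original) == sha256_of(backup)
```

For example, `build_filename("Soil Survey", "pH readings", 3, date(2019, 5, 1), ".csv")` returns `soil-survey_ph-readings_2019-05-01_v03.csv`. Putting the date in ISO format and zero-padding the version number means that files sort in a sensible order in a directory listing.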
How research data is handled depends on the type of data involved, how that data is created or collected and how the data is to be used now and in future. For example, most data from experiments is reproducible; other data may not be repeatable, such as observations from the field.
However, any research outputs or data may be used to evidence published findings, or may be combined with other data to produce new types of data record.
Effective data management is carried out for the entire lifecycle of the data, from the point of creation through to dissemination, publication and archiving. Aspects of data management will usually continue long after the initial research project has ended.
Why should I manage my research data?
Engaging with the RDM process at your institution can provide benefits for you as well as your students, other researchers, your institution, and your external collaborators and partners.
Your research data is crucial as it is the evidence base for your research findings. Your research data is also a valuable resource that will have taken a great deal of time and money to create.
There are several good reasons to manage research data in an appropriate and timely manner, and they are closely linked to the reasons for sharing data. They can be seen as both sticks (requirements) and carrots (benefits).
The sticks - or research data management requirements
You can read about the main requirements and some unselfish reasons for good RDM below, but, if those don’t convince you, there is also a series of "self-interested" reasons covered in this entertaining article by Florian Markowetz.
Compliance with policies
Good RDM will benefit you and your institution by ensuring compliance with funders’ research data expectations and policies. Institutional policies may also be in place, often in response to mandates from funders.
Some funding bodies have introduced regulatory requirements. The RCUK Policy on Open Access states that "all papers must include … if applicable, a statement on how the underlying research materials – such as data, samples or models – can be accessed". The RCUK Common Principles on Data Policy state that "publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property".
Most funders now require the production of a data management plan (DMP). DMPOnline has been developed by the Digital Curation Centre (DCC) to help you write data management plans. It has templates for all funders, and guidance (where available) from your institution.
The EPSRC’s policy includes nine specific expectations concerning RDM; they assign primary responsibility for promotion of RDM to the research organisation and require it to provide systems, tools and support services to enable this. We support the implementation of RDM solutions for UK universities through our development of shared infrastructure and services, such as advice through our RDM toolkit.
The DCC also maintains a number of useful resources, in particular its guide on how to develop RDM services.
Ensure your data is accessible and shareable
On the subject of compliance, journal publishers increasingly require researchers to make all data underlying the findings described in their manuscript fully available without restriction at the time of publication. This usually means that the data is deposited in an accessible data centre or repository. This requirement applies to both commercially funded and publicly funded research.
A Jisc/Research Libraries UK (RLUK) service, SHERPA JULIET, lists open access publishing and data archiving policies. We are currently funding the development of the UK research data discovery service, which will aggregate metadata for research data held within UK universities and national, discipline-specific data centres to help ensure increased access to data.
Data also needs to be made FAIR (findable, accessible, interoperable and reusable), and one important aspect of that is describing the data properly with metadata.
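To illustrate what describing data with metadata can look like in practice, here is a minimal sketch in Python. The `DatasetRecord` class, its field names and the `REQUIRED` list are assumptions chosen for illustration; they are not taken from any particular repository or metadata schema, so check what your own repository expects.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """A hypothetical, minimal descriptive record for one dataset."""
    title: str = ""
    creators: List[str] = field(default_factory=list)
    publication_year: Optional[int] = None
    description: str = ""
    licence: str = ""
    identifier: str = ""
    keywords: List[str] = field(default_factory=list)


# Fields this sketch treats as essential (an assumption, not a standard).
REQUIRED = ("title", "creators", "publication_year", "licence", "identifier")


def missing_fields(record: DatasetRecord) -> List[str]:
    """List the required fields that are still empty."""
    values = asdict(record)
    return [name for name in REQUIRED if not values[name]]


def to_json(record: DatasetRecord) -> str:
    """Serialise the record so it can be stored alongside the data files."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Calling `missing_fields` before depositing a dataset gives a quick list of gaps to fill, and saving the output of `to_json` next to the data keeps the description and the files together.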
The EC-funded FOSTER portal is a useful place to locate training content on open science more generally.
Demonstrate responsible practice
By managing your research data and making it publicly available you will be able to demonstrate the responsible use of public resources to fund research. RDM good practice improves validation of research results and research integrity.
It’s worth remembering that the REF 2021 is committed to the fair and equal assessment of all types of research and forms of research output.
A group of research funders, sector bodies and infrastructure experts is working in partnership, as a forum, to promote the responsible use of research metrics. The forum has a programme of activities, including advice on, and work to improve, the data infrastructure that underpins metric use.
Work being carried out with Jisc by the University of Glasgow has resulted in a set of outputs that can help both depositors and users of data better understand the opportunities and limitations offered by various licences. The DCC has published guidance on how researchers should respond when faced with a freedom of information (FOI) request.
What are the carrots - or benefits - of good RDM?
Working across five EU countries, and supported by Jisc, the Knowledge Exchange gathered evidence, examples and opinions on current and future incentives for research data sharing from the researchers’ point of view. It wrote up recommendations for policy and practice development in 'Incentives and motivations for sharing research data, a researcher’s perspective'.
Keep your research safe and secure
You can reduce the risk of data loss by keeping your research data safe and secure: using robust and appropriate data storage facilities will help to prevent the loss of your data through accident or neglect.
Work with research support within your institution to make your data curation and preservation needs clear. This will help them put in place or advise on adequate storage for your data.
The right place for your research data is likely to be your institution’s own data repository or possibly a disciplinary repository. You can work out which by seeking advice from your local library or using an extensive list on the re3data.org registry of research data repositories.
Increase your research efficiency
You can increase your research efficiency: good research data management will enable you to organise your files and data so that they can be accessed and analysed without difficulty. This way you can track progress more easily, and reduce the risk that a departing team member takes with them valuable knowledge about the nature and extent of the work completed.
Improve your research integrity
Good data management can result in improved research integrity as well as act as validation for research results. Accurate and complete research data are an essential part of the evidence necessary for evaluating and validating research results and for reconstructing the events and processes leading to them.
Make your research outputs more visible
Making your data available enhances the visibility of your research outputs and can increase the number of citations they receive. Research data, if correctly formatted, described and attributed, will have significant ongoing value and can continue to have impact long after the completion of a research project. Piwowar and Vision (2013) found a “robust citation benefit from open data”.
Perhaps the most common reasons to retain and manage research data are to ensure reproducibility and to facilitate online sharing.
Data citation underpins the recognition of data as a primary research output rather than as a by-product of research. There are a number of data citation initiatives, including DataCite, a registry assigning unique digital object identifiers (DOIs) to research data. Using a DOI helps to make data citable, traceable and findable, so that research data, as well as publications based on that data, can form an alternative but important part of a researcher's output.
We are currently collaborating with the British Library to promote the use of persistent identifiers.
A DOI is also short and easy to share on social media.
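As a rough sketch of how a DOI can be turned into a shareable link and a data citation, consider the Python below. The function names, the loose pattern used to recognise a DOI and the citation layout are assumptions for illustration; the DOI in the usage note is made up, and your repository or publisher may prescribe a different citation style.

```python
import re
from typing import Sequence

# A deliberately loose check: "10.", a registrant code, a slash and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written before the bare identifier.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw: str) -> str:
    """Strip whitespace and any known prefix, returning the bare DOI."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"not a recognisable DOI: {raw!r}")
    return value


def doi_url(raw: str) -> str:
    """Return a resolvable link for the DOI via the doi.org resolver."""
    return "https://doi.org/" + normalise_doi(raw)


def cite_dataset(creators: Sequence[str], year: int, title: str,
                 publisher: str, doi: str) -> str:
    """Format a simple data citation (hypothetical layout)."""
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. {doi_url(doi)}"
```

For instance, `doi_url("doi:10.1234/example.5678")` returns `https://doi.org/10.1234/example.5678`, a short link that can be pasted into a paper, a web page or a social media post.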
As research becomes increasingly complex, researchers can pool their resources towards a common goal: new discovery. Collaborations benefit not only researchers but journals as well. For instance, journals such as Nature and Science have found a positive correlation between the number of authors on a publication and their impact factor.
You could be providing opportunities for collaboration with other researchers within your discipline, or even with other disciplines, by facilitating the sharing and re-use of research data for future research. Sharing well-managed research data and enabling others to use it will also help to prevent duplication of effort.
Advanced computing capabilities help researchers manipulate and explore massive datasets. This idea is articulated by Microsoft AI developers, who are creating datasets so that researchers from competing institutions can share knowledge and build on each other’s work.
How to get started
As both the creators and users of research data, researchers are crucial in the development of research data management and data sharing services.
Overall, by managing your data well, and working within the policies and frameworks that apply to you, you could increase debate and the potential for new enquiry in your field. This should help to ensure that you continue to receive funding, and open you up to innovation and potential new uses of your data.
There are some excellent training programmes available online, including Mantra, a free online course developed through Jisc funding. Mantra is designed for researchers or others who manage digital data as part of a research project.
Working with your institution
To put in place a supportive infrastructure your institution needs to understand your research, its patterns and timetables, motivations and priorities. Your university management should define expectations through an RDM policy, and support staff will deliver services.
As the primary data creator you should aim to:
- Manage your data appropriately within your institutional policy and the guidelines set out locally or for your discipline; the main way you can do this is by creating a data management plan at the outset of a project and revisiting it on a regular basis;
- Make sure you clearly articulate - in terms of the data creation, use and management - the requirements, opportunities and obstacles you might encounter while doing your research so your institution or other infrastructure providers can help support you in keeping your data safe;
- Use your institutional repository, or an appropriate disciplinary repository, for storing your research data, outputs and publications.
Jisc is engaged in delivering a range of work to help universities and others address the urgent challenges involved in sharing and managing research data. You can follow what we are doing on our blog. Our current priority is to deliver to the sector an effective end-to-end solution for research data management. In November 2018 the Jisc open research hub was launched for managing, preserving and sharing institutional digital research data.
Now, we are aiming to broaden our perspective to address the wider open science agenda, with the aim of including all research outputs (including publications and methods – provenance/code/metadata). Ultimately, the service should form part of the Jisc open research offer and join up with other Jisc services.