Sound research rests on the ability to evidence, verify and reproduce results – managing your data enables all three.
First published in the Guardian Higher Education Network blog.
The availability of research data – the digital data or analogue sources that underpin research findings – is high on the agenda of higher education policy makers, funders and researchers committed to open practice. Sound research rests on the ability to evidence, verify and reproduce results.
Obvious as this sounds, the practice of making research data available is surprisingly limited. Take the 2010 Reinhart-Rogoff paper on economic growth, which was found to contain errors and to exclude some data, significantly undermining its results. The paper was published in a prestigious journal, the American Economic Review, which seemingly failed to enforce its own data availability policy. As a result, the errors were only discovered this year.
The drivers for greater research data availability are not just to do with verifying results and uncovering errors. The Royal Society's landmark report, Science as an Open Enterprise, stresses the potential for data reuse and a need for rapid data sharing so that we can respond to global challenges, such as flu epidemics or disaster risks. Data may often have uses unforeseen by the original creators and further information may be extracted by applying different techniques or integrating with other data sets.
Let's be clear though, not all research data can or should be made openly available. There are often very good reasons that prevent the sharing of research data, including concerns for individual privacy or commercial confidentiality. However, where such conditions apply, it is even more important for researchers and research institutions to ensure that data is well managed and securely stored.
The recent Engineering and Physical Sciences Research Council (EPSRC) policy places emphasis on the research organisation and its responsibility to promote research data management (RDM) practice and provide tools and resources that enable this. While undoubtedly challenging for many universities struggling with tightening budgets and daunted by the sheer volume of data being produced, effective data management does also present opportunities. So how can universities respond to these challenges and realise this potential?
Over the last two years, Jisc's Managing Research Data (MRD) programme has run a set of 17 projects to pilot research data management services in universities. In parallel, the Digital Curation Centre has also undertaken a series of 21 institutional engagement projects providing tailored support to increase research data management capability.
The early findings have been summarised in a guide to help higher education institutions understand their key aims and issues in planning and implementing research data management services. Here are seven key steps to help you improve RDM at your university:
1) Understand how your institution deals with research data
Do you know what research data you hold and where it is? How is that data being stored, backed up, shared and managed? Are you exposed to risk, for example data loss, security breaches or reputational damage? What proportion of your data are you obliged to preserve and share in the long term? Is the level of support and services that you currently provide sufficient?
We found that many universities had little idea of the volume of research data being created and how it was managed. Without this knowledge it is very difficult to improve your RDM practices. A useful first step is to conduct data surveys and interviews – the Data Asset Framework and Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO) tools can help organisations to understand and benchmark current RDM practice and infrastructure.
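Surveys and interviews can be complemented by a rough automated inventory of shared storage. The sketch below is a minimal illustration, not part of the Data Asset Framework or CARDIO: it walks a directory and reports the number of files and total bytes per file extension. The function name `summarise_storage` and the example path are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def summarise_storage(root):
    """Return {file extension: (file count, total bytes)} for all files under root."""
    summary = defaultdict(lambda: [0, 0])
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group by lower-cased extension; files without one go under "(none)".
            ext = path.suffix.lower() or "(none)"
            summary[ext][0] += 1
            summary[ext][1] += path.stat().st_size
    return {ext: tuple(counts) for ext, counts in summary.items()}


# Example usage (hypothetical path):
# for ext, (count, size) in summarise_storage("/research/shared").items():
#     print(ext, count, size)
```

A scan like this only shows what is stored and how much of it there is. It says nothing about ownership, sensitivity or retention obligations, which is why it cannot replace the surveys and interviews.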
2) Build a case for RDM and gather support
It is unlikely that current provision and practice will be sufficient, so you will probably need to make the case for RDM. The universities we worked with found it invaluable to present evidence of current practice and expected demand from data surveys. Without such evidence, university managers are unlikely to be persuaded of the need to invest in RDM services.
It is also useful to gather support by establishing an RDM steering group and securing the input of lead researchers as data champions to help spread good practice to others.
3) Define your institution's position on RDM to establish policy and strategy
To provide guidance and support you need to be clear about your position on RDM. There are existing policies and roadmaps which you can use to help get you started. For some universities, the research data policy has been an 'aspirational' statement of principle providing a rationale for investment, while for others the strategy and elements of a service have been the priority.
In all cases, close collaboration between library services, IT services and the research support office has been essential. We found that few institutions made progress without high-level support and senior research advocates.
4) Ensure researchers are aware of what support is available
The data surveys that universities ran often found researchers weren't aware of current support. A quick win can be to provide RDM guidance pages which collate details of support and provide basic pointers on good practice.
Many universities have also raised awareness in RDM training and advocacy sessions, developing training materials that you may find useful. These include:
• A general online course targeted at postgraduate researchers (but useful to anyone wishing to understand research data management).
• Training materials for postgraduate and early career researchers in a range of subjects including archaeology, creative arts, health studies, psychology, social anthropology and – still in development – for astronomy and physics.
5) Provide easy to use, robust data storage
Surveys found that current RDM practice is generally good for the short-term, with solutions typically being ad hoc and local. Given such a state of affairs, it is unsurprising that during this work the projects and institutional engagement uncovered substantial evidence of data loss.
One survey revealed that nearly a quarter of researchers had suffered significant data loss. Others found cases of high-value data held on inadequate hardware, while another examined the costs of losing data and estimated that ad hoc systems are between two and four times more expensive than centrally provided services. Although these are rough estimates, they certainly demonstrate that centrally provided storage is competitive when the total cost of ownership and data loss are taken into account.
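To see how data loss can tip the comparison, the sketch below folds the expected cost of losing data into an annual total cost of ownership. It is an illustration only: the function name and every figure are hypothetical and are not taken from the surveys.

```python
def total_cost_of_ownership(storage_cost, loss_probability, cost_per_loss):
    """Annual cost = direct storage cost + expected cost of data loss."""
    return storage_cost + loss_probability * cost_per_loss


# Hypothetical annual figures per research group, for illustration only.
ad_hoc = total_cost_of_ownership(
    storage_cost=500, loss_probability=0.25, cost_per_loss=10_000
)
central = total_cost_of_ownership(
    storage_cost=1_000, loss_probability=0.01, cost_per_loss=10_000
)

# How many times more expensive the ad hoc option is under these assumptions.
ratio = ad_hoc / central
print(f"ad hoc: {ad_hoc:.0f}, central: {central:.0f}, ratio: {ratio:.2f}")
```

With these invented numbers the ad hoc option has the lower direct storage cost but the higher total once the risk of loss is priced in. Real figures would need to come from your own data survey.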
6) Make it easy for others to find and cite research data
There is a growing demand among researchers for services which allow research data to be published, described with adequate metadata (or information about the data) and provided with robust identification, for example, DataCite's digital object identifier. Making data available in this way allows it to be used and cited in literature, thereby increasing researchers' scholarly impact and reputation.
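As a rough illustration of what "adequate metadata" might mean in practice, the sketch below checks a dataset record for a small set of descriptive fields before publication. The field list is an assumption loosely modelled on the kind of mandatory properties DataCite asks for (identifier, creator, title, publisher, publication year, resource type), so check the current schema before relying on it. The record, the DOI and the function name are hypothetical.

```python
# Assumed minimum set of fields; verify against the metadata schema you adopt.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record; the DOI below is a made-up placeholder.
record = {
    "identifier": "10.1234/example-dataset",
    "creators": ["Example, A."],
    "title": "Example survey responses",
    "publisher": "Example University",
    "publication_year": 2013,
    "resource_type": "Dataset",
}

print(missing_fields(record))  # an empty list means the record is complete
```

A check like this could run when a researcher deposits data, so that incomplete records are caught before an identifier is assigned.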
A number of studies have indicated that making research data available alongside publications is associated with an increase in citations.
7) Stay ahead of your peers
To keep pace with your peers, you need to consider how to support the management and sharing of research data. Many universities have begun to develop RDM services and a number of them have committed significant resources to continue this work. In these cases, between two and five full-time equivalent posts are covering technical development, systems support and research liaison.
Poor awareness of RDM has also been shown to cause the loss of research income. The data.bris project identified a case where an inadequate data management plan had led to a research grant proposal being rejected. After the plan was rewritten with the help of the university's RDM team the proposal was successful.
RDM may seem to present lots of challenges, but there are also clear benefits to be gained. By managing and sharing data effectively you can boost your research standing, advance research and enhance your university's reputation. The lessons and models are freely available for you to take and reuse, so what are you waiting for?