In this time of crisis, good research data management (RDM) has never been more important.
It was while I was doing my PhD that I first recognised the immense potential of good research data management.
I was studying epigenetic changes in the embryos of a particular type of African toad. As it happened, my study moved more and more towards big data and suddenly, whilst trying to map the toads’ genome sequence, I had to carry out large-scale data analysis even though I'd never done any programming in my life.
Because these toads had not been studied by many researchers, there wasn’t a genome sequence available. So how did we get this sequence together?
A collaborative effort
A group of researchers from the University of Texas in the US led a collaborative effort where people from all over the world could send in their unpublished toad genomic data. They pieced together the first community-driven genome of that species.
It was eye-opening to see how much progress can be made when raw data is shared and published.
It showed, even for our relatively small project, how much faster we were able to progress when we came together as a community, and how much more we were able to achieve.
Using FAIR principles
Scientific data management has gained traction since the 1990s and has matured with the introduction of the FAIR (findable, accessible, interoperable and reusable) principles. The FAIR principles emphasise machine processing, because humans increasingly rely on computational support to deal with the growing volume and complexity of data and the speed at which we now create it.
Despite the growing support for suitable RDM, many research-performing organisations are still reluctant to apply the FAIR principles or share their datasets due to real or perceived cost.
A major study by PricewaterhouseCoopers estimated the cost of not implementing robust RDM in the EU at 10.2 billion euros per year, and this figure is bound to grow unless more researchers buy into RDM.
How data management can benefit everyone
It was eye-opening to see how much faster we progressed when we came together as a community.
Very often, when we advocate RDM, it is seen as a policy requirement, or as a part of a funding or publishing contract. It would be much more helpful if researchers saw the selfish benefits of RDM. In reality, it’s something that will save them time while driving their career progression.
A study of more than half a million papers from open access publishers PLOS and BioMed Central (BMC) found that sharing data in a repository was associated with an average 25% increase in citations to the corresponding research papers.
Good documentation and version control also save a lot of time in the long term, says Florian Markowetz, a senior group leader at the Cancer Research UK Cambridge Institute, in his article ‘Five selfish reasons to work reproducibly’.
How do we convince researchers that data management is not just inconvenient admin?
At Delft University of Technology (TU Delft), we home in on ‘what’s in it for the researcher’ and have assembled compelling case studies illustrating the benefits for individual researchers.
Another way of engaging researchers in RDM is to identify data management champions: researchers who are already using good practices and are willing to actively engage with peers.
These people don’t need convincing, as they already recognise that good data management benefits them directly. This hyper-local, micro approach works best. It’s personal and relatable, not some abstract advice coming from a different university or country.
Unfortunately, there's no single solution that would work for all universities and research institutions. In our recently published book, we’ve shown a range of solutions that might apply to different research settings, depending on the resources available.
In the ideal situation, there’s time to meet researchers one-to-one, get to know them, and understand what they’re working on and what level of support they need. Meeting researchers in person also helps to convince sceptical academics, some of whom perceive data management as little more than an extra administrative burden on their already crammed agenda. Talking helps. For instance, we discuss what would happen if the key person from their research project suddenly fell ill or left the research group. Would they be able to carry on their research?
The key issue with data documentation is making sure it’s understandable. In my experience of coordinating one of the largest institutional data stewardship programmes, at TU Delft in the Netherlands, it’s easiest to engage sceptical researchers not by talking directly about data management, but by focusing on their passions. Start with what they like to talk about.
“Data management is about human interaction”
To scale data stewardship support in universities, a national group was created to professionalise data stewardship and create a job profile so we can appoint the right people to drive change.
We found that the most important thing is not whether data stewards have a background in research but whether they have the empathy to put themselves in the shoes of their fellow researchers, to understand their problems and come up with solutions jointly.
Technical skills can be learned on the job, but it’s the social skills that will determine the impact of data stewards.
Meanwhile, to address researchers' lack of engagement with RDM, we need systemic changes to the reward system, in which publication is seen as the holy grail for career progression and recognition. Today’s researchers are under tremendous pressure to publish as much as possible, as fast as possible. Instead, we need to focus on delivering high-quality research that includes solid RDM documentation, whether we are recording toad sequences or searching for a cure for new viruses.
Reported by Faye Holst, senior media and communications officer, Jisc.