Comparative Study of International Approaches to Enabling the Sharing of Research Data
JISC has commissioned this study to survey the different national agendas that are addressing variant infrastructure models, to inform developments within the UK and to facilitate an internationally integrated approach to data curation.
Executive Summary
The current methods of storing research data are as diverse as the disciplines that generate them and are necessarily driven by the myriad ways in which researchers need to subsequently access and exploit the information they contain. Institutional repositories, data centres and all other methods of storing data have to exist within an infrastructure that enables researchers to access and exploit the data, and variant models for this infrastructure can be conceptualised. Discussion of effective infrastructures for curating data is taking place at all levels, wherever research is reliant on the long‐term stewardship of digital material.
The study of data sharing initiatives in the OECD countries confirmed the traditional perception that the policy instruments are clustered at the upper end of the stakeholder taxonomy, i.e. at the level of national and research funding organisations, whereas the services and practical tools are being developed by organisations at the lower end of the taxonomy. Despite the differences that exist between countries in terms of the models used for research funding, as well as the levels at which decisions are taken, there is agreement on the expected strata of responsibility for applying the instruments of data sharing. This supports the structure of the stakeholder taxonomy used in the study.
Policy Support for Data Sharing
The lack of a universal model for data sharing policies appears to be a fundamental consequence of research funding models differing between individual countries. This study found no evidence of either a universal model or agreement on what a data sharing policy should include.
On an international level, the key players (organisations like OECD, UNESCO, EU and interest groups like CODATA, ESFRI) have concentrated their policy statements around the principle of open access to publicly funded research outputs. While OECD, UNESCO and CODATA have policies explicitly for data sharing, the European Commission is looking at data sharing issues in the broader context of open access to public domain information.
No national level policies or strategic documents that explicitly mandate the sharing of research data were found. Nevertheless, the provision of access to research data is seen as a vital element of the general research infrastructure, and all research infrastructure development strategies acknowledge the need to develop the means for accessing data. Applying Open Access principles to data is discussed at the national level in Germany.
The main burden of developing and implementing data sharing policies is currently being carried by research funding agencies, with an expectation (but not a mandate) that individual research institutions and departments will follow these up with their own policy statements. Measures to motivate researchers to share their data include attaching conditions to funding schemes or providing data sharing policies backed up by services offered to recipients of funding. The prospect of a more pro‐active stance in mandating the sharing of data is evidenced in the recent initiatives of funding agencies to agree on common principles for data sharing.
Typically, but not in all cases, the funding agency policies draw on the following incentives and enablers:
| Policy Enablers | Aspects Covered |
| --- | --- |
| International level examples and statements | General policy statements |
| National strategic planning documents and mandates | Obligation / mandate to share data |
| Research associations’ statements and codes of ethics | Division of responsibilities between stakeholders |
| Open Access principles | What data sharing channels should be used |
| Government funding for research infrastructure | How can the costs involved in data sharing be covered |
| Government audit and watchdog offices’ reports and requirements | What sanctions can be applied if the data sharing requirements are not being met |
| | Data access principles and protection of data subjects’ rights |
| | Conditions of exclusive use of data |
The emerging institutional policies still remain ad hoc and do not appear to be well coordinated. To develop uniform data sharing policies and put them into practice, the institutions will currently require significant help and guidance.
Data Sharing Infrastructure Provision
Policies alone will not result in a higher use of research data. Optimum accessibility and usability of data presuppose a trajectory of proper organisation and curation of data, with access services and analysis tools that provide the researchers with added value.
Proposals for national data services have opted for a distributed, umbrella‐type approach where the national service provides the environment for repositories (common principles and standards that data repositories in the country apply) and develops tools that facilitate interaction between repositories. The main expected outcomes are better data curation and dissemination services that are based on shared tools and principles.
Data archives and centres funded directly by research funding agencies are the dominant class of data repositories. However, research domains vary in how data curation and sharing infrastructure is offered and in the models by which it is used. In domains such as the social sciences and medicine a strong tradition exists for depositing data in national data centres, which are usually directly funded by the funding agencies; in astronomy, biomedicine, earth sciences and physics, data centres with a profile of international dissemination are favoured by researchers. The first examples of funding agencies relying on a network of institutional data repositories are emerging (e.g. AHRC in the UK, which stopped funding the AHDS and is relying on institutional service provision, or the Helmholtz Society in Germany), whilst some data centres are offering services to more than one funding agency (e.g. ICPSR in the US). Nonetheless, differences still remain in the degree to which funding agencies take responsibility for data sharing, as well as the extent to which they communicate data sharing principles to their research community.
Institutional repositories have until recently put emphasis on the deposit of textual research output. The scope of these repositories is gradually being extended to cover research data as well, but the overall number of stored datasets is very low. Institutional data repositories hold promise for the future with the advantage of being close to researchers, but are at present entangled in a maze of shortages in expert know‐how and resources, unclear responsibilities for maintaining the repository (e.g. university library vs IT services), and insufficient institutional policy support. The business case for supporting a data repository is not yet clear for many research institutions.
Data Federation and Access Services
In the increasingly international and interdisciplinary context of research, locating data in disparate repositories in different countries, gaining access to them through a web of licence agreements in different languages, and re‐using them in a multitude of file formats can be a daunting task. These barriers are not easy to overcome – the sheer diversity of data makes it difficult to design tools with the range and ability to accommodate and translate between the distinctly different data needs of the various domain communities.
To bridge these gaps, a significant portion of data sharing infrastructure funding is being allocated to developing technical solutions for data federation from different repositories in one research domain and across domains. Portal services are emerging that harvest metadata from disparate data repositories and allow the creation of entire cross‐sections of research output on national or research domain levels. Digital repository system tools are appearing that allow the integration and management of textual, multimedia and data object collections.
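As an illustration only, the portal idea described above (harvest metadata from several repositories, then present a cross‐section for one research domain) can be sketched in a few lines of Python. Everything in the sketch is an assumption made for the example: the `Record` fields, the repository names `repo_a` and `repo_b`, and the sample datasets are hypothetical and are not drawn from any service surveyed in the study. A real portal would receive records through a harvesting protocol (OAI-PMH is one widely used option) rather than from in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One harvested metadata record (hypothetical minimal schema)."""
    identifier: str   # dataset identifier as exposed by the repository
    title: str
    domain: str       # research domain, e.g. "social sciences"
    repository: str   # name of the source repository


def harvest(repositories):
    """Merge records from all repositories, keeping the first copy of each identifier."""
    seen = {}
    for records in repositories.values():
        for record in records:
            # setdefault leaves an existing entry untouched, so duplicates are dropped
            seen.setdefault(record.identifier, record)
    return list(seen.values())


def cross_section(records, domain):
    """Return the sorted titles of all harvested records in one research domain."""
    return sorted(r.title for r in records if r.domain == domain)


# Hypothetical sample data standing in for harvested metadata feeds.
REPOSITORIES = {
    "repo_a": [
        Record("ds-1", "Household survey 2005", "social sciences", "repo_a"),
        Record("ds-2", "Seismic readings", "earth sciences", "repo_a"),
    ],
    "repo_b": [
        # Same dataset as ds-1 above, also listed by a second repository.
        Record("ds-1", "Household survey 2005", "social sciences", "repo_b"),
        Record("ds-3", "Labour force panel", "social sciences", "repo_b"),
    ],
}

merged = harvest(REPOSITORIES)
social = cross_section(merged, "social sciences")
print(social)
```

The sketch deduplicates on the identifier alone, which presumes that repositories share an identifier scheme. Where they do not, matching records across repositories is part of the federation problem the paragraph above describes.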
These services are predominantly developed by short‐term projects, which inevitably are faced with the transition to a sustainable service environment, with a long‐term financial and business structure (e.g. the CARMEN project). Development of data access tools and services has started to receive government funding and backing in several countries (e.g., the US, Australia, Netherlands, France).
Data Sharing Services in Support of the Research Process
To support collaboration between research groups, tools are emerging for the dissemination and sharing of data between disparate groups across diverse disciplines. The data often need to be shared between small and medium sized laboratories and institutes that may have very different computing environments and levels of IT expertise. To help with automation of the research process and reduce the effort that goes into data conversion, various virtual research environments and researchers’ toolbox solutions are being developed. These are predominantly project‐based initiatives at this stage, but in the case of Germany and Japan have the backing of a nation‐wide platform.
Researcher Skills for Data Sharing
Data publishing to a standard that facilitates re‐use requires the effective planning and management of data throughout the life‐cycle of a project. Studies in the UK and Australia have demonstrated low awareness of policies and requirements, and a lack of adequate data management skills among researchers. Similar conclusions have been drawn from digital library user surveys. Researchers require guidance in translating policy requirements, including open access policy, into operational tasks for which they can plan and take responsibility.
Examples of data management plans, which are increasingly required as a condition of funding, have been produced in Australia. Good examples of data management and curation manuals have been developed by the UK Data Archive, DCC and ICPSR.