Increasing repository content through automation & services
This project built upon the first phase of White Rose Research Online, a shared institutional repository for the Universities of Leeds, Sheffield and York. The project aimed to increase content in White Rose Research Online; to automate aspects of the repository ingest process; and to start to embed the repository within research workflows by lowering barriers to deposit.
Executive Summary
This was an 18-month project (subsequently extended to 20 months) to enhance White Rose Research Online (WRRO), a shared repository of research outputs (primarily publications) from the Universities of Leeds, Sheffield and York that runs on the EPrints open source repository platform. The repository was created in 2004 and grew steadily but, in common with many similar repositories, had difficulty achieving a 'critical mass' of content and becoming truly embedded within researchers’ workflows.
The main aim of the project was to assess ingestion routes into WRRO with a view to lowering barriers to deposit. We reviewed the feasibility of bulk import of pre-existing metadata and/or full-text research outputs, hoping this activity would have a positive knock-on effect on repository growth and embedding. Prior to the project, we had identified researchers’ reluctance to duplicate effort in metadata creation as a significant barrier to WRRO uptake, so we investigated how WRRO might share data with internal and external IT systems. This work included a review of how WRRO, as an institution-based repository, might interact with the subject repository of the Economic and Social Research Council (ESRC).
The project addressed four main areas:
- Researcher behaviour: We investigated researcher awareness, motivation and workflow through a survey of archiving activity on the university websites, a questionnaire and discussions with researchers.
- Bulk import: We imported data from local systems, including York’s submission data for the 2008 Research Assessment Exercise (RAE), and developed an import plug-in for use with the arXiv repository.
- Interoperability: We looked at how WRRO might interact with university and departmental publication databases and ESRC’s repository.
- Metadata: We assessed metadata issues raised by importing publication data from a variety of sources.
A number of outputs from the project are available from the project website.
The project highlighted the low levels of researcher awareness of WRRO and of broader open access issues, including research funders’ deposit requirements. We designed some new publicity materials to start to address this. Departmental publication databases provided a useful jumping-off point for advocacy and liaison; this activity was helpful in promoting awareness of WRRO. Bulk import proved time-consuming, both in terms of adjusting EPrints plug-ins to incorporate different datasets and in the staff time required to improve publication metadata.
A number of deposit scenarios were developed in the context of our work with ESRC; we concentrated on investigating how a local deposit of a research paper and attendant metadata in WRRO might be used to populate ESRC’s repository. This work improved our understanding of researcher workflows and of the SWORD protocol as a potential (if partial) solution to the single deposit, multiple destination model we wish to develop; we think the prospect of institutional repository / ESRC data sharing is now a step closer.
The project experienced some staff recruitment difficulties. It was also necessary to adapt the project to the changing IT landscape at the three partner institutions – in particular, the introduction of a centralised publication management system at the University of Leeds. Although these factors had some impact on deliverables, the aims and objectives of the project were largely achieved.