FAIR Synthesis: Repository and Related Software
When developing an institutional repository, one of the most important considerations is what software to choose. There is a wide variety of open source software available and the majority of FAIR projects have followed this route. Follow the links below to find out how FAIR projects evaluated software and what packages they chose for repositories and other aspects of the projects.
Selecting and Evaluating Repository software
There are many open source software packages that can be used to create repositories. Though they have similar features, each is unique. The Open Society Institute (OSI) Guide to institutional repository software is a useful guide to what’s available; the DAEDALUS project contributed to this. Making a decision can be complex and involves careful thought about factors like what the repository will contain, how it will be used, and the local technical environment and skill sets.
Most of the projects developing repositories for e-prints selected EPrints, though some have opted for DSpace. The projects developing repositories for ETDs chose DSpace, though they also considered ETD-db, a package developed especially for ETDs at Virginia Tech. DAEDALUS has developed two separate repositories using EPrints (for post-prints) and DSpace (for pre-prints, theses, and grey literature). Several projects have written reports about their experience of selecting software and comparing different packages. The processes they followed and the criteria they used will be of interest to institutions making similar decisions.
Experience selecting software
- A report to JISC on the decision to develop an ETD service using DSpace rather than ETD-db, William J Nixon, Morag Mackie and Lesley Drysdale, available at https://dspace.gla.ac.uk/handle/1905/195 (DAEDALUS)
- Key issues to be taken into consideration when selecting a software system, available at http://www.rgu.ac.uk/library/guidelines/options.html (Electronic Theses)
- DSpace and ETD-db comparative evaluation, Richard Jones, August 2003, available at http://www.thesesalive.ac.uk/arch_reports.shtml (Theses Alive!)
- Jones, R., DSpace vs. ETD-db: Choosing software to manage electronic theses and dissertations, Ariadne, 2004, Issue 38, available at http://www.ariadne.ac.uk/issue38/jones/ (Theses Alive!)
Experience using software
- Nixon, W.J., DAEDALUS: Initial experiences with EPrints and DSpace, Ariadne, 2003, Issue 37, available at http://www.ariadne.ac.uk/issue37/nixon/ and https://dspace.gla.ac.uk/handle/1905/197 and in Japanese at http://www.nii.ac.jp/metadata/oai-pmh/nixon2/. A follow-on report to be prepared during 2005 will offer a comparison of using EPrints and DSpace within the project (DAEDALUS)
- Report on the technical issues of using GNU EPrints Software for the development of an institutional e-Print repository at the University of Southampton, Gutteridge, C.J., Hitchcock, S., Simpson, P. and Hey, J., 2003, University of Southampton, Southampton, available at http://eprints.soton.ac.uk/archive/00000184/ (TARDis)
Software used by FAIR projects
The following software has been used by FAIR projects to support their work:
GNU EPrints
The EPrints software was first developed by the Department of Electronics and Computer Science at the University of Southampton as part of the CogPrints project, funded under the JISC eLib Programme. It was designed as repository software for e-prints, electronic versions of research articles, in either pre-print or post-print versions (or both). It continues to be developed within the Department with support from the JISC, and it also has a wide user community around the world: EPrints has the largest installed base of the repository systems described in the OSI Guide to Institutional Repository Software. EPrints software is made available under the GNU GPL open source licence. See http://software.eprints.org for further information and downloads.
DAEDALUS has used EPrints software version 2.2.3 for the creation and development of their OAI-compliant published and peer-reviewed papers archive. This is available on open access at http://eprints.gla.ac.uk/.
Electronic Theses has tested EPrints software version 2.1.1 to investigate its use as an ETD archive. EPrints was selected as an option because of the range of institutions already using this software for institutional repositories. The project was keen to investigate how such repositories might be used for ETDs as well, to avoid creating additional, over-specialised repositories.
HaIRST has used EPrints software version 2.0 (with adaptations) to create and develop an e-print archive for the University of Strathclyde (Strathprints). A pilot version of this is available at http://strathprints.cdlr.strath.ac.uk/, though note that this implementation has been used primarily for technical testing and not for building a collection of content. The open source nature of EPrints has allowed the HaIRST team to make a number of adaptations to the software to suit local requirements, and these have been fed back to the EPrints developers at the University of Southampton. The project has also used EPrints software version 2.0 to create and develop an e-print archive for the University of St Andrews, available at http://eprints.st-andrews.ac.uk/.
The SHERPA project is creating e-print repositories at its development partner (with the exception of Arts & Humanities Data Service which is focussing on preservation and licence issues) and associate partner institutions. The following are based on EPrints software:
TARDis has used EPrints software version 2.0 for the creation and development of a multi-disciplinary institutional repository for the University of Southampton. The work of the project has fed back into and influenced the ongoing development of the EPrints software. A new data entry interface was designed as part of this work with input from an HCI specialist; this is now available for wider use as required and a report of this work is available (see below). An on-screen help system has also been devised and implemented.
- Report on the technical issues of using GNU EPrints Software for the development of an institutional e-Print repository at the University of Southampton, Gutteridge, C.J., Hitchcock, S., Simpson, P. and Hey, J., 2003, University of Southampton, Southampton, UK
DSpace
The DSpace software was originally developed as a joint venture between MIT and Hewlett-Packard. It is designed as a digital library repository system that can capture, store, index, preserve and redistribute any digital institutional outputs. The system has now been made available under an open source BSD licence and has been implemented widely. The core development team at MIT still maintains and develops the system, though a wider DSpace Community is being established to allow others to contribute. See http://www.dspace.org for further information and downloads.
DAEDALUS has used DSpace software version 1.1.1 for the creation and development of their OAI-compliant preprints, working papers and grey literature archive. This is available on open access at https://dspace.gla.ac.uk/index.jsp. The project plans to upgrade to 1.2.1.
Electronic Theses has tested DSpace version 1.1.1 to investigate its use as an ETD archive. DSpace was selected as an option due to the increasing interest in and use of this software for ETDs elsewhere (cf. the University of Edinburgh and University of Glasgow) and as an institutional repository in general. This was enhanced with Virtual Research Environment (VRE) features. Feedback from users indicated that DSpace, with its structures based on user communities and collections, was a good choice of software for making ETDs available alongside e-prints and other research materials.
A number of the partners in SHERPA are creating e-print repositories using DSpace, as follows:
Theses Alive! has used DSpace software for the creation and development of an ETD archive for the University of Edinburgh. This software is also the basis of the Edinburgh Research Archive, which, through a combination of activities in the Theses Alive! and SHERPA projects, seeks to record all the research outputs from the University of Edinburgh. The archive represents the service outcome as well as software output from the project. A report describes the installation and setup.
FAIR Enough has used DSpace to create a repository of multimedia elements. Original images, sound and video clips, and other similar materials were made available to FE practitioners for developing learning resources. uPortal was then used to deliver the learning resources to users via VLEs. The project is developing a guide for institutions wishing to develop a similar repository.
OAI Static Repositories
HaIRST has configured and implemented version 2.0 of the OAI Static Repositories specification for Napier University and Glasgow Colleges Group/John Wheatley College. This has enabled the disclosure of metadata records from these institutions and is being tested as a possible solution to facilitate this at institutions unable to maintain a full local repository. Records are formatted as XML files that can be harvested for delivery via an OAI service provider. The test files are available to view at http://hairst.cdlr.strath.ac.uk/resources.htm. A report on the experience of using this specification is a future output from the project.
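As an illustration of what harvesting such an XML file involves, the following is a minimal Python sketch that reads records out of a static repository document. It is not HaIRST's code. The namespaces follow the OAI Static Repository and OAI-PMH 2.0 specifications as commonly published; the sample record, its identifier and the function name `list_records` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath-style queries below.
NS = {
    "sr": "http://www.openarchives.org/OAI/2.0/static-repository",
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, heavily trimmed static repository file (one record only).
SAMPLE = """
<Repository xmlns="http://www.openarchives.org/OAI/2.0/static-repository"
            xmlns:oai="http://www.openarchives.org/OAI/2.0/">
  <ListRecords metadataPrefix="oai_dc">
    <oai:record>
      <oai:header>
        <oai:identifier>oai:example.org:0001</oai:identifier>
        <oai:datestamp>2004-06-01</oai:datestamp>
      </oai:header>
      <oai:metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example working paper</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </oai:metadata>
    </oai:record>
  </ListRecords>
</Repository>
"""


def list_records(xml_text):
    """Return one dict per record with its identifier, datestamp and DC fields."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.findall("sr:ListRecords/oai:record", NS):
        dc_path = "oai:metadata/oai_dc:dc/"
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "datestamp": rec.findtext("oai:header/oai:datestamp", default="", namespaces=NS),
            "titles": [e.text for e in rec.findall(dc_path + "dc:title", NS)],
            "creators": [e.text for e in rec.findall(dc_path + "dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    for record in list_records(SAMPLE):
        print(record["identifier"], record["titles"])
```

A real static repository file also carries Identify and ListMetadataFormats sections, which this sketch leaves out for brevity.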
Harvesting and Search Software
DAEDALUS is using the PKP OAI Harvester from the Public Knowledge Project at the University of British Columbia to provide access across the repositories it is implementing. The PKP OAI Harvester allows you to create a searchable index of the metadata from OAI-compliant archives and features the ability to export search results into bibliographic reference management software. See details of the software. DAEDALUS is also planning to implement Google Scholar and will report on their experience in a future output.
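To give a sense of what an OAI harvester does before any index can be built, here is a minimal Python sketch of a ListRecords harvesting loop that follows resumption tokens. It is not the PKP OAI Harvester's code. The verb and argument names come from the OAI-PMH 2.0 protocol; the base URL, the canned responses in `PAGES` and the function names are assumptions made for the example, and the `fetch` callable is passed in so that the sketch runs without network access.

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def build_request(base_url, metadata_prefix="oai_dc", token=None):
    """Build a ListRecords URL; a resumptionToken replaces the other arguments."""
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def harvest(base_url, fetch):
    """Yield (identifier, datestamp) pairs until no resumption token remains.

    `fetch` is any callable that maps a URL to the response XML as text.
    """
    token = None
    while True:
        root = ET.fromstring(fetch(build_request(base_url, token=token)))
        for header in root.iter(OAI + "header"):
            yield (header.findtext(OAI + "identifier"),
                   header.findtext(OAI + "datestamp"))
        token_el = root.find(f"{OAI}ListRecords/{OAI}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            break


def _page(identifier, datestamp, token_xml):
    """Build one canned response page for the offline demonstration."""
    return ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
            f"<record><header><identifier>{identifier}</identifier>"
            f"<datestamp>{datestamp}</datestamp></header></record>"
            f"{token_xml}</ListRecords></OAI-PMH>")


# Hypothetical two-page response set standing in for a remote repository.
PAGES = {
    "http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc":
        _page("oai:example.org:1", "2004-01-15", "<resumptionToken>page2</resumptionToken>"),
    "http://example.org/oai?verb=ListRecords&resumptionToken=page2":
        _page("oai:example.org:2", "2004-02-20", "<resumptionToken/>"),
}

if __name__ == "__main__":
    for identifier, datestamp in harvest("http://example.org/oai", PAGES.__getitem__):
        print(identifier, datestamp)
```

In practice `fetch` would be a thin wrapper around an HTTP client, and a production harvester would also handle protocol errors, deleted records and incremental harvesting by date.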
The ePrints UK service demo (version 0.3), available at http://eprints-uk.rdn.ac.uk/, is based on ARC harvester software from Old Dominion University. The service has been technically derived from the RDN service and is based on the same technologies. It comprises three components: the ARC harvester, the cheshire2 system to index the harvested metadata, and a normaliser specifically developed for the project. The project also explored the potential of using third-party web services to enhance the metadata. The system is described in the project’s final technical report.
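The division of labour between a normaliser and an indexer can be sketched in a few lines of Python. This is only an illustration of the general idea, not the ePrints UK normaliser, ARC or cheshire2: the record layout, the cleaning rules and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict


def normalise(record):
    """Return a cleaned copy: whitespace collapsed, date reduced to a year if found."""
    title = " ".join(record.get("title", "").split())
    creators = [" ".join(c.split()) for c in record.get("creator", []) if c.strip()]
    match = re.search(r"\b(\d{4})\b", record.get("date", ""))
    return {
        "id": record["id"],
        "title": title,
        "creator": creators,
        "year": match.group(1) if match else None,
    }


def build_index(records):
    """Map each lower-cased word in titles and creators to the ids containing it."""
    index = defaultdict(set)
    for rec in records:
        text = " ".join([rec["title"]] + rec["creator"])
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(rec["id"])
    return index


def search(index, query):
    """Return the ids of records that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical harvested records with the kind of inconsistencies a normaliser removes.
HARVESTED = [
    {"id": "oai:a:1", "title": "  Open   access\n repositories ",
     "creator": ["Smith,  J."], "date": "2004-03-01"},
    {"id": "oai:b:7", "title": "Harvesting metadata with OAI-PMH",
     "creator": ["Jones, R.", " "], "date": "circa 2003"},
]

if __name__ == "__main__":
    cleaned = [normalise(r) for r in HARVESTED]
    index = build_index(cleaned)
    print(search(index, "oai metadata"))
```

Normalising before indexing means that records harvested from differently configured repositories are searched on the same terms; a full-text engine would replace the simple word index used here.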
HaIRST is using the ARC harvester software to create and develop an experimental discovery service across all the repositories developed, including the static repositories. This is available at http://speirserver.cdlr.strath.ac.uk:8088/arc/hairst_search.jsp.
Museum Collections
The Accessing the Virtual Museum project used the existing Adlib collection management system in place at the Petrie Museum, which has OAI-compatibility incorporated as part of its Internet Server module. All records and associated images have been made available through the Museum OPAC and can be accessed at http://www.petrie.ucl.ac.uk/index2.html by selecting ‘Search the online catalogue’ from ‘The Petrie Museum’ menu.
The Adlib system is also used for the Fitzwilliam Museum’s internal collection management system and was the basis for the main collection created within the Harvesting the Fitzwilliam project. All records and associated images can be accessed at http://www.fitzmuseum.cam.ac.uk/opac/public/guide.htm. The coins database was maintained separately and made available for disclosure using the OAI protocol via an open source OAI-compliant PHP/MySQL database.
uPortal
uPortal is software for developing institutional portals, rather than repositories per se. It’s a free, sharable portal under development by institutions of higher education and available through the Java Architectures Special Interest Group (JA-SIG). The PORTAL project has developed a Beginners Guide to uPortal aimed at those new to installing and configuring the uPortal product. It also serves as an introduction to the main features of uPortal. The Guide is based on uPortal version 2.1.x and available at http://www.fair-portal.hull.ac.uk/deliverables.html under workpackage 10. The software itself is available from http://www.uportal.org.
The FAIR Enough project has also used uPortal, but in a very innovative way. The objective was to develop realistic technical solutions to increase the use of electronic resources within the FE institutions in the project consortium. uPortal was used to embed electronic resources within VLE environments – Teknical Virtual Campus and Moodle for comparison. The development is in its very early stages, but the project team has made enough progress with uPortal to know that it is a viable solution to the problem of delivering a wide variety of electronic resources within a VLE while balancing targeted access with an element of choice. A toolkit was developed and will be made available on the website to provide specific recommendations about how to target resources appropriately and how to make them available in ways that increase their usage.
Developing New Repository software
Though most FAIR projects used existing repository software, the BioMed Image Archive developed its own: an Apache Cocoon-based open source repository system for biomedical images. The system is designed to enable the capture, management and display of images and associated metadata. It has a built-in authentication layer using ATHENS and supports the submission, review and copyright checking of new images. It is built around a model of self-deposit into a community collection of images. However, because of the legal environment in which the project took place, with concerns over patient consent for biomedical images, this model cannot currently be used to its full capability, and collection building currently requires greater control and checks than first envisaged.