Session notes: Preservation services

Speaker: Steve Hitchcock, University of Southampton 

Repository Preservation Services: divisible, viable and sustainable?

Steve Hitchcock, project manager of the Preserv project at Southampton, opened the session by observing that although the volume of content in institutional repositories is growing, few repositories concern themselves with digital preservation. There is a general assumption that repositories need not do preservation themselves because someone else will do it for them. The Preserv project took on that responsibility and developed a prototype preservation service for institutional repositories.

Steve then explained that digital preservation is not a single activity but comprises many processes. Institutional repositories do have a role in digital preservation, and it must start with a preservation policy that addresses all aspects of repository management: content strategy (mandates!), collection policy, rights, etc. A preservation strategy will emerge from this analysis. Only with a fully formed policy can a repository know the requirements and scale of the preservation task. In addition, a preservation policy needs institutional backing to become a true institutional policy.

Preserv surveyed 20 institutional repositories and found that most do not have a policy on preservation, although some have a policy on the file formats accepted for submission. Steve introduced the PRONOM-ROAR profiling service, developed by Preserv, which can list the format identifiers of all content in the repositories registered in the Registry of Open Access Repositories (ROAR). Format identification is a useful starting point: all kinds of preservation services can be built on top of this information, such as risk assessment, which informs decisions on subsequent preservation actions. Steve concluded that the most valuable outcome of Preserv is a much better understanding of what preservation services are going to be, something that was unclear at the start of the project. We can now try to break down the black box of single preservation service provision and offer a range of specialist services based on different institutional needs.
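The step from a format profile to a risk assessment can be illustrated with a minimal Python sketch. It is not the PRONOM-ROAR service itself: the function names, the sample identifiers and the risk levels are all invented for illustration, and the identifiers merely follow the PRONOM style of format ID.

```python
from collections import Counter

# Hypothetical risk table keyed by PRONOM-style format identifier.
# The risk levels are invented for illustration; they are not taken
# from PRONOM, ROAR or the Preserv project.
HYPOTHETICAL_RISK = {"fmt/18": "low", "fmt/40": "medium"}


def format_profile(format_ids):
    """Count how many files carry each format identifier."""
    return Counter(format_ids)


def risk_report(profile, risk_table, default="unknown"):
    """Total the file counts per risk level; unlisted formats get `default`."""
    report = Counter()
    for fmt, count in profile.items():
        report[risk_table.get(fmt, default)] += count
    return dict(report)


# Made-up list of identifiers, one per file, as a profiling tool might emit.
# "x-fmt/999" is deliberately absent from the risk table.
sample_ids = ["fmt/18", "fmt/18", "fmt/40", "x-fmt/999", "fmt/18"]

profile = format_profile(sample_ids)
report = risk_report(profile, HYPOTHETICAL_RISK)

print(profile.most_common())
print(report)
```

Files whose format is missing from the table are reported as "unknown" rather than dropped, since unidentified formats are themselves a signal that a preservation decision may be needed.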

Steve also introduced Preserv2, starting in June 2007, which will build demonstrators of diverse preservation services, including preservation planning, exemplar preservation action services and metadata. It will also further develop ROAR and associated OAI services to support distributed digital preservation.

 

Speaker: Gareth Knight, AHDS

Preservation Services: the final piece in the repository jigsaw?

Gareth Knight, project manager of SherpaDP at the Arts and Humanities Data Service (AHDS), opened his presentation by introducing SherpaDP, which was funded to develop digital preservation services for e-prints institutional repositories and to identify the organisational, policy and workflow issues involved in digital preservation.

The SherpaDP model disaggregates the OAIS framework, mapping the six functional entities of an OAIS-compliant repository (ingest, archival storage, administration, data management, preservation planning and access) onto an existing structure. This required clarification of how institutional repositories could interact with the preservation service, and of their respective roles and responsibilities. Tools and processes were then developed to implement the preservation services and actions.
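One way to picture a disaggregated OAIS model is as a table assigning each functional entity to a responsible party. The Python sketch below uses the six entities named above, but the allocation between repository and preservation service is an invented example, not the split SherpaDP actually adopted, and the helper names are hypothetical.

```python
from enum import Enum


class OaisEntity(Enum):
    """The six OAIS functional entities listed in the notes."""
    INGEST = "ingest"
    ARCHIVAL_STORAGE = "archival storage"
    ADMINISTRATION = "administration"
    DATA_MANAGEMENT = "data management"
    PRESERVATION_PLANNING = "preservation planning"
    ACCESS = "access"


REPOSITORY = "institutional repository"
SERVICE = "preservation service"

# Hypothetical allocation of responsibilities, for illustration only.
ALLOCATION = {
    OaisEntity.INGEST: REPOSITORY,
    OaisEntity.ACCESS: REPOSITORY,
    OaisEntity.DATA_MANAGEMENT: REPOSITORY,
    OaisEntity.ADMINISTRATION: REPOSITORY,
    OaisEntity.ARCHIVAL_STORAGE: SERVICE,
    OaisEntity.PRESERVATION_PLANNING: SERVICE,
}


def unassigned(allocation):
    """Return the entities that no party has taken responsibility for."""
    return [entity for entity in OaisEntity if entity not in allocation]


def entities_for(party, allocation):
    """Return the names of the entities a given party is responsible for."""
    return sorted(e.value for e, owner in allocation.items() if owner == party)


print(unassigned(ALLOCATION))
print(entities_for(SERVICE, ALLOCATION))
```

Checking for unassigned entities makes the point of the exercise explicit: once functions are split between organisations, every entity still needs a clearly identified owner.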

SherpaDP has identified several benefits of using a third-party preservation service, including cross-repository support, a dedicated infrastructure, consolidation of funding and savings in staff time. It can also help with meeting the requirements of specialist agencies, for example for Trusted Digital Repository (TDR) certification.

After taking the audience through the SherpaDP workflow, which details how data is transferred back and forth between the institutional repositories and the preservation service, Gareth covered a set of minimum requirements, at technical and policy level, for preservation services. SherpaDP also identified some best practice requirements, which are concerned with the exposure, description and collection of metadata by institutional repositories. In addition, institutional repositories are encouraged to work together with the Preservation Service Provider to review and potentially revise ingest policies to ensure content is deposited in formats appropriate for preservation.

Gareth explained in detail how the SherpaDP preservation service demonstrator uses OAI (Open Archives Initiative) output from e-prints repositories to identify new submissions and obtain data. Additional OAI output was created to contain all of the information stored about the paper in the repository. In addition, a model was developed for the characterisation of repository content, based on METS, PREMIS and format-specific extensions. SherpaDP used a number of format identification and analysis tools and found that a pragmatic approach was to combine the benefits of several of them, such as DROID, PRONOM and JHOVE, as no single tool was perfect in every way.
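Identifying new submissions through OAI output can be sketched with the OAI-PMH ListIdentifiers verb and its "from" date argument. The Python below is a minimal illustration rather than SherpaDP's implementation: the base URL is a placeholder, the function names are hypothetical, and the response is a hand-written sample so the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_identifiers_url(base_url, from_date, metadata_prefix="oai_dc"):
    """Build a ListIdentifiers request for records changed since from_date."""
    query = urlencode({
        "verb": "ListIdentifiers",
        "metadataPrefix": metadata_prefix,
        "from": from_date,
    })
    return f"{base_url}?{query}"


def parse_headers(xml_text):
    """Return (identifier, datestamp) pairs, skipping deleted records."""
    root = ET.fromstring(xml_text)
    pairs = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue
        pairs.append((
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        ))
    return pairs


# Hand-written sample response; the identifiers and dates are made up.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:101</identifier>
      <datestamp>2007-05-02</datestamp>
    </header>
    <header status="deleted">
      <identifier>oai:repository.example.org:102</identifier>
      <datestamp>2007-05-03</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>
"""

# Placeholder endpoint; a real repository publishes its own base URL.
url = list_identifiers_url("https://repository.example.org/oai", "2007-05-01")
new_items = parse_headers(SAMPLE_RESPONSE)

print(url)
print(new_items)
```

A real harvester would also follow the resumptionToken that OAI-PMH uses to page through long result lists, which this sketch leaves out for brevity.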

Gareth concluded the presentation by saying that there is no out-of-the-box solution to preservation. SherpaDP gained valuable practical experience of preserving e-prints repositories. While the location of preservation services is unimportant, appropriate services must exist to support repositories in digital preservation. Cross-repository interoperability is achievable using appropriate standards and, most importantly, preservation begins on ingest!

SherpaDP2 is now funded by JISC to take the work forward. It will investigate alternative methods for connecting to digital repositories and downloading repository data, and will refine the SherpaDP set of protocols and software so that they can interact with institutional repositories running a wider range of repository software and holding a broad range of digital object types.
