e-Journals: Archiving and Preservation Briefing paper
Libraries have in the past assumed preservation responsibility for the material they collect, while publishers have supplied the material libraries need. This well-understood division of labour does not work in the digital environment, least of all for licensed e-journals.
The crucial difference between licensing access to an e-journal, as opposed to purchasing and then owning a print journal, is that, unless there are explicit and legally binding arrangements for archiving, it cannot be guaranteed that online access will continue indefinitely.
Despite these major concerns, the trend towards e-only access is continuing at a rapid rate. Academic users increasingly prefer the online versions of journals, for convenience and timeliness. Libraries and publishers are responding to that demand. Although most refereed e-journals are still published in parallel print and electronic form, a study commissioned by the British Library from Electronic Publishing Services Ltd concluded that half of all serial publications will become online only by 2016.
These trends make finding practical solutions to the well-documented challenges of preserving electronic publications a matter of urgency. While much progress has been made in addressing these challenges, a systematic approach to ensuring continued access to licensed e-journals has remained elusive.
In the UK, the National Electronic Site Licence Initiative (NESLI) licence has for a number of years included archiving clauses, which have provided a measure of reassurance for libraries. However, adherence to the clauses remains at the discretion of the publishers, and monitoring adherence remains the responsibility of libraries. The development of independent, trusted, not-for-profit e-journal archiving services provides the prospect of solutions acceptable to both publishers and libraries. A recent survey funded by the Council on Library and Information Resources (CLIR), e-Journal Archiving Metes and Bounds: a Survey of the Landscape, reviews twelve not-for-profit archiving services which meet its criteria for trusted digital repositories. Several of these services, including Portico, PubMed Central (PMC), Lots of Copies Keep Stuff Safe (LOCKSS), and Controlled LOCKSS (CLOCKSS), are already familiar to the UK.
In addition, a JISC/Consortium of Research Libraries (CURL) LOCKSS pilot is currently under way, due for completion in early 2008, and UK PubMed Central, run by a consortium led by the British Library, was launched in 2006.
The role of legal deposit
The Legal Deposit Libraries Act 2003 extended previous legislation to include electronic publications in the UK under secondary legislation. As legal or voluntary deposit will be expected to include e-journals, it is tempting to believe that this will provide a safety net for UK further and higher education institutions that wish to move to e-only journal subscriptions. The British Library is currently assessing whether it would be feasible to provide archiving and preservation services over and above those envisaged for e-legal deposit material.
While legal deposit is undoubtedly a crucially important component of the digital preservation landscape, it has limited applicability to licensed e-journal content and to the exercise of purchased rights for continuing local access by subscribers. This is primarily because it is assumed that access will be limited to on-site use at the national libraries. In addition, it would be difficult to achieve comprehensive coverage of the world’s e-journals in the absence of universal legislation which enshrines the right of national collecting institutions to have e-publications deposited with them. Finally, the concept of ‘national’ publications is becoming increasingly ambiguous in a world in which management and service delivery of publications may occur in a number of locations.
Perpetual access, archiving and long-term preservation
The terms ‘perpetual access’, ‘archiving’, and ‘long-term preservation’ are sometimes used interchangeably, but they describe different things. Perpetual access is most commonly associated with e-journal licence clauses designed to provide assurance of continued access to subscribed material in certain circumstances, including post-cancellation. Archiving describes the processes and procedures whereby e-journal content may be managed for the short or long term. Long-term preservation refers to the processes and procedures required to ensure content remains accessible well into the future, regardless of any technical or organisational changes.
The question is whether continued access in the medium to long term can be safely left to publishers or whether this is better undertaken by an independent repository. Such a repository should be capable of achieving certification status if a system of certified repositories were developed. In the meantime, it should at least operate under conditions which are sufficiently open to enable potential clients to judge its credentials for the task.
The role of institutional and open access repositories
The rapid development of institutional and open access repositories is another significant factor, and one which may appear to offer a digital preservation solution. However, two major points need to be considered before assuming that institutional or open access repositories will meet institutions' archiving needs for most of the content they require. Firstly, despite a powerful momentum, much peer-reviewed research literature still remains outside the realm of institutional and open access repositories. Secondly, the emphasis to date has been, unsurprisingly, on populating the repositories rather than preserving their content, so it cannot be safely assumed that electronic research articles deposited in institutional and open access repositories are automatically preserved for the future.
JISC-funded projects such as Securing a Hybrid Environment for Research Preservation and Access: Digital Preservation (SHERPA DP) and Preservation Eprint SERVices (PRESERV) are investigating models for archiving and preserving content in distributed institutional repositories. This research will pave the way for more coherent, coordinated preservation strategies to safeguard the valuable content being held in UK institutional repositories. This is a rapidly evolving area which justifies continued investment but should not be seen as a substitute for other services specifically designed to preserve scholarly e-journals.
Current e-journal archiving options
The NESLI Model Licence includes a post-cancellation option for the publisher to supply an archival copy of the licensed material to a central archival facility operated on behalf of the UK higher education and further education community. This would be the best approach but, until recently, invoking it has been hindered by the lack of such facilities. This situation is rapidly changing.
The CLIR report profiles twelve e-journal archiving services selected on the following criteria:
- Having explicit commitment to digital archiving for peer-reviewed scholarly e-journals
- Maintaining formal relationships with publishers, including the right to ingest and manage e-journals over time
- Being able to provide evidence that work addressing long-term accessibility is under way
- Being a not-for-profit organisation independent of publishers
- Benefiting academic libraries that have a preservation mandate
Four of these services, LOCKSS, CLOCKSS, PubMed Central, and Portico, are familiar to the UK scene and have also been at least trialled by a number of UK institutions. They all offer different technical and business model approaches and have different content coverage (though with some overlap). Several publishers are participating in one or another service. All rely on the publisher to offer current access but all can provide access either following a specific trigger event (LOCKSS, CLOCKSS and Portico), or after a ‘moving wall’ for non-open access material (PMC).
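The two access routes described above can be illustrated with a minimal sketch. It is not drawn from any of the services' own systems: the class and function names are hypothetical, and the 12-month wall used in the usage note is an assumed value chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedArticle:
    """Hypothetical record for one archived article."""
    published: date
    open_access: bool = False


def access_after_trigger(trigger_event_occurred: bool) -> bool:
    """Trigger-event route: the archive opens access only once a
    defined trigger event has occurred."""
    return trigger_event_occurred


def access_after_moving_wall(article: ArchivedArticle,
                             today: date,
                             wall_months: int) -> bool:
    """Moving-wall route: non-open-access material becomes available
    once it is at least `wall_months` old. The wall length is an
    assumption supplied by the caller."""
    if article.open_access:
        return True
    # Count whole calendar months between publication and today.
    months = ((today.year - article.published.year) * 12
              + (today.month - article.published.month))
    if today.day < article.published.day:
        months -= 1
    return months >= wall_months
```

With an assumed 12-month wall, an article published on 15 March 2006 would not be available through the moving-wall route on 14 March 2007 but would be on 15 March 2007, while an open access article would be available immediately.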
LOCKSS is an open membership organisation and allows libraries to exercise preservation responsibility for their own collections. Presentation files of material licensed by participating libraries are harvested from the publishers’ web sites with the permission of participating publishers and using open source software developed for the task. When publishers’ web sites become inaccessible, libraries have access to their own, local copy of authorised content. It is a good option for libraries wanting to exert control over their own e-journal collections, willing to invest in the modest hardware and staff costs involved, and also (if they wish to reap the benefits of being a member of the LOCKSS Alliance) willing to pay an annual membership fee.
CLOCKSS is a dark archive and stores and preserves publishers’ source files for the long term as part of a community partnership between selected libraries and publishers. CLOCKSS is a good option for libraries wishing to take on a preservation role within a consortium on behalf of a broader community.
Portico acts as a trusted third-party service provider and preserves the source files of e-journals sent by participating publishers. It provides insurance to libraries that the e-journal content they have subscribed to will be preserved for the long term. If a publisher has designated it for this purpose, Portico can also provide post-cancellation access. Portico is a good option for libraries that want to insure against potential loss without undertaking this task themselves and are willing to pay a third-party service to do so.
PubMed Central is an open access archive of research articles and other journal content in the biomedical and life sciences. In the UK, the British Library is leading a partnership to run UK PubMed Central, which is committed to ensuring that those articles from the medical and life sciences deposited in PMC will be safeguarded.
Certification
In recent years, much effort has gone into developing a consensus on criteria which might be used to assess those archives which reach a sufficiently high standard to qualify for formal certification as trusted digital repositories.
A particularly comprehensive set of metrics has been developed by the Research Libraries Group (RLG)/National Archives and Records Administration (NARA) Task Force. The Draft Audit Checklist was released in August 2005, and feedback from the international community has been strongly encouraged to help refine the final checklist. The document is also undergoing practical testing on four archiving repositories, the results of which are expected in 2007. In addition, a working group has recently been formed within the Consultative Committee for Space Data Systems (CCSDS) to develop an international standard against which repositories may be certified and which will take into account other related work, including the RLG/NARA checklist.
At this early stage, it is important to be able to articulate the practices and procedures that characterise trusted archives. In the absence of formal certification, it is still useful for both archives and potential users of their services to have a broadly accepted set of standards to which all services would be expected to adhere. This is especially important where there is a diverse and potentially bewildering range of options. Certification criteria, whether formally or informally administered, have the potential to provide a valuable tool to help libraries select the service which best meets their needs and to help the services themselves achieve an internationally acknowledged standard.
For the foreseeable future, there needs to be a range of options for ensuring long-term preservation of, and access to, the world’s scholarly e-journals. No single service can hope to cover the full range of titles comprehensively; moreover, at this early stage it would be unwise to try to select a single definitive approach to e-journal archiving which, in any case, is unlikely ever to emerge. Pursuing different approaches designed to satisfy different needs is an entirely valid strategy.
A distributed but coordinated system of e-journal archiving, with judicious overlap and duplication, seems most likely to offer the best way forward.
Conclusion
There are promising developments which make it simpler to make an informed choice than was the case when JISC commissioned a consultancy on e-journal archiving in 2003. Independent, trusted e-journal archiving services are emerging and those which offer the most promise for the UK environment should be supported. They are still at an early stage and the community needs to invest in different options in order to retain a vigorous and healthy diversity of services most likely to be able to accommodate the range of requirements.
The landscape is still rapidly evolving and it will be necessary to monitor developments closely and to encourage and facilitate regular communication between the three principal parties – libraries, publishers, and archiving services.
Further information and resources
This paper has been written by Maggie Jones, Digital Preservation Consultant, on behalf of JISC