Interview - Robert Kiley, Head of Systems Strategy at the Wellcome Library
The Medical Journals Backfiles Digitisation Project, worth £1.25
million, will digitise around 1.7 million pages of complete back files.
It is jointly funded by JISC and the Wellcome Trust, who are working
with the US-based National Library of Medicine. The digitised
content will be made freely available on the internet via PubMed Central. We interview
Robert Kiley, Head of Systems Strategy at the Wellcome Library, who
is managing the project from the Wellcome side.
JISC: Could you provide me with an overview of the project
itself?
RK: The project aims to digitise a number of historically
significant medical journals. It came out of a consultancy study
we did with HEDS. We looked at a number of
collections held in the Wellcome Library – archives, images, printed
books etc – to try to determine what project would benefit most
people. Digitising medical journals was identified as a project that would
meet the needs of a significant number of users.
The aim of the project is to identify around 15 journals - which we
consider historically significant - and digitise them in their entirety and
make them freely available through PubMed Central.
It isn’t just the archive, however, that we intend to make freely available
– but also current and future issues published by participating publishers.
In essence, Wellcome and JISC agree to fund the backfile conversion and in
return the publishers (as a condition of participation) have to deposit
their current issues into the PubMed Central archive. Research
articles deposited within PubMed Central must be made freely available
within 12 months of publication, whilst all other content, such as editorials,
letters, or reviews, must be made available within 3 years.
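The deposit terms described above amount to a simple rule, which can be sketched in Python. This is only an illustration: the content-type labels, the function names, and the choice to read "12 months" and "3 years" as calendar offsets from the publication date are assumptions, not part of PubMed Central or the project agreement.

```python
import calendar
from datetime import date

# Embargo limits in months, taken from the terms described in the interview.
# The "research" label is a hypothetical content-type name.
EMBARGO_MONTHS = {"research": 12}
DEFAULT_EMBARGO_MONTHS = 36  # editorials, letters, reviews and other content


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (e.g. 29 Feb -> 28 Feb)."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def latest_free_date(published: date, content_type: str) -> date:
    """Latest date by which an item must be freely available."""
    months = EMBARGO_MONTHS.get(content_type, DEFAULT_EMBARGO_MONTHS)
    return add_months(published, months)
```

For example, `latest_free_date(date(2005, 3, 15), "research")` gives 15 March 2006, while the same call with `"editorial"` gives 15 March 2008.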
JISC: Who do you feel will benefit from using the material once it
has been archived?
RK: The research and the clinical communities within the
UK and overseas, and medical historians are the key audiences. For
example, if you want to understand today’s MMR autism scare, you have to
look back to the medical literature of the 1940s and 1950s to really
understand the background to the issues. Most of that material is
just not available online. This project is one way of facilitating
access to these backfiles.
JISC: You mentioned MMR, do you think there is something in this
project for the public at large, or do you think it is too
specific?
RK: Everything we digitise will be made freely available
online – and I hope that the public will make use of this
archive.
JISC: Can you explain the digitisation process?
RK: We are taking the archives of journals, such as the
Journal of Physiology and the Biochemical Journal, and
are going to scan every single page. Once scanned, each page is
subjected to optical character recognition (OCR), which allows
full-text indexing of every word in the archive.
For every discrete article (such as a research paper, editorial, letter,
etc.), we will also create an XML citation, which will be added to PubMed
Medline. As a consequence, anyone will be able to log onto Medline
(the preferred search tool for health professionals) and find an article
from the archive – even if the article dates from the 19th
century. From the citation, the user will be able to link dynamically
to the full text.
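To give a rough idea of what an XML citation with a link to the full text might look like, here is a minimal Python sketch. The element names, the function name, and the sample values are all hypothetical; the schema the NLM actually uses is not described in this interview.

```python
import xml.etree.ElementTree as ET


def build_citation(journal, year, volume, first_page, title, full_text_url):
    """Build a simple citation record as an XML string.

    The element names are illustrative only and do not follow
    any real NLM or PubMed schema.
    """
    citation = ET.Element("citation")
    ET.SubElement(citation, "journal").text = journal
    ET.SubElement(citation, "year").text = str(year)
    ET.SubElement(citation, "volume").text = str(volume)
    ET.SubElement(citation, "first-page").text = str(first_page)
    ET.SubElement(citation, "article-title").text = title
    # The link from the citation to the scanned full text.
    ET.SubElement(citation, "full-text", {"href": full_text_url})
    return ET.tostring(citation, encoding="unicode")


# Placeholder values, not a real article.
record = build_citation(
    journal="Example Journal",
    year=1900,
    volume=1,
    first_page=1,
    title="Example article title",
    full_text_url="https://example.org/fulltext/1",
)
print(record)
```

A search service holding records like this can show the citation details and follow the `href` to the scanned article.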
JISC: Could you tell me a bit about your collaboration with the
partners in the US and what motivated this?
RK: This is a joint project funded by JISC and The
Wellcome Trust, working in collaboration with the US National Library of
Medicine (NLM). The NLM are managing the digitisation process and
will undertake the quality assurance processes on the archives, to make
sure all pages are there and of suitable quality. The NLM are also
responsible for hosting the archive – though in time, it may be possible to
mirror this data to a European PubMed Central node.
The NLM were the obvious partner as they were already digitising back files
and had a product (PubMed Central) already online. One of the findings
from the HEDS study was that the successful digitisation projects tend to
be those that have a critical mass of digital surrogates. Little “digital
islands” of data do not get used.
JISC: You mentioned quality assurance – can you tell me a bit more
about this?
RK: When the contractor returns the scanned paper archive,
the NLM run a series of automated checks. These pick up obvious
problems, such as missing pages, or pages out of order. In addition,
however, the NLM also do a manual 10% sampling check – where the
returned PDF is compared with the original published journal. This is
labour intensive, but it does help to ensure that the archive is of high
quality – something all three partners are keen to see.
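The two kinds of check described above can be sketched in Python. This is a sketch only: the function names are hypothetical, pages are assumed to be identified by integer page numbers, and the 10% sample is assumed to be drawn at random from a list of returned items. The NLM's actual tooling is not described in this interview.

```python
import random


def check_page_sequence(scanned_pages, first_page, last_page):
    """Automated check: report missing pages and pages that appear
    out of order in the scanned sequence."""
    expected = set(range(first_page, last_page + 1))
    missing = sorted(expected - set(scanned_pages))
    # A page is flagged when it is lower than the page scanned just before it.
    out_of_order = [
        current
        for previous, current in zip(scanned_pages, scanned_pages[1:])
        if current < previous
    ]
    return missing, out_of_order


def sample_for_manual_review(items, fraction=0.10, seed=None):
    """Pick a random sample (10% by default) for manual comparison
    against the original printed journal."""
    if not items:
        return []
    rng = random.Random(seed)
    sample_size = max(1, round(len(items) * fraction))
    return sorted(rng.sample(list(items), sample_size))
```

For instance, `check_page_sequence([1, 2, 4, 3, 6], 1, 6)` reports page 5 as missing and page 3 as out of order.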
JISC: Are the NLM taking responsibility for the physical
archiving?
RK: Yes, but remember, there are two elements to this
project - a paper one and a digital one.
With regard to the paper archive, we look to the publisher to provide
this. Because of the way journals are digitised – issues are
de-spined – we ask the publisher for a disposable copy. Once the
publisher has supplied the archive, the NLM produce an inventory and check
for completeness. They also put together a style sheet to indicate how the
journal should be scanned, and how the XML should be marked up. Once
all this information has been prepared the archives are shipped for
scanning.
At the same time as the paper archive is being prepared, publishers are asked to
send sample digital files to the NLM for evaluation. The purpose of
this is to ensure that digital files can be added to the archive – with
little (or no) human intervention.
JISC: Technology is moving forward very fast. What is your
take on that?
RK: The whole concept behind PubMed Central is that it is
a long-term archive. Indeed, PubMed Central's approach underlies the
NLM's basic archiving philosophy. The XML is the digital
archival copy of record. By creating the online view directly from the
XML, the NLM are ensuring that they have an accurate archival record: what
you see is what the publisher has archived.
JISC: Can we move onto the open access issues?
RK: The Wellcome Trust has published two reports related
to the Open Access debate. Our view is that the current publisher model
does not work in the interest of researchers, libraries, or the
public. To try to remedy this, Trust-funded researchers are encouraged
to publish in open access journals. Additional funding is made
available to cover the author costs associated with this new business
model.
Hopefully, over the next few months, the Wellcome will further develop its
OA policy. Encouraging researchers to move to the OA model is one thing,
but I suspect that we need to be more pro-active. We are the UK's
biggest funder of medical research, spending over £400 million per year.
With this level of spend we can help to influence change.
It is interesting to note that the National Institutes of Health have
drafted a consultation paper, which, if implemented, would require NIH
grantees to deposit their research papers in PubMed Central. Such
papers would then be freely available within six months.
The Trust recognises that dissemination of research is part of our mission.
The results of the Human Genome project (of which the Trust was a major funder) are
made publicly available via the Internet. We recognise that more and
better research would result by making the outcomes of the Genome project
freely available. This approach now needs to be applied to research
papers.
JISC: What benefits are there for the Wellcome Trust to be working
with JISC?
RK: In terms of the Medical Journals Project, JISC has
been an instrumental player in bringing this project into reality.
JISC committed its funding contribution early on in the project
negotiations – and this support was used to lever additional funding from
the Wellcome.
More generally, JISC’s positive stance on OA means that we share the same
philosophy, in terms of making research freely available to all.
JISC: What key landmarks will there be in the project?
RK: To date we have secured the agreement of about seven
or eight titles, from a mixture of publishers. A number of these have
already been shipped for scanning – and I anticipate that a couple of
titles (probably the Biochemical Journal and Medical
History) will be available online by Spring 2005. The other
titles will be made available over the following two years.
JISC: Where can I find more information about the project?
RK: There is, of course, a website for further information,
Wellcome Library:
Medical journals backfiles digitisation project, where you can
also get an up-to-date list of journals that have agreed to participate in
this project.