FAIR Synthesis: Metadata
Metadata issues are fundamental to the design and functioning of an e-print
repository and the provision of access to information in general.
This section explores some of the issues involved and highlights
achievements of FAIR projects in this area.
The SHERPA project has also produced a general resource for e-print
repositories, ‘An introduction to metadata
requirements for an e-print repository’, which gives an overview of
the issues involved.
Quality and QA of metadata
Metadata quality is a key issue in developing repositories. The
quality of metadata obviously has implications for resource discovery and
user satisfaction. It also has implications for institutional ‘buy
in’, as an institutional repository is a very visible service.
Several FAIR projects are investigating issues related to metadata quality,
and the articles below reflect their findings:
- Guy, M., Powell, A. and Day, M., Improving the quality of metadata
  in Eprint archives, Ariadne, 2004, Issue 38 (ePrints UK)
- Barton, J., Currier, S. and Hey, J.M.N., Building Quality Assurance into
  Metadata Creation: an Analysis based on the Learning Objects and
  e-Prints Communities of Practice. In: Proceedings DC-2003 (2003 Dublin
  Core Conference), Supporting Communities of Discourse and Practice -
  Metadata Research and Applications, 28th Sept-2nd Oct 2003, Seattle,
  Washington, USA (TARDis)
Descriptive metadata
Descriptive metadata is used for indexing, discovery, and identification of
items in repositories. OAI-PMH specifies unqualified Dublin Core as a
basic requirement. Dublin Core is simple and flexible, but it wasn’t
developed with repositories in mind. It needs to be used in a consistent
way to enable searching and browsing across repositories. FAIR
projects have developed guidelines to enable Dublin Core to be used
consistently for different types of repositories.
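
As a rough illustration of why consistent use matters to a harvester, the sketch below reads an unqualified Dublin Core (oai_dc) record and reports which fields are absent. The two namespace URIs are the standard ones; the sample record, the function names, and the list of "required" fields are assumptions made for this example and are not taken from any FAIR project guideline.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard namespace for the Dublin Core element set used by oai_dc.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record, for illustration only.
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>  An example e-print  </dc:title>
  <dc:creator>Smith, J.</dc:creator>
  <dc:creator>Jones, A.</dc:creator>
  <dc:type>Preprint</dc:type>
  <dc:date>2004-01-15</dc:date>
</oai_dc:dc>
"""


def read_dc_fields(xml_text):
    """Return a dict mapping each DC element name to its list of values."""
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)
    for child in root:
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            # Collapse runs of whitespace so values compare consistently.
            value = " ".join((child.text or "").split())
            if value:
                fields[name].append(value)
    return dict(fields)


def missing_fields(fields, required=("title", "creator", "date", "identifier")):
    """List the required element names absent from a record.

    The default 'required' tuple is a hypothetical local policy.
    """
    return [name for name in required if name not in fields]


record = read_dc_fields(SAMPLE_RECORD)
print(record["creator"])       # ['Smith, J.', 'Jones, A.']
print(missing_fields(record))  # ['identifier']
```

Every Dublin Core element is optional and repeatable, so a check like this only becomes meaningful once a community has agreed which elements its repositories will always supply.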
ePrints UK developed a useful guide to using Dublin Core for e-prints,
so that the metadata they harvested would be consistent. During the
project they explored various issues associated with using Dublin Core in a
consistent way, e.g. how to encode full text links so they point to the
correct document.
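
The full text link problem can be seen in a small sketch. A record may carry several identifier values, and a harvester has to decide which one, if any, leads to the document itself. The heuristic below (treating URLs that end in a document file extension as full text) is an assumption for illustration only and is not the approach recommended in the ePrints UK guide; the URLs and the extension list are invented.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical list of extensions taken to indicate a document file.
DOCUMENT_EXTENSIONS = {".pdf", ".ps", ".doc"}


def classify_identifier(value):
    """Guess what a dc:identifier value points to.

    Returns 'full_text', 'landing_page', or 'other' (not an HTTP URL).
    """
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https"):
        return "other"
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension in DOCUMENT_EXTENSIONS:
        return "full_text"
    return "landing_page"


identifiers = [
    "http://eprints.example.org/123/",
    "http://eprints.example.org/123/01/paper.pdf",
    "oai:eprints.example.org:123",
]
for value in identifiers:
    print(value, "->", classify_identifier(value))
```

A guess of this kind is fragile, which is the case for agreeing in advance how each kind of link should be encoded.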
Similarly, in the area of electronic theses and dissertations (ETDs), FAIR
projects have worked together to develop guidelines for use of Dublin
Core. Electronic Theses, Theses Alive!, and DAEDALUS have
collaborated to develop a UK Metadata Core Set for ETDs.
Dublin Core was developed for bibliographic and other print materials and
doesn’t adequately describe images and museum objects. Accessing the
Virtual Museum, BioMed Image Archive, Harvesting the Fitzwilliam, and
Hybrid Archives explored this and the implications for discovery in a metadata issues
paper.
Subject categorisation and vocabularies
Using subject categories or controlled vocabularies in conjunction with
metadata has the potential to improve resource discovery. TARDis has
been a leader in moving forward issues related to subject categorisation.
At the 2nd Workshop on the Open Archives Initiative (OAI): Gaining
Independence with ePrints Archives and OAI, held 17-19 October 2003 at CERN in
Geneva, a forum for discussion of 'subject' issues was thought to
be an important next step.  Southampton is introducing this on
the oai-eprints mailing
list, which was created as a result of the workshop.  A series of discussion points on the
subject categorisation of e-print archives is also available.
Accessing the Virtual Museum has developed a specialist Egyptology
thesaurus to support and describe the records of museum objects
created. This comprises four separate vocabularies covering object
names, place names, dates (and mechanisms for describing these), and
material types. Details of these vocabularies are available from
project staff.  They can also be seen in action by going to the Petrie Museum search
page and selecting ‘Search the online catalogue’ from ‘The Petrie
Museum’ menu.  The vocabularies can be browsed to select a search
term.
As part of its work, the PORTAL project investigated how external resources
should be surfaced within an institutional portal.  A key element in
presenting these resources is how they are described according to their
subject area.  A discussion paper has been made available to raise
some of the issues involved and to encourage ongoing discussion of them.
Preservation metadata
Hybrid Archives has developed a new model for the preservation of
datasets. The model allows for data to be deposited at the AHDS
through harvesting via OAI-PMH, but for the content to also be held by the
data owner who then provides access to it. This differs from current
practice where data preservation traditionally involves handing over the
entire dataset to the AHDS or similar body, who then preserves it and also
provides access to it. For further information about the hybrid
model, see the section on repository
models.  The model will be supported by reports on preservation
requirements and preservation metadata, to be posted on the project web
site in summer 2005.
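
To make the harvesting step concrete, the following minimal sketch builds the kind of OAI-PMH ListRecords request a preservation service could send to a data owner's repository. The verb and argument names (metadataPrefix, from, resumptionToken) come from the OAI-PMH protocol; the base URL, the function name, and the choice of oai_dc as the metadata format are assumptions for this example and do not describe the actual AHDS set-up.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    supplied, no argument other than the verb may accompany it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Only harvest records added or changed since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical data owner endpoint.
url = list_records_url("http://data.example.org/oai", from_date="2005-01-01")
print(url)
# http://data.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2005-01-01
```

Passing a date to the from argument lets the harvesting body pick up only what has changed since its last visit, while the content itself stays with the data owner.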
Rights metadata
A key objective of the RoMEO project was to develop a solution for
protecting the IPR of e-prints in an OAI environment.  The project first
surveyed academic authors and data and service providers about the rights
they wished to protect. The rights solution involved developing
simple rights metadata by which authors could describe the rights status of
their e-prints, and a means by which OAI data providers and service
providers might assert the rights status of their metadata under
OAI-PMH. This can be done using Creative Commons licenses. The
work of RoMEO influenced and fed into the formation of the OAI-rights
Technical Working Group in the US, which seeks to extend the findings from
RoMEO as a generic solution when using OAI (as opposed to just for
e-prints).  Draft
implementation guidelines have been produced and are available for viewing
and comment.
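
As an illustration of how a service provider could act on a rights statement, the sketch below recognises a Creative Commons license URI and splits it into its terms. It assumes the license is given as a plain URI in a rights field, which is a simplification for this example and not the RoMEO or OAI-rights encoding; the function name and the returned structure are hypothetical.

```python
import re

# Matches URIs such as http://creativecommons.org/licenses/by-nc-nd/2.0/
# with an optional jurisdiction segment (for example /uk/).
CC_LICENSE = re.compile(
    r"^https?://creativecommons\.org/licenses/"
    r"([a-z-]+)/(\d+\.\d+)(?:/([a-z]+))?/?$"
)


def parse_cc_license(rights_value):
    """Return the license terms, version and jurisdiction, or None."""
    match = CC_LICENSE.match(rights_value.strip())
    if match is None:
        return None
    code, version, jurisdiction = match.groups()
    return {
        "terms": code.split("-"),      # e.g. ['by', 'nc', 'nd']
        "version": version,
        "jurisdiction": jurisdiction,  # None for unported licenses
    }


print(parse_cc_license("http://creativecommons.org/licenses/by-nc-nd/2.0/"))
print(parse_cc_license("All rights reserved"))  # None
```

Free-text statements such as the second example cannot be interpreted by software, which is one reason for expressing rights in a structured, shared form.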
User metadata
The PORTAL project undertook extensive studies to specify the requirements
for institutional portals across UK institutions. This included a
report detailing the available metadata standards for the description
of users within an institutional portal environment.