Metadata for Digital Libraries: State of the Art & Future Directions

Report by Richard Gartner
April 2008

Executive Summary

Now that digitisation technology is well established in library operations, the need for some standardisation of metadata practices has become more acute, so that digital libraries can achieve the interoperability long established in traditional libraries. The complex metadata requirements of digital objects, which include descriptive, administrative and structural metadata, have so far militated against the emergence of a single standard. However, a set of existing standards, all based on XML architectures, can be combined to produce a coherent, integrated metadata strategy.

An overall framework for a digital object's metadata can be provided by either METS or DIDL, although the wider acceptance of the former within the library community makes it the preferred choice. Descriptive metadata can be handled by either Dublin Core or the more sophisticated MODS standard. Technical metadata, which is contingent on the type of files that make up a digital object, is covered by such standards as MIX (still images), AUDIOMD (audio files), VIDEOMD or PBCORE (video) and TEI Headers (texts). Rights management may be handled by the METS Rights schema or by more complex schemes such as XrML or ODRL. Preservation metadata is best handled by the four schemas that make up the PREMIS standard.
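The layering described above can be illustrated with a minimal sketch in Python, assuming METS as the wrapper, MODS for the descriptive section and MIX for the technical metadata of a still image. The function name `build_mets`, the section IDs and the sample values are hypothetical, and the MIX namespace shown assumes version 2.0 of that schema. The output is not validated against any schema and omits most of what a real record would carry.

```python
import xml.etree.ElementTree as ET

# Namespace URIs; the MIX URI assumes MIX 2.0.
METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
MIX = "http://www.loc.gov/mix/v20"

for prefix, uri in (("mets", METS), ("mods", MODS), ("mix", MIX)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}local form."""
    return f"{{{ns}}}{tag}"


def build_mets(title, image_width, image_height):
    """Build a skeletal METS record (hypothetical helper, illustration only)."""
    root = ET.Element(q(METS, "mets"))

    # Descriptive metadata: a MODS record wrapped in a METS dmdSec.
    dmd = ET.SubElement(root, q(METS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="MODS")
    data = ET.SubElement(wrap, q(METS, "xmlData"))
    mods = ET.SubElement(data, q(MODS, "mods"))
    title_info = ET.SubElement(mods, q(MODS, "titleInfo"))
    ET.SubElement(title_info, q(MODS, "title")).text = title

    # Technical metadata: a MIX record wrapped in a METS techMD.
    amd = ET.SubElement(root, q(METS, "amdSec"), ID="AMD1")
    tech = ET.SubElement(amd, q(METS, "techMD"), ID="TECH1")
    wrap = ET.SubElement(tech, q(METS, "mdWrap"), MDTYPE="NISOIMG")
    data = ET.SubElement(wrap, q(METS, "xmlData"))
    mix = ET.SubElement(data, q(MIX, "mix"))
    info = ET.SubElement(mix, q(MIX, "BasicImageInformation"))
    chars = ET.SubElement(info, q(MIX, "BasicImageCharacteristics"))
    ET.SubElement(chars, q(MIX, "imageWidth")).text = str(image_width)
    ET.SubElement(chars, q(MIX, "imageHeight")).text = str(image_height)

    # Structural metadata: one division pointing at both sections.
    struct = ET.SubElement(root, q(METS, "structMap"))
    ET.SubElement(struct, q(METS, "div"), DMDID="DMD1", ADMID="AMD1")
    return root


if __name__ == "__main__":
    record = build_mets("Sample digitised page", 1200, 800)
    print(ET.tostring(record, encoding="unicode"))
```

Each component standard sits in its own namespace inside a METS wrapper element, so any one of them could in principle be swapped (Dublin Core for MODS, for instance) without disturbing the rest of the record.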

Integrating these standards using the XML namespace mechanism is technically straightforward, although problems can arise when the same namespace is declared with different URIs, or when overlapping schemas duplicate elements and so introduce redundancy. These problems are best resolved through best practice guidelines, several of which are currently being drafted.
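The namespace URI problem mentioned above can be shown with a short Python sketch. It assumes a record in which Dublin Core has been declared under two URIs: the standard one and a hypothetical variant that lacks the trailing slash. The names `DC_VARIANT`, `titles` and `normalise_namespaces` are illustrative only. XML parsers compare namespace URIs as plain strings, so the two declarations are treated as unrelated vocabularies until they are mapped onto one another.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical variant: the same vocabulary declared without the trailing slash.
DC_VARIANT = "http://purl.org/dc/elements/1.1"

RECORD = f"""<record xmlns:dc="{DC}" xmlns:dcx="{DC_VARIANT}">
  <dc:title>First title</dc:title>
  <dcx:title>Second title</dcx:title>
</record>"""


def titles(root):
    """Collect the text of every title element in the standard DC namespace."""
    return [el.text for el in root.iter(f"{{{DC}}}title")]


def normalise_namespaces(root, aliases):
    """Rewrite element tags in place so that alias URIs map to preferred ones."""
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            uri, local = el.tag[1:].split("}", 1)
            if uri in aliases:
                el.tag = f"{{{aliases[uri]}}}{local}"
    return root


if __name__ == "__main__":
    root = ET.fromstring(RECORD)
    print(titles(root))  # only the element in the standard namespace matches
    normalise_namespaces(root, {DC_VARIANT: DC})
    print(titles(root))  # both elements match after normalisation
```

A mapping table of this kind is one way a best practice guideline might be put into effect at ingest time. It handles only element names; attributes and duplicated content across schemas would need separate treatment.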

The next ten years are likely to see further metadata integration, probably with the consolidation of these multiple standards into a single schema. The digital library community will also work towards firmer standards for metadata content (analogous to AACR2), and software developers will increasingly adopt these standards. The digital library user will benefit from enhanced federated searching and consolidated digital collections. The same developments are likely to take place in the archives and museums sectors, although the different metadata traditions of those sectors are likely to give them a somewhat different form.

The combination of a shared XML platform and a proven record in major projects makes these standards the best strategic choices for digital libraries. Although their adoption in integrated environments is still at a relatively early stage, particularly amongst software developers, increasing community-wide use will make digital collections easier to produce by freeing resources from metadata creation for object creation, and will facilitate the adoption of service-oriented approaches to core infrastructures. The adoption of integrated metadata strategies should be pressed for at the highest managerial levels.
