This chapter looks at the elements of good practice in managing metadata, covering topics such as the use of recognised standards, which supports interoperability.
As has been outlined previously, a metadata schema is a set of metadata elements that have been grouped together for a specific purpose. Metadata schemas often emerge from a single community group (for example the library community) or can be developed to describe a specific format (a digital audio file for example). Sometimes metadata schemas go through a formal validation process by a standards organisation, such as the International Organization for Standardization (ISO) or an equivalent body such as the Dublin Core Metadata Initiative (DCMI), in which case they become official metadata standards.
It is also true, however, that not all metadata schemas that are thought of as 'standard' have gone through the validation process of a recognised standards-setting body. Indeed, 'de-facto' standards are more common than official standards and are no less applicable in most cases. It is useful therefore to think of metadata standards not only as those officially recognised, but also as commonly used and consistently applied metadata schemas that are measurable, well documented, and endorsed by a recognisable community or organisation.
There are some useful points to consider when researching an appropriate set of metadata elements, namely:
- Where there are clear and obvious standards for your resource type, community, or task, make use of them
- Where the standards are unclear and competing, follow models of ‘good practice’ within your community
- Where you can find no appropriate standards, adapt an existing standard to better fit your needs and document the changes you make very carefully, using the documentation methods and mappings deployed by existing standards as a guide
The benefits of using a metadata standard
Theoretically it is possible to develop a metadata schema from scratch, and when working on a resource or in a context where there are genuinely no suitable candidate standards it may be necessary to do so. However, it is generally preferable to use or adapt existing standards. Using an existing standard can offer:
- Cost saving – the schema and its usage guidelines have already been developed, saving time and effort
- Access to help and advice – a standard is likely to have a community of users, which means there will be access to help and advice about how best to use the standard
- Usability – users are likely to be familiar with a standard and its terminology, thus they can more quickly and easily use your collection
- Resource discovery – using a standard means the collection can more easily be opened up to be searched and shared with others
- Sustainability – use of common standards will make it easier to pass your collection on to someone else to look after if you ever need to
The metadata application profile
There are times when there is genuinely no suitable metadata standard available for a given context or 'application'. An example of this may be where metadata elements from more than one metadata standard are needed to adequately describe a resource.
In such cases it is desirable to develop a new 'application profile', where various metadata elements (together with their guidelines and documentation) are taken from different metadata standards and combined. This approach to developing new metadata applications is more desirable than starting from scratch because:
- Application profiles are drawn from recognised standards and are therefore compliant with best practice
- Application profiles are well documented
- Application profiles can be re-used by others if necessary
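As a rough illustration only, an application profile can be modelled as a list of elements, each recording the standard it was borrowed from along with its usage guidance. In the sketch below, the `ProfileElement` structure, the choice of elements and the `missing_required` helper are all assumptions made for the example, and "ExampleImageStandard" is a made-up placeholder rather than a real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileElement:
    name: str       # element name as used in the source standard
    source: str     # standard the element is borrowed from
    required: bool  # whether every record must supply a value
    guidance: str   # usage note carried over from the source standard


# Hypothetical profile mixing elements from two standards.
# "ExampleImageStandard" is a placeholder, not a real schema.
PROFILE = [
    ProfileElement("title", "Dublin Core", True, "Name given to the resource"),
    ProfileElement("creator", "Dublin Core", True, "Entity responsible for making the resource"),
    ProfileElement("imageWidth", "ExampleImageStandard", False, "Width in pixels"),
]


def missing_required(record):
    """Return the names of required profile elements that a record lacks."""
    return [e.name for e in PROFILE if e.required and not record.get(e.name)]


record = {"title": "Mona Lisa", "imageWidth": "4000"}
print(missing_required(record))  # ['creator']
```

Keeping the source standard and guidance alongside each element is one way of making the profile self-documenting, so that others can see where each element came from before re-using it.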
Sharing metadata: also known as interoperability
Interoperability – the ability of one digital resource to work seamlessly alongside other digital resources – is one key aim, not only of the culture, heritage and education sectors, but of the internet as a whole. The concept of interoperability underpins all notions of a joined-up 'web of data' or semantic web as envisaged by organisations like the World Wide Web Consortium (W3C) and people like Tim Berners-Lee.
Interoperability depends either on the strict use of common standards or on understanding how non-standard metadata can be mapped or transformed to common standards. Therefore, when developing a metadata schema for a digital resource, in addition to thinking about the collection itself and its specific needs, it is also pertinent to think about how this metadata may interoperate with other analogous collections. This, in turn, is likely to influence the choice and usage of metadata standards.
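The idea of mapping non-standard metadata to a common standard can be sketched as a simple 'crosswalk' table. The local field names (`artist`, `name_of_work`, `year_made`, `shelf`) and the `to_common` function below are hypothetical; the target names `creator`, `title` and `date` are Dublin Core elements. This is a minimal sketch, not a complete transformation tool.

```python
# Hypothetical crosswalk: local field name -> Dublin Core element name.
CROSSWALK = {
    "artist": "creator",
    "name_of_work": "title",
    "year_made": "date",
}


def to_common(record, crosswalk):
    """Split a local record into mapped fields and fields with no mapping."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            # No equivalent in the common standard: keep it aside for review.
            unmapped[field] = value
        else:
            mapped[target] = value
    return mapped, unmapped


local = {"artist": "Leonardo da Vinci", "name_of_work": "Mona Lisa", "shelf": "B12"}
mapped, unmapped = to_common(local, CROSSWALK)
print(mapped)
print(unmapped)
```

Returning the unmapped fields separately, rather than silently dropping them, makes it easier to see what information would be lost when the metadata is shared.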
Metadata in XML format
While metadata is often stored for the long term in a file, in a database, or both, XML (Extensible Markup Language) is commonly used as a means of transporting, importing, exporting and sharing metadata from and between these systems. XML is related to HTML (the original coding used on the World Wide Web). However, while HTML tags are primarily focused on presentation (eg <b>bold</b>) or document structure (eg <header></header>), XML tags are used to indicate meaning (eg <organisationName>Jisc</organisationName>). This approach lends itself well to expressing metadata, enabling metadata schema categories to be turned into tags and wrapped around specific terms, as the simplified example below shows.
<creator>Leonardo da Vinci</creator>
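To show how such tags might be produced and read by software, the sketch below uses Python's standard `xml.etree.ElementTree` module to export a record as XML and import it again. The `record` wrapper element and the function names are assumptions made for the example; a real schema would define its own root element and rules.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record, root_tag="record"):
    """Wrap each metadata value in a tag named after its element."""
    root = ET.Element(root_tag)
    for name, value in record.items():
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Read the child tags of the root element back into a dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


xml_text = record_to_xml({"title": "Mona Lisa", "creator": "Leonardo da Vinci"})
print(xml_text)
print(xml_to_record(xml_text))
```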
Metadata in RDF/XML format
Another, increasingly common, method of sharing metadata between systems is to use RDF (the Resource Description Framework). RDF is a model for describing any resource, whether digital or analogue, in any format or media type.
RDF defines a 'resource' as any object that can be identified uniquely using a URI (Uniform Resource Identifier). An RDF statement is made up of three separate parts: the Resource (which in RDF terms is called the Subject), the Property (known as the Predicate), and the value of that property (known as the Object, which may be a literal value or another resource). The Mona Lisa example used earlier would therefore have the following parts:
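- Subject: the Mona Lisa (the resource being described, identified by a URI)
- Predicate: Creator
- Object: Leonardo da Vinci
And the RDF statement would look something like:
<s:Creator>Leonardo da Vinci</s:Creator>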
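The line above is only the property part of a statement; in RDF/XML it would normally sit inside a description of the subject. The following sketch builds such a wrapper with Python's standard library. The subject URI and the namespace bound to the `s` prefix are hypothetical `example.org` placeholders, since the source does not specify them; only the `rdf` namespace is the one published by the W3C.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # W3C RDF namespace
S_NS = "http://example.org/schema#"  # hypothetical namespace for the "s" prefix

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("s", S_NS)


def statement_to_rdfxml(subject_uri, predicate, literal):
    """Serialise one subject/predicate/object statement as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject_uri}
    )
    prop = ET.SubElement(description, f"{{{S_NS}}}{predicate}")
    prop.text = literal
    return ET.tostring(root, encoding="unicode")


rdf_xml = statement_to_rdfxml(
    "http://example.org/works/mona-lisa",  # hypothetical subject URI
    "Creator",
    "Leonardo da Vinci",
)
print(rdf_xml)
```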
RDF is different from typical metadata schemas in that it is able to remove ambiguity over the meaning or 'semantics' associated with different terminologies. In practice, this enables pieces of data to be merged or shared even if the underlying metadata schemas are different. RDF is a core part of the emerging 'linked data' community, which is dedicated to providing ways to connect structured data on the web to facilitate greater sharing of resources.
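A simplified way to see why this merging works is to treat each statement as a (subject, predicate, object) tuple whose subject and predicate are URIs. In the sketch below, plain Python sets stand in for an RDF library, the two source collections and the subject URI are hypothetical, and the predicate URIs are the published Dublin Core element identifiers.

```python
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
MONA_LISA = "http://example.org/works/mona-lisa"  # hypothetical subject URI

# Two hypothetical collections describing the same resource.
museum = {(MONA_LISA, DC_CREATOR, "Leonardo da Vinci")}
library = {
    (MONA_LISA, DC_TITLE, "Mona Lisa"),
    (MONA_LISA, DC_CREATOR, "Leonardo da Vinci"),
}

# Because subjects and predicates are globally unique URIs, merging is a
# set union: identical statements collapse and new ones are added.
merged = museum | library


def describe(graph, subject):
    """Group every object value recorded for a subject by predicate."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


print(len(merged))
print(describe(merged, MONA_LISA))
```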