
Metadata Enrichment for Repositories in a London Institutional Network

This project (MERLIN) will use off-the-shelf text mining techniques to enrich the functionality of the SHERPA-LEAP consortial repository cross-searching service, LASSO. LASSO searches aggregated, normalised metadata harvested from London-based institutional repositories via OAI-PMH. MERLIN will use the TerMine term extraction tool to derive terms from the full-text digital objects held at LASSO's source repositories and, after a weighting process, enrich the LASSO database with the derived keywords. The derived terms will be exposed at various points in the LASSO interface to support discovery. In a supplementary strand, MERLIN will apply the tools developed by the HILT project to construct a pilot hierarchical, browsable subject tree from the text-mined keywords. The remodelled interface will undergo usability testing, and an end-user evaluation will inform the project's development work.
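The weighting and enrichment steps can be pictured with a minimal sketch. The project description does not say which weighting scheme is used, so TF-IDF is assumed here purely for illustration. The function names `weight_terms` and `enrich_records`, the `derived_keywords` field, and the record identifiers are all hypothetical, and the term lists stand in for the output of a term extractor such as TerMine (no TerMine API is called).

```python
import math
from collections import Counter


def weight_terms(doc_terms, top_k=2):
    """Rank extracted terms per record with TF-IDF (an assumed scheme).

    doc_terms maps a record identifier to the list of candidate terms
    extracted from that record's full text (repeats allowed).
    Returns a mapping of record identifier to its top_k keywords.
    """
    n_docs = len(doc_terms)
    # Document frequency: in how many records does each term occur?
    df = Counter()
    for terms in doc_terms.values():
        df.update(set(terms))

    keywords = {}
    for rec_id, terms in doc_terms.items():
        tf = Counter(terms)
        # Smoothed inverse document frequency keeps every score positive.
        scores = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[rec_id] = ranked[:top_k]
    return keywords


def enrich_records(records, keywords):
    """Return copies of harvested records with a derived_keywords field."""
    return [
        {**rec, "derived_keywords": keywords.get(rec["id"], [])}
        for rec in records
    ]


# Hypothetical harvested metadata and extracted terms.
records = [
    {"id": "oai:a:1", "title": "Paper A"},
    {"id": "oai:b:2", "title": "Paper B"},
    {"id": "oai:c:3", "title": "Paper C"},
]
doc_terms = {
    "oai:a:1": ["text mining", "text mining", "repository", "metadata"],
    "oai:b:2": ["repository", "subject tree", "subject tree", "metadata"],
    "oai:c:3": ["repository", "usability testing"],
}

enriched = enrich_records(records, weight_terms(doc_terms))
```

In this sketch a term that appears in every record, such as "repository", is down-weighted, so the more distinctive terms tend to surface as keywords for each record.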

A summative evaluation report covering all project outputs will be prepared, and an open-source, reusable web application will be created so that the MERLIN metadata enrichment technology can be incorporated into any repository on any platform.

Summary
Start date: 1 April 2009
End date: 1 September 2010
Funding programme: Information Environment Programme 2009-11
Strand: Resource discovery strand of Information Environment 09-11
Committees
  • JISC Integrated Information Environment committee
Topic