Making our shared activity information count (MOSAIC)
This project is complete - See final report
Background
A number of universities are actively interested in recommender services and other services driven by user activity data. MOSAIC is therefore investigating the technical feasibility, service value and issues around exploiting activity data, primarily to assist users in resource discovery and selection.
Such data might be combined from:
- Circulation modules of Library Management Systems (the initial project focus)
- ERM systems / Open URL Resolver data covering journal article access
- VLE resource and learning object downloads
- Reading lists (without activity data)
The project will assist others working on these issues by assessing scalability and service models, by making data available and by gathering feedback from the community.
Overview
MOSAIC is building on the findings and recommendations of the JISC TILE project, which investigated ‘pain points’ in UK HE library take-up of ‘web-scale’ Web 2.0 opportunities, in particular relating to the ‘context’ of users (e.g. their course) and their related use of resources. The TILE findings were closely linked to the work done by Dave Pattern at the University of Huddersfield with local activity data. MOSAIC aims to build on this by aggregating library activity data from several institutions and making it available for re-use and experimentation. The Talis podcast with Dave provides further background.
Dave Pattern's blog
Aims and objectives
The MOSAIC objectives are to:
- Generate a test activity dataset (beyond just circulation or a single institution)
- Promote experimentation by allowing anyone to freely share, modify and use this data under an Open Data licence
- Assist the community in agreeing a durable data schema for these purposes
- Use the contributed data alongside machine generated data to test the performance and utility of available indexing and retrieval technologies
- Gather initial user feedback from librarians and students on potential applications and interfaces in autumn 2009
- Identify the constraints placed by Data Protection legislation on such an undertaking
Project methodology
Each partner has a specialist role:
- Sero will lead on the assessment of the business case and formulation of recommendations to JISC.
- Dave Pattern will support librarians and systems staff undertaking activity data extraction from library systems and will host the dataset.
- PLE will manage the technical demonstrator development using an agile methodology; the demonstrator will focus on scale, data faceting, integration of mixed data and the search interface.
- Ken Chad & Paul Miller will gather librarian and patron feedback on the demonstrator.
Anticipated outputs and outcomes
Outputs
- Guidance for extracting activity data from library and similar systems
- Initial data schema
- Activity datasets from participating institutions under Open Data licences
- Demonstrator of scalable database implementation and end user application
- Focus groups for librarians and students to evaluate service potential
- Report to JISC on technical and business options for developing such services at both local and web scale
Outcomes
- Recognition of the implications of Open Data licensing and Data Protection legislation
- Identification of the potential value of context-linked activity data
- Recommendation regarding opportunities for local and national services
Technology / Standards used (if applicable)
- Database – Solr / Lucene
- Data Exchange – XML
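As an illustration only, the sketch below shows how circulation data exchanged as XML might be turned into simple "borrowers of this item also borrowed" counts. The XML element and attribute names, the sample records and the function names are all assumptions made for this example; they are not the MOSAIC schema or the project's demonstrator code, and borrower identifiers are assumed to be anonymised tokens.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical exchange format: element and attribute names are
# illustrative assumptions, not the schema agreed by the project.
SAMPLE_XML = """
<activity>
  <loan borrower="u1" course="History" item="isbn:111"/>
  <loan borrower="u1" course="History" item="isbn:222"/>
  <loan borrower="u2" course="History" item="isbn:111"/>
  <loan borrower="u2" course="History" item="isbn:222"/>
  <loan borrower="u2" course="History" item="isbn:333"/>
  <loan borrower="u3" course="Physics" item="isbn:333"/>
</activity>
"""


def parse_loans(xml_text):
    """Return (borrower, item) pairs from the hypothetical XML format."""
    root = ET.fromstring(xml_text.strip())
    return [(loan.get("borrower"), loan.get("item")) for loan in root.iter("loan")]


def co_borrow_counts(loans):
    """Count how many borrowers borrowed each unordered pair of items."""
    items_by_borrower = {}
    for borrower, item in loans:
        items_by_borrower.setdefault(borrower, set()).add(item)

    counts = Counter()
    for items in items_by_borrower.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return counts


def also_borrowed(item, counts):
    """List items co-borrowed with `item`, most frequent first."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common()


if __name__ == "__main__":
    pair_counts = co_borrow_counts(parse_loans(SAMPLE_XML))
    print(also_borrowed("isbn:111", pair_counts))
```

A production service would more plausibly load such records into an index such as Solr and facet on context fields like course; the in-memory counting here only illustrates the idea.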
Project Staff
Project Manager
Project Team