Making our shared activity information count (MOSAIC)

This project is complete - See final report

Background

A number of universities are actively interested in recommender and other user activity driven services. MOSAIC is therefore investigating the technical feasibility, service value and issues around exploiting activity data, primarily to assist users in resource discovery and selection.

Such data might be combined from: 

  • Circulation modules of Library Management Systems (the initial project focus) 
  • ERM systems / Open URL Resolver data covering journal article access 
  • VLE resource and learning object download 
  • Reading lists (without activity data)

The project will assist others working on these issues by assessing scalability and service models, by making data available and by gathering feedback from the community.
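
As an illustration of how circulation data could support resource discovery, the following minimal Python sketch derives "borrowers of this item also borrowed" suggestions by counting how often two items were borrowed by the same person. It is not the MOSAIC implementation: the function names, the `(borrower_id, item_id)` input shape and the sample identifiers are all assumptions made for this example, and the borrower identifiers are assumed to be anonymised already.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_borrowing_counts(loans):
    """Count how often each pair of items shares a borrower.

    `loans` is an iterable of (borrower_id, item_id) pairs. This input
    shape is an assumption for the sketch, not a MOSAIC format.
    """
    items_by_borrower = defaultdict(set)
    for borrower_id, item_id in loans:
        items_by_borrower[borrower_id].add(item_id)

    related = defaultdict(Counter)
    for items in items_by_borrower.values():
        # Each unordered pair of items borrowed by one person counts once.
        for a, b in combinations(sorted(items), 2):
            related[a][b] += 1
            related[b][a] += 1
    return related


def recommend(related, item_id, limit=5):
    """Return up to `limit` items most often co-borrowed with `item_id`."""
    candidates = related.get(item_id, {})
    # Highest count first; ties are broken by item id for stable output.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:limit]]


if __name__ == "__main__":
    sample_loans = [
        ("u1", "A"), ("u1", "B"),
        ("u2", "A"), ("u2", "B"), ("u2", "C"),
        ("u3", "A"), ("u3", "B"), ("u3", "C"),
    ]
    print(recommend(co_borrowing_counts(sample_loans), "A"))
```

Raw co-occurrence counts favour items that are popular overall, so a real service would probably need some form of normalisation and a minimum-count threshold, which also helps to avoid exposing the behaviour of individual borrowers.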

Overview

MOSAIC is building on the findings and recommendations of the JISC TILE project, which investigated ‘pain points’ in UK HE library take-up of ‘web-scale’ Web 2.0 opportunities, in particular those relating to the ‘context’ of users (e.g. their course) and their related use of resources. The TILE findings were closely linked to the work done by Dave Pattern at the University of Huddersfield with local activity data. MOSAIC aims to build on this by aggregating library activity data from several institutions and making it available for re-use and experimentation. The Talis podcast with Dave Pattern provides further background.

Dave Pattern's blog

Aims and objectives

The MOSAIC objectives are to: 

  • Generate a test activity dataset (beyond just circulation or a single institution) 
  • Promote experimentation by allowing anyone to freely share, modify and use this data under an Open Data licence 
  • Assist the community in agreeing a durable data schema for these purposes 
  • Use the contributed data alongside machine generated data to test the performance and utility of available indexing and retrieval technologies 
  • Gather initial user feedback from librarians and students on potential applications and interfaces in autumn 2009 
  • Identify the constraints placed by Data Protection legislation on such an undertaking

Project methodology

Each partner has a specialist role: 

  • Sero will lead on the assessment of the business case and formulation of recommendations to JISC. 
  • Dave Pattern will provide support to librarians and systems staff undertaking activity data extraction from library systems and will host the dataset. 
  • PLE will manage the technical demonstrator development using agile methodology; the demonstrator will focus on scale, data faceting, integration of mixed data and search interface. 
  • Ken Chad & Paul Miller will gather librarian and patron feedback on the demonstrator.

Anticipated outputs and outcomes

Outputs
  • Guidance for extracting activity data from library and similar systems 
  • Initial data schema 
  • Activity datasets from participating institutions under Open Data licences 
  • Demonstrator of scalable database implementation and end user application 
  • Focus groups for librarians and students to evaluate service potential 
  • Report to JISC on technical and business options for development of such services both at local and web-scale
Outcomes
  • Recognition of the implications of Open Data licensing and Data Protection legislation 
  • Identification of the potential value of context-linked activity data 
  • Recommendation regarding opportunities for local and national services

Technology / Standards used (if applicable)

  • Database – Solr / Lucene
  • Data Exchange – XML
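
To make the XML data exchange step concrete, the sketch below reads a small XML extract of loan records into Python dictionaries using the standard library. The MOSAIC schema itself is not described on this page, so the element names (`activity`, `loan`, `item`, `borrower`, `course`, `date`) and the sample values are purely hypothetical stand-ins, and `parse_loans` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical extract; element names are illustrative, not the MOSAIC schema.
SAMPLE = """<activity>
  <loan>
    <item>ITEM-001</item>
    <borrower>anon-17</borrower>
    <course>History</course>
    <date>2009-05-12</date>
  </loan>
  <loan>
    <item>ITEM-002</item>
    <borrower>anon-17</borrower>
    <course>History</course>
    <date>2009-05-19</date>
  </loan>
</activity>"""

FIELDS = ("item", "borrower", "course", "date")


def parse_loans(xml_text):
    """Return one dict per <loan> element, with missing fields left empty."""
    root = ET.fromstring(xml_text)
    records = []
    for loan in root.iter("loan"):
        record = {}
        for field in FIELDS:
            # findtext returns the default when the child element is absent.
            record[field] = loan.findtext(field, default="").strip()
        records.append(record)
    return records


if __name__ == "__main__":
    for record in parse_loans(SAMPLE):
        print(record)
```

Flat records of this kind could then be submitted to an index such as Solr as individual documents, with fields like course acting as facets; the exact field mapping would depend on the schema agreed by the project.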

Summary

Start date: 1 April 2009
End date: 30 November 2009
Funding programme: Information Environment Programme 2009-11
Lead institutions: Sero Consulting Ltd
Partner institutions:

  • Ken Chad Consulting Ltd
  • Mark van Harmelen, PLE Ltd
  • Dave Pattern, University of Huddersfield
  • Paul Miller, Cloud of Data Ltd