This project centred on developing an exemplar summarisation service for the social science domain in the form of a case study of systematic reviews for the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre). The project also featured a community call across the social science domains to develop further case studies showing the benefits of text mining. Throughout its activities, the project focused on enabling broader institutional involvement in text mining.

Automatic Summarisation for Systematic Reviews using Text Mining

Executive Summary

The Automatic Summarisation for Systematic Reviews using Text Mining (ASSERT) project responds to the need for wider inclusion of social scientists in e-Infrastructure. In social science research, text documents and textual data form a large portion of the material being analysed and investigated. Recent innovations in text mining now provide direct assistance to analytical and investigative methodologies in areas of the biomedical sciences. Outside large corporations, however, little uptake of these innovations has been observed elsewhere. The ASSERT project aimed to help remove the barriers previously identified by practitioners, namely a perceived lack of relevance of, and limited trust in, text mining due to traditional ‘black box’ approaches.

The project therefore aimed to integrate a set of text mining technologies into a software framework capable of appealing to a wide spectrum of social scientists, with a special focus on the task of carrying out literature surveys. This task was chosen because it is a methodological step common to many researchers, who are thus well acquainted with its challenges and frustrations and are also aware of the techniques that have proved successful for their needs. It therefore offers an excellent case study for targeted dissemination to users willing to test a system that has the potential to make their everyday work easier.

A team was formed combining text mining expertise from the National Centre for Text Mining with research methodology experience from the Evidence for Policy and Practice Information and Co-ordinating Centre, which specialises in performing systematic reviews for large organisations. The project itself focused on the development of a tool for systematic reviewing and, as part of this, a summarisation service. Furthermore, a community call was drawn up in collaboration with JISC to fund activities building upon the delivered technology, both to act as further case studies for the target audience and to showcase the benefits and possibilities of using text mining technology.

The delivered system combines term extraction tools with document clustering and summarisation. Additional automated query expansion tools were added to support existing search techniques; these were also noted to be of potential benefit to the e-learning community as a method for exploring new domains without necessarily having to know the technical jargon used. Evaluation and testing were successfully conducted across a number of subject areas chosen by core stakeholders. Numerous areas for future extension and inclusion have been identified, and in some cases work on these has already begun, with promising results and further user feedback. Thus, although the project has now ended, the project partners continue to develop and extend the functionality of the system in the light of strong interest from, and interaction with, the user community.

This project and the associated output from the community call represent a significant contribution to e-Infrastructure in the UK, as demonstrated by the level of interest in and uptake of the results. Broader involvement in these activities by the social science community, as well as active use of the delivered technology by the digital libraries and repositories community, shows the success of a platform that is already generating profitable synergies and engagement.

This report is available electronically only.

Author
Brian Rea (Project Manager) & Sophia Ananiadou (Project Director)
Publication Date
31 January 2009