This project will create mechanisms for rapid and flexible access to over 9,000 hours of spoken audio, drawn from some of the leading British and American spoken word corpora.

Mining a Year of Speech

This project analyses a year's worth of speech drawn from a variety of corpora, a scale unprecedented in spoken language research. It is impractical for a researcher to listen to a year of audio in search of particular words or phrases, or to measure the resulting data by hand.

Through the data mining capabilities developed by this project, researchers will be able to conduct such tasks in just a few seconds. This will help explore questions such as possible changes in dialects linked to social status and the frequency of linguistic features across different age groups, genders, social classes, and regions.

Project Staff

Project Directors

Summary
Start date
4 January 2010
End date
31 March 2011
Funding programme
Digitisation and Content
Strand
Digging into Data Challenge
Project website
Lead institutions
University of Oxford (UK)
Partner institutions

University of Pennsylvania (US)

Committees
  • JISC Infrastructure and Resources Committee
Topic