Mining a Year of Speech
This project analyses a year's worth of speech drawn from a variety of corpora, a scale unprecedented in spoken language research. It is impractical for a researcher to listen to a year of audio in search of particular words or phrases, or to measure the resulting data by hand.
With the data mining capabilities developed by this project, researchers will be able to carry out such tasks in just a few seconds. This will help them explore questions such as whether dialect changes are linked to social status, and how the frequency of linguistic features varies across age groups, genders, social classes, and regions.