The Digging into Data Challenge is an international grant competition sponsored by four leading research agencies: the Joint Information Systems Committee (JISC) from the United Kingdom, the National Endowment for the Humanities (NEH) from the United States, the National Science Foundation (NSF) from the United States, and the Social Sciences and Humanities Research Council (SSHRC) from Canada.

What is the "challenge" we speak of? The idea behind the Digging into Data Challenge is to answer the question "what do you do with a million books?" Or a million pages of newspapers? Or a million photographs of artworks? That is, how does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data, far more than they could read in a lifetime, what does that mean for research?

Applicants will form international teams from at least two of the participating countries. Winning teams will receive grants from two or more of the funding agencies and, one year later, will be invited to show off their work at a special conference. Our hope is that these projects will serve as exemplars to the field.

Digging into data challenge

Eight projects have been selected for the first round. JISC is in discussion with the other funding bodies about a second round.

Projects

Structural analysis of large amounts of music information

University of Illinois, Urbana-Champaign, University of Southampton, McGill University
This project will gather c.23,000 hours of digitised music spanning a breathtaking range of styles, regions and time periods (A Cappella to Zydeco, Appalachia to Zambia, and Medieval to Post-Modern) and develop tools to tag and analyse the structures that underpin global music.

Digging into the Enlightenment: Mapping the Republic of Letters

University of Oklahoma, University of Oxford, Stanford University
This project will focus on a corpus of 53,000 18th-century letters. It will extract and interpret details relating to people, places, times and subjects, and identify new ways of visualising and annotating these relationships.

Data mining with criminal intent 

George Mason University, University of Alberta, University of Hertfordshire
This project will create an intellectual exemplar for the role of data mining in an important historical discipline – the history of crime – and illustrate how the tools of digital humanities can be used to wrest new knowledge from one of the largest humanities data sets currently available: the Old Bailey Online.

Towards dynamic variorum editions

Mount Allison University, Imperial College London, Tufts University
This project will develop a range of tools that allow for dynamic comparison, generation of lexica, identification of topics and extraction of quotations across 10,00 Greek and Roman texts, helping to continue the development of a fundamental resource for classical studies.

Digging into image data to answer authorship-related questions

Michigan State University, University of Illinois, Urbana-Champaign, University of Sheffield
This project will take three specific resources (manuscripts, maps and quilts) and develop tools to analyse and identify the authorship of visual images.

Harvesting speech datasets for linguistic research on the web

McGill University, Cornell University
This project will harvest audio and transcribed data from podcasts, news broadcasts, public and educational lectures and other sources to create a massive corpus of speech. Tools will then be developed to analyse the different uses of prosody (rhythm, stress and intonation) within spoken communication.

Railroads and the making of modern America--Tools for spatio-temporal correlation, analysis, and visualization

University of Portsmouth, University of Nebraska-Lincoln
This project will integrate a vast collection of textual, geographical and numerical data to allow for the visual presentation of the railroads over time, concentrating initially on the Great Plains and the north-eastern United States.

Mining a year of speech

University of Oxford, University of Pennsylvania
This project will create mechanisms that allow rapid and flexible access to over 9,000 hours of spoken audio, drawn from some of the leading British and American spoken word corpora.

Summary

Start date: 1 January 2010
End date: 31 March 2011
Funding programme: Digitisation and e-Content
Topic: Strategic Themes