At the start of March the Environment Agency warned that Britain needs to become more resilient to drought and flooding, as extremes of weather may be on the increase.
That didn’t come as a surprise to me, as someone who has spent the last eight months in flood-prone Yorkshire, working on a project to help climate scientists unearth reliable information on flooding and flood defences. Though I did find some of the figures quite arresting - they showed that one in every five days in 2012 saw flooding somewhere in the country, while one in four saw drought.
Most revealing of all, though, was the way the story was reported in news media. The stories were full of what ‘might’ or ‘could’ be on the cards in the future, what meteorologists ‘fear’ may happen, and what modelling ‘suggests’. It all sounds pretty tentative, which is understandable because it’s really hard to work out what the future holds when it comes to the climate. The information is hard to come by, and often contradictory in nature.
If even a few of the Agency’s possible predictions come about, they’ll have a profound impact on the way people have to live and work. As a nation, we’ll need to plan and to adapt: to get time to do that, we need to be able to gather good data and analyse it well.
Which is why support for environmental science research is now being developed as a matter of urgency. Initiatives include funding from the Economic and Social Research Council (ESRC) for leadership fellowships focusing on climate science. Numerous public sector organisations, including our project partner the British Library, have also come together in the Living With Environmental Change (LWEC) partnership to undertake environmental research and observations.
Part of the problem environmental researchers face is the broad interdisciplinary nature of the subject area. Biochemistry, earth sciences, physics and engineering all have their part to play, but that makes things difficult when it comes to searching for relevant data, whether that’s in research papers, online or elsewhere.
In the run-up to our joint project, the British Library surveyed 107 researchers working on flooding and found that they struggle with information filtering, particularly in a record-breaking year in which the over-abundance of water was on everyone’s mind. Searches return too few or, increasingly, far too many results to be useful. This problem makes a powerful case for working on better search tools to support discovery of useful information in the literature.
Last summer our GATE team at the University of Sheffield, which focuses on developing open source software to solve text-processing problems, joined the British Library and HR Wallingford to start a pilot project, EnviLOD, with funding from Jisc. It focused very specifically on flooding and flood defences and had the aim of developing new and more intelligent search tools to help researchers dig useful nuggets of data out of the vast amount of environmental science literature.
We’re trying to create a practical, publicly available web service that the environmental science community can use to enrich its content with a shared vocabulary, drawing on accessible databases containing geospatial information, such as the Linked Open Data (LOD) resources GeoNames and DBpedia.
The EnviLOD text mining tools can determine that the name ‘Wytham Woods’, for example, refers to a place that is both in Oxfordshire and in South East England. This enables researchers who are hunting for documents on flooding in South East England to locate relevant documents about Wytham Woods, even if South East England is not referred to explicitly in the text.
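For readers who like to see the idea in code, here is a minimal sketch of that kind of place-based enrichment. It is not the EnviLOD implementation: the tiny hand-written gazetteer, the sample documents and the function names (`containing_regions`, `enrich`, `search`) are all assumptions made for illustration, and a real system would draw its place hierarchy from resources such as GeoNames or DBpedia rather than a hard-coded table.

```python
# Hypothetical, hand-written gazetteer: each place maps to the region containing it.
# A real service would look these links up in Linked Open Data resources instead.
PARENT_REGION = {
    "Wytham Woods": "Oxfordshire",
    "Oxfordshire": "South East England",
    "South East England": "England",
    "Yorkshire": "England",
}

# Illustrative documents only; titles and text are made up for this sketch.
DOCUMENTS = {
    "doc_a": "Flooding observed in Wytham Woods after heavy rain.",
    "doc_b": "Coastal flood defences in Yorkshire.",
}


def containing_regions(place):
    """Return the place itself followed by every region that contains it."""
    regions = [place]
    while place in PARENT_REGION:
        place = PARENT_REGION[place]
        regions.append(place)
    return regions


def enrich(text):
    """Return all regions implied by the place names mentioned in the text."""
    implied = set()
    for name in PARENT_REGION:
        if name in text:  # naive substring match, enough for a sketch
            implied.update(containing_regions(name))
    return implied


def search(documents, region):
    """Return titles of documents whose mentioned places fall within the region."""
    return [title for title, text in documents.items() if region in enrich(text)]


if __name__ == "__main__":
    print(search(DOCUMENTS, "South East England"))
```

Here a search for South East England finds the Wytham Woods document even though the phrase never appears in its text, while a plain keyword search for the same phrase would miss it.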
We plan to make the EnviLOD intelligent search tool freely available in 2013 and, before we do that, we wanted to give people a chance to try it and to give us their feedback. I was delighted that our workshop at the British Library in January attracted 22 participants including environmental scientists, developers and potential service users such as the Press Association and the British Red Cross. Inevitably, copyright issues meant we had to limit the scope of our demonstration, but in the longer term we are hoping to deploy the EnviLOD tools in such a way that many thousands of collections of data can be made available in enriched form, through the forthcoming Envia information discovery platform.
Workshop participants tried the EnviLOD intelligent search tool alongside keyword search, and the results were largely positive, though the reservations they shared about the interface gave us some practical areas for improvement. We’re now working on ways to address these concerns before the final tool release.
This is hugely exciting not just for the field of environmental science as researchers grapple with confusing and frequently contradictory climate change data, but for other disciplines too. The text-mining and intelligent search technology that we have deployed has valuable applications in biomedical research and a host of other special interest areas.
If you’re interested in climate change science or in text-mining and its potential to enrich research, it’s worth taking a look at the EnviLOD blog. You could also have a look at last year’s Jisc report on the values and benefits of text-mining, and keep an eye out for the forthcoming EnviLOD tool release.
Image: CC BY flickr/nasa_ice