Data Curation for e-science in the UK
An audit to establish requirements for future curation and
preservation
The DTI and the Research Councils are committing £118M to a
government-industry programme on e-Science. The reason for this investment
is that Grid technology is seen as the natural successor to the world wide
web, and the UK wants to take a leading role in order to develop solutions
for its scientists and opportunities for its industry.
The world wide web has revolutionised the way companies do business and
fundamentally altered people's personal lives, but it can no longer cope
with the demands being placed on it by science. The world wide web allows
very easy access to information; the Grid allows that same easy access to
computing power, data processing and communication of the results. The
opportunities are immense. The Grid will allow the efficient manipulation
of vast amounts of information, such as that contained in the human genome
or the results from experiments at CERN's new Large Hadron Collider. It
will also make it possible to mine data again and again by comparing
existing data sets collected for one purpose with new and previously
unrelated information, so generating new knowledge.
This consultancy will establish the current provision and future
requirements for curation of primary research data being generated within
e-science in the UK. This will include the e-science core programme but is
anticipated to extend beyond this to other e-science research and primary
research data. A consultancy report will provide a synthesis of findings
and make recommendations for future action.
The consultancy will support aims to manage JISC involvement in
e-Science and the Research Grid, and to work in partnership to support the
research community through activities such as its digital preservation
programme.
To establish the current provision and future requirements for curation of
primary research data within UK e-science, the consultancy will:
- through desk-top research, synthesise:
  - existing reports to provide a context for the study;
  - existing practice, policy and guidance to provide an overview of the
    current provision for curation of primary research data in the UK;
  - requirements within the relevant research communities for future
    curation in the medium term (5-10 years) and long-term (10 years
    plus);
  It is anticipated that sources for this will include:
  - sources noted in 1.4 Further Information above;
  - relevant practice, policy and guidance from the e-science programme,
    JISC, Research Councils, Arts and Humanities Research Board, Data
    Centres and Services;
  - reports from the e-science sub-group of the Research Support
    Libraries Group on disciplinary needs (copies will be made available
    to the consultant).
- through a combination of postal/telephone survey and interviews, audit:
  - Ownership and responsibility for long-term curation of primary
    research data;
  - Perceived future value and re-use;
  - Provision made for future curation;
  - Grant conditions for curation and re-use;
  - Relevant guidelines, standards, tools, and funding available for
    projects to prepare data for future curation;
  - Primary research data being produced by the e-science programme
    projects, identifying what kind of material it is (eg closed or
    dynamic datasets) and how large (relatively) it is likely to be;
  - Other materials being produced or dependencies critical to their
    future curation and re-use (eg metadata, technical documentation);
  - Procedures and standards followed in the creation and validation of
    data and other materials;
- report on:
  - Key findings and issues relating to current provision for
    curation of primary research data in the UK;
  - Future curation requirements for e-science in the UK;
  - Recommendations to JCSR;
  - Proposed implementation plan and funding required for implementation
    of recommendations and options outlined in the report.
Relevant reports from the
e-science Curation Task Force can be found on the Digital Preservation
website.