Submission, preservation & exposure of Chemistry teaching & research data from theses (SPECTRa-T)
Final Report available
Overview
Chemical information is essential to many sciences outside chemistry, including materials, life and environmental sciences, and supports major industries including pharmaceuticals. The reporting of the synthesis and properties of new chemical compounds is central to this. Although the essentials of syntheses are published in peer-reviewed journals, the detailed experimental recipes (as found in a thesis) are often omitted. Moreover, the chemical metadata that would help classify the thesis is not normally provided.
The project will develop text-mining tools that address the need to extract and classify the wealth of experimental research data currently untapped in chemistry theses.
Aims and objectives
The aims of SPECTRa-T are to investigate the needs of the academic chemistry research community regarding the management of data associated with theses, and to facilitate the routine, automatic extraction of domain-specific data, its transformation into metadata and its ingest into institutional repositories.
SPECTRa-T will realise these aims by developing automated validation and indexing tools specific to crystallography, computational chemistry and synthetic chemistry, and by providing interfaces to Open Standards-compliant repository platforms.
Methodology
We shall automatically extract domain-specific data and metadata, using the OSCAR3 text-mining software to identify chemical data and objects. These will be converted to CML (Chemical Markup Language) and then expressed as nodes in the W3C SKOS (Simple Knowledge Organisation System) format, enabling semantic querying of the data deposited in institutional repositories.
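As a rough illustration of the pipeline described above (identify chemical names in text, represent each as CML, then express it as a SKOS node), here is a minimal Python sketch. It does not use OSCAR3: the tiny `LEXICON`, the functions `find_compounds`, `to_cml` and `to_skos`, and the `http://example.org/compound/` base URI are all hypothetical stand-ins chosen for illustration. The XML it produces is a simplified, CML-style and SKOS-style fragment, not output validated against either specification.

```python
import re
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Use readable prefixes when the elements are serialised.
for prefix, uri in (("cml", CML_NS), ("rdf", RDF_NS), ("skos", SKOS_NS)):
    ET.register_namespace(prefix, uri)

# Hypothetical toy lexicon standing in for a real chemical entity recogniser.
LEXICON = {"benzaldehyde", "toluene"}


def find_compounds(text):
    """Return lexicon names found in the text, in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in LEXICON and word not in found:
            found.append(word)
    return found


def to_cml(name):
    """Build a simplified CML-style <molecule> element carrying only a name."""
    molecule = ET.Element(f"{{{CML_NS}}}molecule", {"id": name})
    ET.SubElement(molecule, f"{{{CML_NS}}}name").text = name
    return molecule


def to_skos(name, base_uri="http://example.org/compound/"):
    """Build a skos:Concept node for the compound (base URI is an assumption)."""
    concept = ET.Element(
        f"{{{SKOS_NS}}}Concept",
        {f"{{{RDF_NS}}}about": base_uri + name},
    )
    ET.SubElement(concept, f"{{{SKOS_NS}}}prefLabel").text = name
    return concept


if __name__ == "__main__":
    sentence = "Benzaldehyde was dissolved in toluene and stirred overnight."
    for name in find_compounds(sentence):
        print(ET.tostring(to_cml(name), encoding="unicode"))
        print(ET.tostring(to_skos(name), encoding="unicode"))
```

In a real deployment the lexicon lookup would be replaced by a proper chemical entity recogniser, and the SKOS concepts would be wrapped in an RDF document before being loaded into a repository that supports semantic queries.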
Anticipated outputs and outcomes
- A portable, free/Open Source, desktop tool through which chemists can extract, classify and deposit data contained within a chemistry thesis
- Automatic checking of the technical correctness of a thesis
- Encourage chemists, and through their example other researchers, to appreciate the value of their data by depositing research materials in their institutional repositories
- Strengthen the role of libraries, as managers of institutional repositories, by demonstrating their willingness and ability to develop services in response to researchers’ needs