Kindura
Overview
The broad aim is to pilot a strategic and joined-up approach to:
- the management of research information and research objects, and
- the provision of compute resources for processing those research objects.
The approach must be flexible enough to cope with the variety of requirements encountered in real-life research scenarios. It will be achieved through the creation of a hybrid cloud infrastructure that combines the advantages of the commercial/external cloud with those of an institutional/consortium cloud. The Kindura pilot will provide cloud storage that is more cost-effective for archival use because it does not have to be elastic. Users who need rapid elasticity will use other cloud resources, migrating data to Kindura for cost-effective storage and back again for processing. A migrator integrated with DuraCloud will support this transfer.
Aims and objectives
Kindura will develop a pilot cloud storage and computing solution at King's College and STFC that incorporates both external cloud providers (such as Amazon) and internal cloud infrastructure. The purpose of this pilot is to support researchers, both by managing research data (integrated with a Fedora digital repository system) and by allowing the execution of computation-intensive processing.
The project aims to deploy a pilot hybrid cloud infrastructure as a data storage and compute platform for research. The following aspects will also be studied:
- Security and confidentiality.
- Authentication.
- Data provenance.
- Cost implications.
- Performance and latency.
- Standards.
- Value to researchers.
- Metadata creation.
Project methodology
- Study user requirements across a range of disciplines that require management and processing of large datasets, including environmental science, financial mathematics, biomedical science and the humanities.
- Install and evaluate DuraCloud, an open source application developed by the DuraSpace organisation that enables multiple cloud and cloud-like services to be accessed through a common REST API.
- Integrate cloud storage and compute providers with DuraCloud, including iRODS, NGS cloud, commercial providers (Amazon AWS, Azure) and private cloud infrastructure (based, for example, on Eucalyptus).
- Investigate the use of the grid storage solution iRODS as a cross-institutional private cloud storage solution, and the provision of tape backups for this service.
- Develop brokerage services to provide mediation between cloud storage providers according to security, financial and other constraints.
- Integrate DuraCloud with the Fedora Commons open source repository application to provide long-term storage facilities for researchers.
- Integrate common services for replication, data transformation and digital preservation with DuraCloud to provide archival services for research data.
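To illustrate the kind of mediation a brokerage service might perform, the following is a minimal sketch in Python. It is not the Kindura design: the class names, fields, prices and the "cheapest eligible provider" rule are all assumptions made for illustration, and it does not call DuraCloud or any real provider API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    """A storage provider as a broker might describe it (hypothetical fields)."""
    name: str
    internal: bool            # True for institutional/consortium storage
    cost_per_gb_month: float  # made-up price in arbitrary currency units
    max_sensitivity: int      # highest data sensitivity level it may hold


@dataclass(frozen=True)
class Request:
    """A researcher's storage request (hypothetical fields)."""
    size_gb: float
    sensitivity: int          # 0 = public, higher = more confidential
    monthly_budget: float


def choose_provider(request: Request,
                    providers: List[Provider]) -> Optional[Provider]:
    """Return the cheapest provider meeting the security and budget
    constraints, or None if no provider qualifies."""
    eligible = [
        p for p in providers
        if p.max_sensitivity >= request.sensitivity
        and p.cost_per_gb_month * request.size_gb <= request.monthly_budget
    ]
    return min(eligible, key=lambda p: p.cost_per_gb_month, default=None)


# Illustrative catalogue: names and prices are invented for this sketch.
PROVIDERS = [
    Provider("internal-irods", internal=True,
             cost_per_gb_month=0.10, max_sensitivity=3),
    Provider("external-cloud", internal=False,
             cost_per_gb_month=0.03, max_sensitivity=1),
]

if __name__ == "__main__":
    public_data = Request(size_gb=500, sensitivity=0, monthly_budget=100)
    confidential = Request(size_gb=500, sensitivity=2, monthly_budget=100)
    print(choose_provider(public_data, PROVIDERS).name)
    print(choose_provider(confidential, PROVIDERS).name)
```

In this sketch, non-sensitive data goes to the cheaper external provider, while confidential data is restricted to internal storage. A real broker would weigh further constraints, such as latency, replication and legal jurisdiction.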
Anticipated outputs and outcomes
Kindura will pilot a model of ICT provision that uses both cloud and shared services, with particular reference to repository systems and the management of research outputs. Specific outputs include:
- Prototype hybrid cloud storage infrastructure.
- Use cases, requirements and results of evaluations with researchers.
- Evaluation of emerging technologies and standards associated with cloud computing.
- Analysis of the financial implications of migrating to hybrid cloud services to support research.
- Analysis of the issues associated with storing data in the cloud including security and confidentiality.
Technology / Standards used
SNIA CDMI, OGF OCCI, iRODS, DuraCloud, Fedora Commons.