A real-world disaster recovery scenario
How the Janet Network enabled a transfer of 790TB of research data in just 27 hours.
When a technical incident necessitated the restoration of part of a vast environmental science archive, Matt Pritchard (JASMIN user services and operations manager) and his team at the Centre for Environmental Data Analysis (CEDA) knew the response would test more than just local systems. They needed to restore nearly a petabyte of critical research data as quickly and safely as possible.
Matt said:
“We're in the process of migrating all the CEDA archive data to some new storage. During the process of that migration, we had a technical issue and the online copy of some of that data was accidentally removed. It wasn’t lost – we knew we could restore it from tape, but that would be a slow task, and we had an urgent need to make it available to users again as soon as possible.”
Thanks to the capacity, performance, and international reach of the Janet Network, that recovery happened in little more than a day.
Supporting data-intensive science at scale
We operate the Janet Network, the UK’s dedicated research and education network, connecting universities, colleges and research organisations with high-capacity, secure, reliable and resilient connectivity designed for data-intensive work.
One of the largest research sites on Janet is the Science and Technology Facilities Council (STFC)’s Rutherford Appleton Laboratory (RAL) on the Harwell Campus in Oxfordshire. Based there, CEDA provides vital data services for environmental science, including stewardship of major national datasets through the Natural Environment Research Council (NERC) Environmental Data Service.
CEDA’s services run on JASMIN, the UK’s high-performance data analysis platform for environmental science, jointly operated by two teams in STFC: the CEDA team in RAL Space, and infrastructure specialists in STFC Scientific Computing.
A real-life disaster recovery activity
Following the incident, the team at CEDA faced a clear choice. A local tape backup of the data existed, but restoring close to one petabyte that way would take a considerable amount of time. Matt and his colleagues realised, however, that the data in question were also geo-replicated at a partner research institution in the US that is part of the Earth System Grid Federation (ESGF), a large international collaboration to facilitate access to climate model data.
Retrieving the data over the network would significantly reduce recovery time and, just as importantly, act as a real‑world test of CEDA’s infrastructure and of the research networks it depends upon, including Janet and its international connectivity.
From Matt’s perspective, success would depend on two things working in harmony: carefully configured local systems and a high‑capacity, reliable international network path.
Matt said:
“We set up a bulk transfer of exactly the missing data but using tools that help move the data in a highly parallel way, as efficiently as possible. Essentially, that's the context to it, that it was a kind of real-life disaster recovery activity. But it really helped us out.”
Moving hundreds of terabytes across the Atlantic
The transfers ran across five dedicated data transfer nodes on JASMIN, each equipped with dual 100Gb network interfaces, and a similar setup at the source institution. To optimise performance, the team used Globus data transfer software. The implementation of CEDA’s research data transfer zone ensured a performant, ‘friction free’ end-to-end data path, following similar principles to ESnet’s Science DMZ model, a design approach that minimises bottlenecks between storage systems and high-speed networks.
Once the transfer began, Matt and his colleagues were pleasantly surprised at the initial rate. The transfer then ran continuously for 27 hours, moving approximately 790TB of data across the Atlantic back to RAL at an average throughput of around 65Gbps. For Matt and his team, the speed and reliability of the transfer were impressive.
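As a quick sanity check, the quoted average rate can be reproduced from the volume and duration given above. The sketch below is illustrative only: it assumes "TB" means decimal terabytes (10^12 bytes) and that the load was spread evenly across the five JASMIN transfer nodes, neither of which is stated in the article. The function and variable names are hypothetical.

```python
# Assumption: "TB" is a decimal terabyte (10**12 bytes); the article does not specify.
BYTES_PER_TB = 10**12


def average_throughput_gbps(terabytes: float, hours: float) -> float:
    """Return the mean transfer rate in gigabits per second."""
    bits = terabytes * BYTES_PER_TB * 8
    seconds = hours * 3600
    return bits / seconds / 1e9


# Figures quoted in the article: about 790TB moved in 27 hours.
rate_gbps = average_throughput_gbps(790, 27)

# Five transfer nodes, each with dual 100Gb interfaces (from the article).
nodes = 5
nic_capacity_gbps = nodes * 2 * 100

# Assumption: traffic was shared evenly between nodes.
per_node_gbps = rate_gbps / nodes

print(f"Average throughput: {rate_gbps:.1f} Gbps")
print(f"Per node (even split assumed): {per_node_gbps:.1f} Gbps")
print(f"Aggregate NIC capacity at JASMIN: {nic_capacity_gbps} Gbps")
```

Under these assumptions the result is about 65Gbps, matching the quoted figure, or roughly 13Gbps per node.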
Matt said:
“The outcome was almost unexpectedly good, really. We'd set it going one evening. We dived into the dashboard to have a look and we were amazed to find out how well it was doing: we had anticipated a few weeks to complete, but in fact it looked like it would finish in just over a day. Other than creating the batches of tasks, we hadn't done anything special for this particular transfer, but it builds on top of the data transfer zone infrastructure that has been built over many years.”
The role of Janet’s high-capacity backbone
The RAL campus benefits from the largest Janet connection of any UK site, with 400Gbps of resilient capacity supporting a broad range of scientific activity, from environmental data analysis to particle physics and synchrotron research.
Across the UK, Janet’s core network includes backbone links of up to 800Gbps and a 400Gbps resilient path into the pan-European GÉANT research network, which in turn connects to partner research and education networks worldwide, including in North America, where the CEDA data resided.
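To put the transfer in the context of these capacities, the short sketch below compares the quoted average rate of around 65Gbps with the 400Gbps RAL site connection. It is a rough illustration rather than a measurement: it ignores other traffic sharing the link, resilience arrangements, and every other hop on the international path. The function name is hypothetical.

```python
def link_utilisation(average_gbps: float, capacity_gbps: float) -> float:
    """Return the fraction of a link's nominal capacity used by a flow."""
    return average_gbps / capacity_gbps


# Figures quoted in the article.
transfer_rate_gbps = 65    # approximate average throughput of the recovery
ral_site_gbps = 400        # resilient capacity of the RAL connection to Janet

share = link_utilisation(transfer_rate_gbps, ral_site_gbps)
print(f"Share of RAL site capacity: {share:.0%}")
```

On these figures the recovery used roughly a sixth of the site's nominal capacity on average, which is consistent with the transfer running alongside the site's other scientific traffic.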
For Matt and his team, this end-to-end research networking ecosystem enabled the sustained, high-performance transfer required for the recovery.
Enabling global research collaboration
As the data continued to arrive, it became clear that the exercise represented more than a successful disaster recovery story. It illustrated how dependent modern research has become on the ability to move vast datasets quickly and reliably between organisations and across borders. For environmental science in particular, where satellite observations, climate models and long-term monitoring generate enormous volumes of data, network performance can directly influence the pace of discovery.
Watching the final datasets complete their journey over Janet and its international counterparts, Matt saw clear evidence that high‑performance networking allows researchers to collaborate effectively with colleagues around the world, and access the data they need, without relying on slow physical media transfers.
Matt said:
“We’re benefitting from many years of collaboration with both the ESnet and Janet guys, so this was a great opportunity to try out this capacity for real. The great thing is that any of our users can benefit from using the same approach in their own data transfers, with minimal extra effort or expertise.”
From potential to proven capability
The successful data restoration demonstrated that the Janet Network, together with international research and education networks, can support extremely demanding real-world scenarios, not just theoretical benchmarks.
Combined with well‑designed local infrastructure and proven best‑practice tools, the network supported one of the most demanding real‑world scenarios CEDA had encountered. For Matt and the team, it reinforced an important lesson: with proper planning and the right partnerships, moving hundreds of terabytes of data, even across continents, is entirely achievable.
Further information
Join the research network engineering (RNE) community group and access the recording of a talk on the technical details behind this transfer.
If you’re new to Jisc and would like to speak to one of our experts regarding a Janet Network connection, make a customer enquiry.
Alternatively, if you’re an existing member or customer, please contact your relationship manager.
