In collaboration with the SKA project, we demonstrated the successful use of EESSI to run radio astronomy analyses on the globally distributed SRCnet infrastructure. The SKA project faces an immense challenge: it must process and analyze an estimated 700 PB of data each year across that infrastructure. Handling this volume of data effectively requires delivering the right software to the right locations with optimal performance.
By deploying software across multiple SKA regional centres, including those in the Netherlands, Japan, Korea, and Canada, we showcased how EESSI enables seamless and efficient data processing. This proof of concept highlighted the flexibility of EESSI across a variety of systems, such as HPC, cloud, and Kubernetes, and showed that it can meet the complex requirements of SKA’s high-performance data analysis.
For this proof of concept, we used EESSI to deploy software that is normally used as part of a radio astronomy analysis pipeline: AOFlagger, Casacore, IDG, EveryBeam, DP3, and WSClean. This allowed the SKA regional centres to run the pipeline on any node of their distributed infrastructure without first downloading complete containers. EESSI’s capability to optimize software for various CPU models and to reduce network traffic and startup latency proved invaluable; this capability has been shown to deliver performance improvements of up to 30% for certain use cases.
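To make "run on any node without downloading containers" more concrete, the following is a minimal sketch of a pre-flight check a node could run before starting the pipeline. It is not part of the proof of concept itself. It assumes that EESSI is exposed through CernVM-FS at `/cvmfs/software.eessi.io`, that the EESSI environment has already been initialized and the relevant modules loaded, and that the tools expose the command names `aoflagger`, `DP3`, and `wsclean`. The function name `check_pipeline_tools` is hypothetical.

```python
import os
import shutil

# Assumption: EESSI is exposed through CernVM-FS at this mount point.
EESSI_REPO = "/cvmfs/software.eessi.io"

# Assumption: command names of the pipeline tools that ship an executable.
# Casacore, IDG and EveryBeam are mostly consumed as libraries by these tools.
PIPELINE_COMMANDS = ["aoflagger", "DP3", "wsclean"]


def check_pipeline_tools(commands, repo_path=EESSI_REPO):
    """Report whether the repository is mounted and where each command resolves.

    Hypothetical helper: it only inspects the local filesystem and PATH,
    and does not initialize EESSI or load any modules itself.
    """
    return {
        "repo_mounted": os.path.isdir(repo_path),
        # shutil.which returns the full path of the command, or None if absent.
        "tools": {name: shutil.which(name) for name in commands},
    }


if __name__ == "__main__":
    report = check_pipeline_tools(PIPELINE_COMMANDS)
    print("EESSI repository mounted:", report["repo_mounted"])
    for name, path in report["tools"].items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If the tools are provided by EESSI, the resolved paths would be expected to point below the mount point rather than into a locally unpacked container image.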
While EESSI may not be a one-size-fits-all solution for SKA, its key technologies can play an important role in helping to meet SKA’s data processing demands. By adopting and integrating select components of EESSI, SKA can improve the efficiency and performance of its software stack.