This is LUMI, Europe’s Most Powerful Supercomputer and a Benchmark in AI

We visit the facilities of the largest supercomputing centre in Europe and fifth largest in the world.

LUMI (Large Unified Modern Infrastructure) is the most powerful supercomputing centre on the European continent and fifth in the world, according to the Top500 list of June 2024. It is located in the small town of Kajaani in central Finland, just under 600 kilometres north of the Finnish capital Helsinki.

Silicon was invited to get a first-hand look at the facility and all the features that have turned LUMI into a supercomputing powerhouse, with the combined power of 1.5 million state-of-the-art laptops in the space of two tennis courts. It must be said that this is a fairly rough comparison to illustrate the magnitude of this centre, which is not easy to grasp.

The origin of the LUMI project

LUMI is part of the EuroHPC JU (European High-Performance Computing Joint Undertaking) initiative, a joint effort by the European Union, the CSC (the Finnish IT Center for Science) and several participating countries (in this case Finland, Belgium, Denmark, Estonia, Norway, Poland, Czech Republic, Sweden and Switzerland) to build a world-class supercomputing infrastructure in Europe. EuroHPC aims to position Europe as a global benchmark in the supercomputing race, promoting scientific advancement and industrial competitiveness.

The CSC facilities and the LUMI supercomputer in the city of Kajaani (Finland)

It was in 2019 that Finland was selected to host one of the most advanced supercomputers in Europe within the EuroHPC project. Kajaani was chosen as the ideal location for the project due to its geographical and energy advantages, such as its cold climate, which helps reduce cooling costs, and the availability of 100% renewable hydropower, making LUMI one of the greenest supercomputers in the world. Not surprisingly, the city is surrounded by lakes with several hydroelectric power plants that generate energy for the entire region.

Construction (as reported in Silicon at the time) began in 2020 in an industrial building that had previously served as a paper mill and had been forced to close in 2008 because of globalisation and low prices in South America. Shortly after the closure, the CSC took over the building to house its data centre, which became the seed for the construction of LUMI.

The former paper mill that houses the LUMI supercomputer has space for future extensions

The facility was completed in 2021 and LUMI became operational at the end of that year in its first phase. Today, the supercomputing centre is in its third phase of deployment and at full capacity in multiple applications, as we will see below.

LUMI technical specifications

During our visit to the LUMI supercomputing centre, we had the pleasure of having Pekka Manninen, director of the LUMI Leadership Computing Facility, as an exceptional guide. He is responsible for the design and construction of the facility.

Pekka Manninen, director of the LUMI Leadership Computing Facility, during our visit to the supercomputing centre.

LUMI is built on the HPE Cray EX architecture, a system specialised in high-performance computing, and its configuration is based on GPUs and CPUs from AMD, the only manufacturer to develop both units for this type of workload. Manninen said the selected AMD MI250X GPUs stand out in their class for their technical capabilities and the performance per watt they deliver.

Specifically, the GPU partition (LUMI-G) consists of 2,978 nodes, each with a 64-core AMD Trento CPU and four AMD MI250X GPUs, for a total of 11,912 AMD GPUs.

The CPU partition (LUMI-C) has 2,048 dual-socket CPU nodes with 64-core third-generation AMD EPYC chips and between 256GB and 1024GB of memory. In total, the partition has more than 262,000 CPU cores.
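The totals quoted for the two compute partitions follow directly from the node counts. As a minimal sketch, using only the figures given in this article (the constant and function names are our own, chosen for illustration), the arithmetic can be checked like this:

```python
# Figures as quoted in the article; the names are illustrative only.
GPU_NODES = 2_978          # LUMI-G nodes
GPUS_PER_NODE = 4          # AMD MI250X GPUs per LUMI-G node
CPU_NODES = 2_048          # LUMI-C nodes
SOCKETS_PER_NODE = 2       # dual-socket nodes
CORES_PER_SOCKET = 64      # 64-core AMD EPYC chips


def total_gpus(nodes=GPU_NODES, gpus_per_node=GPUS_PER_NODE):
    """Total number of GPUs in the GPU partition."""
    return nodes * gpus_per_node


def total_cpu_cores(nodes=CPU_NODES, sockets=SOCKETS_PER_NODE,
                    cores=CORES_PER_SOCKET):
    """Total number of CPU cores in the CPU partition."""
    return nodes * sockets * cores


if __name__ == "__main__":
    print(total_gpus())       # 11912
    print(total_cpu_cores())  # 262144
```

The second figure is where the "more than 262,000 CPU cores" comes from.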

LUMI (‘the queen of the north’ as its creators call it) and its various technical partitions

The system has an additional 32TB memory partition. On the storage side, LUMI offers different tiers depending on the workload: 10 PB of flash storage for quick short-term access, 80 PB of longer-term traditional hard disk storage and 30 PB for data sharing and storage for the lifetime of each project.

All partitions (CPU, GPU and storage) are connected via 200 Gbit/s Cray Slingshot connections.
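To put the link speed in perspective, here is a rough back-of-the-envelope sketch. It assumes an idealised single link running at its full nominal rate with no protocol overhead or contention, which a real workload will not achieve, and the function names are ours:

```python
LINK_GBIT_PER_S = 200  # nominal link rate quoted in the article


def link_gbyte_per_s(gbit_per_s=LINK_GBIT_PER_S):
    """Convert a link rate from gigabits to gigabytes per second."""
    return gbit_per_s / 8


def ideal_transfer_seconds(size_tb, gbit_per_s=LINK_GBIT_PER_S):
    """Idealised time to move `size_tb` terabytes over one link.

    Assumes decimal units (1 TB = 1e12 bytes) and the full nominal
    rate; treat the result as a lower bound, not a measurement.
    """
    total_bytes = size_tb * 1e12
    bytes_per_second = gbit_per_s * 1e9 / 8
    return total_bytes / bytes_per_second


if __name__ == "__main__":
    print(link_gbyte_per_s())         # 25.0
    print(ideal_transfer_seconds(1))  # 40.0
```

In other words, under these assumptions a single link tops out at 25 GB/s, so a terabyte of data would take at least 40 seconds to cross it.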

These and other specifications (its configuration is much more complex than what we have outlined here) have placed LUMI fifth in the world in the TOP500 list, with a sustained performance of 379.70 PFlop/s and a theoretical peak of 531.51 PFlop/s.

As our readers will be aware, the Flop/s measure refers to the floating-point operations per second that a computer is capable of performing. It has become the benchmark for measuring the performance of high-performance computing systems. We have seen the GigaFLOPS (GFlop/s), the TeraFLOPS (TFlop/s) and the PetaFLOPS (PFlop/s), and there are already supercomputers that have broken the ExaFLOPS (EFlop/s) barrier. In this case, LUMI is capable of performing more than 379 PFlop/s, or 379 quadrillion floating-point operations per second, on a sustained basis.
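For readers who like to see the units spelled out, the following minimal sketch converts between the prefixes and compares LUMI's sustained and peak figures as quoted in this article. The helper names are ours, and the ratio of sustained to peak performance is simply one common way of expressing how much of the theoretical capacity a benchmark run reaches:

```python
# Decimal prefixes used for Flop/s figures.
PREFIX = {"G": 1e9, "T": 1e12, "P": 1e15, "E": 1e18}

SUSTAINED_PFLOPS = 379.70  # sustained figure quoted in the article
PEAK_PFLOPS = 531.51       # peak figure quoted in the article


def to_flops(value, prefix):
    """Convert a prefixed figure (e.g. 379.70, "P") to plain Flop/s."""
    return value * PREFIX[prefix]


def efficiency(sustained, peak):
    """Fraction of peak performance reached on a sustained basis."""
    return sustained / peak


if __name__ == "__main__":
    flops = to_flops(SUSTAINED_PFLOPS, "P")
    print(f"{flops:.4e} Flop/s")  # about 3.797e17 operations per second
    print(f"{efficiency(SUSTAINED_PFLOPS, PEAK_PFLOPS):.1%}")
```

With the article's figures, the sustained result works out to a little over 71% of the peak.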

LUMI’s applications

Supercomputing centres of this kind are designed to solve the largest and most complex computations humans face. While waiting for quantum computing to become a reality, high-performance computing is making remarkable progress in multiple fields such as scientific research, health and biomedicine, digital twins, artificial intelligence and machine learning.

And LUMI is no exception. It is designed to help solve the most complex problems in modern science: researchers can perform large-scale climate simulations, model the behaviour of subatomic particles and explore new frontiers in theoretical physics. It also allows the modelling of molecular interactions and simulations that accelerate drug discovery, as well as early detection of cancer and more efficient treatments to reduce its mortality rate.

The shell housing the LUMI supercomputing centre

For example, during the COVID-19 pandemic, LUMI played an important role in modelling the spread of the virus and researching possible treatments.

More recently, the company ICEYE has been using LUMI’s computational capability to analyse radar data from its microsatellite system in real time and convert it into images of the scanned terrain, allowing it to detect fires, floods and other environmental disasters regardless of weather conditions.

Destination Earth Climate Adaptation Digital Twin is a particularly relevant use case that is already running in the supercomputing centre. Basically, it is a new type of climate information system that can be used to assess climate change impacts and adaptation strategies at local and regional scales over several decades. It is a digital twin of the Earth in which all kinds of circumstances and artefacts are simulated and analysed with unprecedented resolution, making it possible to anticipate almost any kind of natural catastrophe.

Other LUMI use cases that we learned about during the visit are closely related to artificial intelligence, such as the development of a large, open language model for the scientific community, called OLMo. Because it is an open model, scientists from anywhere in the world can collaborate on it and tap the potential of a language model that already has 70 billion parameters and whose first version was released earlier this year.

The role of the Finnish CSC

As mentioned above, the Finnish CSC is the organisation responsible for the maintenance, cooling and upgrades of the LUMI supercomputer, but it also serves as a facilitator of scientific research and any other use that may be made of it. As a scientific institution, the CSC aims to facilitate its use by researchers, academic institutions and companies in the above-mentioned countries.

In this way, the CSC ensures that LUMI is available for projects of various kinds, such as those mentioned above, and other disciplines requiring high-performance computing capabilities.

Access is granted through competitive applications, in which the most promising projects that can benefit from the use of the supercomputer are selected.

Furthermore, the CSC’s role includes providing technical support to researchers, helping them to make the most of LUMI’s computing power.

AMD’s open technology at the heart of LUMI

Over the past two or three years, especially since ChatGPT burst into our lives as the most popular generative AI system, we’ve been talking and writing about artificial intelligence. The entire IT ecosystem, from manufacturers to independent software developers to integrators and distributors, has jumped on this bandwagon without hesitation and, as a result, technology is developing around it that is changing the lives of millions of people in the same way that smartphones did at the end of the first decade of the 21st century.

But artificial intelligence is not new. It has been in the works for several decades. What has changed is that these complex algorithms and large language models, which require huge amounts of computational capacity, can now be executed thanks to the cloud and to supercomputing centres like the one we are discussing in this article.

In all these systems capable of processing AI based on huge amounts of information and algorithms, the common denominator is the graphics subsystem, the GPUs, together with the component that efficiently manages all these processes and handles the input and output requests typical of generative AI: the CPUs.

While NVIDIA has become an industry benchmark for the capabilities of its GPUs in processing these types of AI workloads, AMD is not lagging behind in this particular competition.

AMD’s strengths include the ability to design and deliver both GPUs and CPUs, its efficiency per watt consumed, and its strategy to develop and support an open software ecosystem through AMD ROCm, which allows developers to optimise AI and HPC workloads on AMD GPUs.

One of AMD’s bets was to create an open software ecosystem, ROCm.

Alexander Troshin, AMD’s EMEA HPC and Enterprise Product Marketing Manager, told us during our visit to the centre: ‘In the areas of AI and High Performance Computing you need to think of all the elements as a whole and not separately. Developing GPUs and CPUs and using an open ecosystem to get the most performance and efficiency out of them is critical to succeed in these complex projects.’

This AMD strategy greatly facilitates versatile and efficient AI implementations at both the training and inference levels, the two main tasks in these workloads. It is not only AMD that says so: in the TOP500 list itself, two of the five most powerful supercomputers worldwide are built with AMD technology. The fifth is LUMI, as we have emphasised above, while the first on this list, the Frontier system, was the first to break the ExaFLOPS barrier (1.2 EFlop/s), a milestone in the history of supercomputing.

Alexander Troshin, director of EMEA HPC and Enterprise Product Marketing at AMD, showing the latest generation of AMD Instinct MI325X GPUs.

This is just the beginning. Over the coming months, we will see systems that break new ground by combining 5th generation AMD EPYC CPUs, AMD Instinct MI350 GPUs (expected in 2025) and UALink and Ultra Ethernet connectivity for HPC environments.