In October 2023, El Capitan’s doors were installed for an Open House to show visitors what the supercomputer will look like when fully installed. Photo by Garry McLeod/LLNL.

Rows of tall black cabinets, webbed with multi-colored wires and hoses, lined the computing floor. 

A clear pane separated us. That and another security door, warning of the noise beyond.

Once opened, the whirring was apparent.

It was mid-January and the second floor of LLNL’s Computing Center, Building 453, was alive with noise.

Contributing to the sound was El Capitan, LLNL’s forthcoming supercomputer, which was undergoing some equipment testing amid functioning systems on the floor. 

El Capitan is expected to be the world’s fastest supercomputer, or high-performance computing system, when it becomes operational this year in late summer or early fall.

The system will enable researchers from the National Nuclear Security Administration weapons design laboratories to create models and run simulations, previously considered challenging, time-intensive or impossible, for the maintenance and modernization of the United States’ nuclear weapons stockpile.

The nuclear security research is necessary to ensure the stockpile is safe and reliable, ultimately to maintain nuclear deterrence in the absence of underground nuclear testing, according to lab officials.

In 2019, the U.S. Department of Energy’s NNSA signed a $600 million contract with Cray Inc. (acquired by Hewlett Packard Enterprise), a designer, manufacturer and servicer of supercomputers, to create the NNSA’s first exascale supercomputer (the DOE’s third exascale-class supercomputer): El Capitan.

Delivery of El Capitan’s components began in summer 2023. Seen here are some of the supercomputer’s first cabinets to be installed, 2023. Photo by Garry McLeod/LLNL.

The supercomputer is projected to operate at over 2 exaFLOPS – more than 2 quintillion (2 × 10^18) floating point operations per second – at peak performance, according to Jeremy Thomas, public information officer at LLNL. This means it’s anticipated to perform calculations, like addition and multiplication, at a rate of more than 2 billion billion per second.

To put this speed in perspective, an exascale machine operates about 1 million times faster (or more) than the average home system, according to Thomas. Compared with other supercomputers, it’s expected to operate 10-15 times faster than LLNL’s fastest supercomputer, Sierra, which runs at 125 petaFLOPS (125 quadrillion floating point operations per second) at peak performance, he said.
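The unit arithmetic behind these comparisons can be checked in a few lines of Python. This is only a sketch using the peak figures quoted above; the variable names are illustrative, and the "home system" speed is simply what the 1-million-times comparison implies for a machine running at exactly 1 exaFLOPS, not a measured value.

```python
# Unit prefixes for floating point operations per second (FLOPS).
PETA = 10**15  # one quadrillion
EXA = 10**18   # one quintillion

# Peak figures as quoted in the article ("over 2 exaFLOPS" is taken as exactly 2).
el_capitan_peak = 2 * EXA
sierra_peak = 125 * PETA

# Ratio of the two quoted peak figures.
peak_ratio = el_capitan_peak / sierra_peak

# Assumption: speed of a home system that is 1 million times slower
# than a machine running at exactly 1 exaFLOPS.
implied_home_system = EXA / 1_000_000

print(f"El Capitan peak / Sierra peak: {peak_ratio:.0f}x")
print(f"Implied home system: {implied_home_system / 10**12:.0f} teraFLOPS")
```

On quoted peak numbers alone the ratio comes out to 16, slightly above the 10-15 times range Thomas gave, which may reflect a different basis of comparison than theoretical peak.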

Scientists from the NNSA weapons design laboratories – LLNL, Los Alamos National Laboratory and Sandia National Laboratories – will use El Capitan’s power to design weapons, parts and delivery systems to meet the changing demands of nuclear deterrence, according to Rob Neely, head of the advanced simulation and computing program at LLNL. Scientists will also be able to run simulations and calculations on existing weapons that are aging beyond their intended lifespan, to ensure their safety and effectiveness.

The supercomputer’s purpose aligns with the NNSA’s Stockpile Stewardship and Management Program, which aims to maintain and modernize the nuclear stockpile, according to the NNSA’s website. 

El Capitan will enable scientists to combine high-resolution (accurate) 3D modeling with ensemble calculations (many runs of the same calculation with slight variations, which allow researchers to understand a simulation’s sensitivity to uncertainties like environmental conditions and modeling errors), according to Neely.
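The idea of an ensemble calculation can be illustrated with a toy example in Python. Everything here is hypothetical: `toy_simulation` stands in for an expensive physics code, and the input values are made up. The point is only the pattern of rerunning the same model with slightly varied inputs and looking at how much the output spreads.

```python
import random
import statistics

def toy_simulation(temperature: float) -> float:
    """Hypothetical stand-in for an expensive simulation: one input, one output."""
    return 0.5 * temperature ** 2

def run_ensemble(nominal: float, spread: float, members: int, seed: int = 0) -> list[float]:
    """Run the model once per ensemble member, each with a slightly perturbed input."""
    rng = random.Random(seed)  # fixed seed so the ensemble is reproducible
    return [toy_simulation(rng.gauss(nominal, spread)) for _ in range(members)]

# Made-up nominal condition and uncertainty, for illustration only.
results = run_ensemble(nominal=300.0, spread=3.0, members=1000)

mean = statistics.mean(results)
stdev = statistics.stdev(results)
print(f"mean output: {mean:.1f}")
print(f"spread (standard deviation): {stdev:.1f}")
```

A large spread relative to the mean would indicate that the result is sensitive to the varied input. On a real system each ensemble member would be a full simulation, which is why running many of them at high resolution demands so much computing power.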

“Previous machines have allowed us to explore one or maybe two of those axes at a time, which has been hugely valuable and helps us continue to advance the science. But El Capitan is where we plan to bring it all together for the first time – the culmination of 30 years of effort,” wrote Neely.

El Capitan is expected to reach unrivaled speed thanks to its AMD MI300A APUs (accelerated processing units) and high-speed network connections.

AMD’s MI300A APU architecture includes coupled GPUs (graphics processing units) and CPUs (central processing units) which share memory. The memory-sharing eliminates the need to transfer or copy data between processors, expediting the system’s processing, according to Neely. 

These APUs are paired on compute nodes, which simultaneously run different parts of a job. A high-performance network, composed of miles of cabling, connects these nodes, allowing them to communicate at high speeds, according to Neely.

A contractor works on a rack of El Capitan. The racks will hold compute blades, which contain the AMD MI300A APUs, Oct. 2023. Photo by Garry McLeod/LLNL.

El Capitan’s GPU-heavy architecture will make it prime for exploring AI techniques, though it was not originally designed to do so, Neely said. 

“AI is this new and emerging approach to computers where the computers actually learn, not from a human sitting down and telling it exactly what to do, but instead by giving it lots and lots of data or examples and it using these complex networks to learn how to do something,” Neely said. 

He expects initial research on El Capitan to follow a hybrid approach: using AI techniques to analyze the results of a traditional simulation and using machine learning (a type of AI) to improve models, in a combination termed cognitive simulation. Over time, he expects to see the workload increasingly take advantage of AI as its reliability is demonstrated.

Getting this far has been a haul, including a $100 million Exascale Computing Facility Modernization project at LLNL to increase energy and water supply to the computing center.

When El Capitan becomes operational, it will be in an open research phase for assembly, testing and work (like fusion simulation) for five to six months before becoming a classified system, according to Neely.

Until then, El Capitan awaits delivery of compute blades (which contain the AMD MI300A APUs and memory); installation of the remaining equipment; testing of hardware, software and the system as a whole; and acceptance of the final system, according to Terri Quinn, deputy associate director for high performance computing at LLNL.

The mid-January tour made clear how close the supercomputer is to coming to life. From the rush of fluid traveling through the supercomputer’s cooling system to the network cables strung through its cabinets, El Capitan was well on its way to deployment.

LLNL officials said that after El Capitan deploys later this year and undergoes assurance testing, dignitaries, local elected officials and high-level representatives from DOE/NNSA, HPE, AMD and others will celebrate the supercomputer in a dedication and ribbon-cutting event.

El Capitan, expected to be the world’s fastest supercomputer, will be housed at LLNL’s Computing Center, Building 453, Jan. 2024. Photo by Jude Strzemp.
