Computational Chemistry

What exascale computing could mean for chemistry

A new generation of supercomputers will perform more than a quintillion calculations per second. With that computing power, chemists could run faster simulations of bigger molecular systems over longer time frames

by Ariana Remmel
September 2, 2022 | A version of this story appeared in Volume 100, Issue 31
A view looking at one corner of the Frontier supercomputer. The machine's black cabinets recede into the background in a bright, white room. The backs of these cabinets have been removed to show red and blue hoses.

Credit: Oak Ridge Leadership Computing Facility at ORNL | The Frontier supercomputer at Oak Ridge National Laboratory is the first of a new generation of machines that will help chemists take on more complex molecular simulations than ever before.

 

In brief

Computational chemists are getting ready to run experiments on the fastest supercomputers ever built. This year, Oak Ridge National Laboratory debuted a supercomputer called Frontier that is the first to officially calculate more than a million trillion floating-point operations per second, breaking what is known as the exascale computing barrier. The outsize computing power of Frontier and other exascale machines soon to become available could help chemists simulate molecular models with more atoms, with greater complexity, and on longer timescales than ever before. This capability will push scientists to provide more experimental data to validate the computer’s findings, which could reshape our understanding of chemical theory.

Surrounded by lush forests, Oak Ridge National Laboratory sits in a secluded mountain valley in eastern Tennessee. Visitors often travel to the federal research complex to tour historic sites where scientists made some of the first forays into nuclear physics and fission reactions in the 1940s. Now a windowed corner of the Oak Ridge Leadership Computing Facility lets the laboratory’s visitors peer into a new scientific wonder: a bright, white room filled with rows of sleek, black cabinets that each hold more than 3,600 kg of high-performance computing hardware. The banks loom over the engineers who weave between the rows as they work. The computer’s name is written in prominent letters across the cabinets. This is Frontier, the most powerful supercomputer ever tested and the first of a new generation of machines.

Some of the latest processors available in a desktop computer can perform more than 100 billion floating​-point operations per second, a unit of computational power called flops. This gigascale computing power allows our personal devices to crunch numbers in spreadsheets, keep track of the ever-growing number of tabs in our web browsers, and render hundreds of thousands of polygons in 3D molecular models. Frontier is the first supercomputer to officially break the exaflops computing barrier, performing more than 1 million trillion operations per second.
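To get a feel for that gap, here is a rough, back-of-the-envelope comparison of the two figures quoted above; a sketch for intuition, not an official benchmark.

```python
# Rough scale comparison using the figures quoted in this story.
desktop_flops = 100e9    # ~100 billion flops: a fast desktop processor (gigascale)
exascale_flops = 1e18    # 1 exaflops: "1 million trillion operations per second"

# How many desktop processors' worth of arithmetic an exascale machine does each second
ratio = exascale_flops / desktop_flops
print(f"Exascale ~ {ratio:,.0f} desktop processors")   # roughly 10,000,000
```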

With this leap in computing power, Frontier and other exascale computers in production around the world could enable unprecedented breakthroughs in many scientific fields. For example, these machines could allow researchers to study turbulence in aeronautics, the expansion of the universe, and the secret lives of nuclear particles. “It is an instrument that has more in common with the James Webb Space Telescope and the Large Hadron Collider than it does with the [computer] that’s on your desk,” says Bronson Messer, the director of science for the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory (ORNL).

Exascale computers like Frontier also signal a new era for chemistry, says Theresa Windus, a computational chemist at Iowa State University and Ames National Laboratory. When Windus first started in computational chemistry, state-of-the-art simulations could perform calculations on only a subset of interactions between small molecules composed of maybe 10 heavy atoms in total. In the last few decades, it has become almost routine for supercomputers to model chemical systems with tens of thousands of atoms at a time.

It is an instrument that has more in common with the James Webb Space Telescope and the Large Hadron Collider than it does with the [computer] that’s on your desk.
Bronson Messer, director of science, Oak Ridge Leadership Computing Facility

Exascale computers will allow chemists to simulate even bigger molecular systems and over longer timescales. The data collected from those models could help researchers invent novel fuel sources and design new climate-resilient materials while providing scientists with new insights into chemical theory. “We’re at a point where we can start really asking questions about what is it that’s missing in our theoretical methods or models that would get us closer to what an experiment is telling us is real?” Windus says. Exascale supercomputers could push the boundaries of what’s possible in chemistry by narrowing the gap between the reactions in a flask and the virtual simulations used to model them. “It’s an amazing time to be a chemist,” Windus says.

Breaking the exascale barrier

The bright, white room that houses Frontier is divided roughly into fourths. The looming supercomputer cabinets take up over 680 m²—greater than the size of a basketball court—in one quadrant, while the adjacent quadrant is filled with a second bank of cabinets that will store around 700 petabytes of data. This file storage system will feed data into the supercomputer’s calculations. The third quadrant is a miniature warehouse of chrome shelves stacked with boxes holding the specialty hardware that makes Frontier run. “It is a machine that is full of more than a handful of technologies that are first of its kind,” Messer says.

Many of the computer’s components are bespoke to its needs, including the graphics processing units (GPUs) that account for much of Frontier’s computing speed. GPUs, which also power the graphics cards in desktop machines and high-end gaming systems, can run many calculations in parallel; in contrast, a traditional central processing unit (CPU) works in series. Many modern supercomputers use a combination of both GPU and CPU components to help them tackle hefty calculations efficiently. Each of Frontier’s 74 cabinets is packed with 128 nodes, each comprising one CPU and four GPUs.
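The serial-versus-parallel distinction can be sketched in a few lines. The example below is a loose analogy in Python/NumPy, not how codes are actually written for Frontier's GPUs; it simply contrasts handling data one element at a time with expressing one operation over many elements at once.

```python
import numpy as np

# Schematic analogy only: a CPU-style serial loop versus a GPU-style
# data-parallel operation. Real GPU programming uses dedicated frameworks,
# but the shape of the work is the same: one operation applied to many
# data elements simultaneously.
forces = np.random.rand(1_000_000)

# Serial: one element at a time, in order
scaled_serial = [f * 0.5 for f in forces]

# Data-parallel: the same multiplication expressed as a single vectorized
# operation that the hardware can spread across many processing units
scaled_parallel = forces * 0.5
```

The vectorized form is the kind of work that maps naturally onto the many arithmetic units packed inside a GPU.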

All this powerful hardware helped Frontier earn first place on the June 2022 edition of the Top500 list, the official ranking of high-performance computers. The list is based on a benchmark that measures how quickly a machine can solve a standardized set of linear equations. Frontier reached a maximum performance of 1.1 exaflops, with a theoretical peak closer to 1.7 exaflops. That’s approximately three times as fast as the Top500’s second-place machine, the supercomputer Fugaku, which was completed in 2021 in Japan.
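The Top500 benchmark is the High-Performance Linpack (HPL) test, which times the solution of a very large dense system of linear equations. A toy, single-machine version of the same idea looks like this; it is a sketch for intuition, not the actual benchmark code.

```python
import time
import numpy as np

# Toy version of the idea behind the Top500 benchmark: time the solution of a
# dense linear system Ax = b and estimate flops from the operation count.
# Solving an n x n system costs roughly (2/3) * n**3 floating-point operations.
n = 2000
A = np.random.rand(n, n)
b = np.random.rand(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)          # solve the linear system
elapsed = time.perf_counter() - start

flops = (2 / 3) * n**3 / elapsed
print(f"~{flops / 1e9:.1f} gigaflops on this machine")
```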


How fast is exascale computing?
A new generation of supercomputers will break the exascale computing barrier and perform more than 1 quintillion floating-point operations per second (flops). Here is how that computing power compares with that of other machines.
An infographic comparing the speed of five computers: Frontier in first place on the Top500 list, Summit in fourth place, a computer at the bottom of the list, an Xbox Series X, and a twelfth-generation Intel Core processor in a fast desktop computer. If every person on Earth could perform one calculation per second, it would take them less than a minute to do what the Intel processor could do, but around four years to do the same as an exascale computer like Frontier.
Sources: Top500, Intel, XBox.com.

# | Machine | Computing speed (teraflops) | If every person on Earth performed 1 calculation per second, it would take about this long to do what this computer can do in 1 sᵃ
1 | Frontier and upcoming exascale supercomputers | More than 1 million | 4 years
2 | Summit, 4th place on the Top500 listᵇ | 200,000 | 10 months
3 | Supercomputers at the bottom of the Top500 list | 2,000 | 3 days
4 | Xbox Series X | 12 | 26 min
5 | The most recent generation of Intel processors in personal computersᶜ | 0.1 | <1 min

a. Calculated by taking the number of operations the machine can perform in 1 s and dividing by the approximate number of people on Earth (7.95 billion). The resulting time in seconds was then converted to minutes, days, months, or years.
b. The Top500 list is the official ranking of the most powerful supercomputers in the world.
c. A personal computer running on a 12th-generation Intel Core processor.
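The arithmetic behind the table's last column (note a) is straightforward to reproduce; here is a quick sketch using the table's own numbers.

```python
# Reproduce the table's last column: how long it would take everyone on Earth,
# each doing 1 calculation per second, to match 1 second of each machine's work.
population = 7.95e9   # approximate world population used in the table's footnote

machines_teraflops = {
    "Frontier and upcoming exascale supercomputers": 1e6,   # "more than 1 million"
    "Summit": 200_000,
    "Bottom of the Top500 list": 2_000,
    "Xbox Series X": 12,
    "Recent desktop processor": 0.1,
}

for name, tflops in machines_teraflops.items():
    seconds = tflops * 1e12 / population          # seconds of everyone-on-Earth effort
    years = seconds / (365 * 24 * 3600)
    print(f"{name}: {seconds:,.0f} s (~{years:.2g} years)")
```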

ORNL visitors can take a short walk down the hall from Frontier to check out Summit, the computer currently in fourth place on the Top500 list. This hybrid GPU-and-CPU supercomputer has a theoretical peak performance of 200 petaflops, or 200 quadrillion operations per second. Summit took first place on the Top500 list when it debuted in 2018, just 10 years after the first supercomputer broke the petascale barrier. At the start of the COVID-19 pandemic, Summit allowed scientists to visualize the 305 million atoms in the now-iconic models of the whole SARS-CoV-2 virus. A research team led by Rommie Amaro at the University of California San Diego and Arvind Ramanathan at Argonne National Laboratory received the Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research in 2020 for that work (Int. J. High Perform. Comput. Appl. 2021, DOI: 10.1177/10943420211006452).

A computer simulation of the Delta SARS-CoV-2 virion. It looks like a rough, purple ball with gray and yellow splotches. About a dozen spike proteins stick out from the ball, somewhat evenly spaced. These are tapered toward the attachment point to the ball.
Credit: Lorenzo Casalino, Abigail Dommer (Amaro Lab, UC San Diego)
Researchers in Rommie Amaro’s lab used Nanoscale Molecular Dynamics code on Oak Ridge National Laboratory's Summit supercomputer to simulate a molecular representation of the Delta variant SARS-CoV-2 viral particle. The model contains 305 million atoms that comprise the viral membrane (purple), membrane proteins (gray and yellow), and spike proteins (blue).

By current estimates, Frontier’s computing power should be 5–10 times that of Summit, Messer says. Yet because of a new, less-energy-demanding cooling system, Frontier’s energy needs do not scale with that added computing power. Red and blue hoses weave their way through each cabinet like veins pumping about 22,700 L of water per minute through nearly every component in the machine. Warm water enters the cabinets at about 29 °C and absorbs heat that the computer’s hardware produces as it crunches numbers. By the time the water flows away from the machine, its temperature has risen above 38 °C. The water gets chilled via evaporative cooling before reentering the cabinets. The new warm-water cooling system helped earn Frontier a top place in the Green500 list, a companion to the Top500 list that ranks supercomputers according to energy efficiency.
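Those cooling figures imply that the machine sheds heat at an enormous rate. Here is a back-of-the-envelope estimate based on the numbers in this story, assuming water's standard heat capacity of about 4.2 kJ/(kg·°C).

```python
# Back-of-the-envelope estimate of the heat Frontier's cooling water carries away,
# using the flow rate and approximate temperatures quoted in this story.
flow_l_per_min = 22_700
delta_t = 38 - 29            # °C rise across the machine (approximate)
c_water = 4186               # J/(kg*°C), specific heat of water (assumed)
density = 1.0                # kg/L, approximate density of water

mass_flow = flow_l_per_min * density / 60            # kg/s
heat_watts = mass_flow * c_water * delta_t           # Q = m_dot * c * dT
print(f"~{heat_watts / 1e6:.0f} MW of heat removed")  # on the order of 14 MW
```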

More exascale machines are expected to join Frontier at the top of the Top500 list in the coming years. In the US, the supercomputers Aurora at Argonne National Laboratory and El Capitan at Lawrence Livermore National Laboratory are expected to become available in 2023. All three of these US supercomputers were developed through an initiative within the US Department of Energy (DOE) called the Exascale Computing Project. Meanwhile, researchers in China have already begun installing multiple exascale machines that have yet to be evaluated by the Top500 list, according to a report by HPC Wire. In June, the European High Performance Computing Joint Undertaking announced that a supercomputer facility in Germany had been chosen to host the European Union’s first exascale machine. It’s the exascale future that chemical scientists have been awaiting and planning for.

Bigger, longer, faster

Bronson Messer points to red and blue hoses in an open cabinet of Frontier.
Credit: Ariana Remmel
Bronson Messer (shown) describes Frontier's warm-water cooling system as a key innovation that allows the supercomputer to excel in both computing power and energy efficiency.

Because Frontier and these other supercomputers can push past the exascale barrier, chemists will be able to run bigger, longer simulations of molecular systems at record speeds. Data from these models could help researchers analyze features of chemical systems they haven’t seen before. In addition to leading the way on building these record-breaking machines, the Exascale Computing Project and supercomputer user facilities have been working with computational chemists to ensure that many of the codes used to simulate molecular systems are ready to harness the power of exascale computing from the get-go.

Windus leads one such project. She and her team are working on a program called NWChemEx that simulates large chemical systems used in biofuel research (Chem. Rev. 2021, DOI: 10.1021/acs.chemrev.0c00998). The researchers have been rewriting an earlier version from scratch so that it can take full advantage of exascale hardware to create models of complex chemical environments, such as systems that contain multiple phases of matter. “This enables us to look at really detailed kinetics—for example, of catalysts working inside of different zeolites or metal-organic frameworks,” Windus says. With funding from the US Department of Energy, Windus and the NWChemEx team will begin their exascale journey by studying the transformation of propanol to propene through a zeolite material perfused with solvent.

The Nanoscale Molecular Dynamics (NAMD) code is another one of the first programs that will run on Frontier. NAMD works by defining the position of a set of particles in space, such as the atoms within a protein, and calculating the interactions and forces between them. Through these calculations, the program can determine how the particles might move over a small interval of time and then repeat the calculations according to the new positions. It’s a powerful tool for capturing how large molecules move in biological systems, says Emad Tajkhorshid, a biochemist and biophysicist at the University of Illinois Urbana-Champaign who leads the NAMD team.
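The loop Tajkhorshid describes, stripped to its bones, looks something like the sketch below. This is not NAMD's code; a toy force stands in for real molecular force fields and the units are arbitrary, but the compute-forces, advance-positions, repeat structure is the same.

```python
import numpy as np

# Skeleton of a molecular dynamics loop: compute forces from positions,
# advance the system a small step in time, and repeat. NAMD does this with
# detailed force fields for millions of atoms; here a toy harmonic force
# and arbitrary units keep the structure of the loop visible.
n_atoms = 100
dt = 0.001                                  # time step (arbitrary units)
positions = np.random.rand(n_atoms, 3)
velocities = np.zeros((n_atoms, 3))

def compute_forces(pos):
    # Toy force: a spring pulling each atom back toward the origin.
    return -1.0 * pos

for step in range(10_000):                  # each pass advances time by dt
    forces = compute_forces(positions)      # interactions between particles
    velocities += forces * dt               # update velocities (unit masses)
    positions += velocities * dt            # move atoms a tiny distance
```

Production codes use more careful integrators, and nearly all their time goes into the force calculation, which is where the enormous computational cost comes from.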

But these systems tend to be quite large and complicated, Tajkhorshid says. “You’re dealing with hundreds of thousands or millions or billions of atoms, even if you’re looking at a very small fraction of a cell,” he says. “And then between all of those atoms, you have to recalculate all these forces and calculate how they move and with very small time steps,” he says. As a result, these simulations require a whopping number of time-consuming calculations to see even a few nanoseconds of molecular movement. Meanwhile, most biologically interesting interactions between molecules occur on millisecond timescales or longer.
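The size of that gap is easy to put in numbers. Assuming a time step of about 2 femtoseconds, a common choice in biomolecular simulations but an illustrative assumption here since the story does not give a figure:

```python
# Steps needed to reach various timescales, assuming a 2-femtosecond time step
# (a typical value in biomolecular simulation; assumed here for illustration).
dt = 2e-15                         # seconds per simulation step

print(f"{1e-9 / dt:.0e} steps to reach 1 nanosecond")    # 5e+05
print(f"{1e-3 / dt:.0e} steps to reach 1 millisecond")   # 5e+11
```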

“There’s a huge difference there, and that’s what we’re trying to overcome,” says David Hardy, a computer scientist and lead developer for NAMD at the University of Illinois Urbana-Champaign. NAMD was a crucial program in simulations of SARS-CoV-2, and Hardy was among the researchers recognized in 2020 for bringing the model to life. Now he and his colleagues hope that exascale computers could allow researchers to use NAMD to not just simulate larger biological systems but also model them over longer times with enough detail to match what scientists can observe in the lab.

Exascale computers’ GPU resources also could lead to breakthroughs for researchers looking to apply artificial intelligence to chemical problems. Because GPUs can run computational tasks in parallel, they are well suited for machine learning, says Victor Fung, a computational scientist at the Georgia Institute of Technology.

Machine learning has already made impressive strides outside chemistry. Fung points to the popularity of AI algorithms, such as Craiyon and Dall-E 2, that can generate elaborate images from users’ text prompts. People type in what they want an image to include, and the AI usually delivers. These programs are trained on hundreds of millions of labeled images and language-learning models that require a massive amount of storage and computing power to develop. Fung says that if chemists want to produce AI programs with comparable capabilities, they will need to train the systems on similarly massive data sets that are detailed enough to capture the complexity of real materials. With the power of exascale computers, scientists like Fung hope to one day ask a chemically savvy AI to suggest the best inhibitor for an enzyme or a carbon-coupling catalyst that works at a specific temperature and have the algorithm propose a selection of structures that could fit that description.

The added power of exascale machines could also uncover previously hidden complexities in reactions. ORNL researchers Ada Sedova, Vyacheslav Bryantsev, and their colleagues have been interested in studying molten salts for the chemical separation of fission products from nuclear fuels, such as uranium dioxide. These salts have melting points that can exceed 1,000 °C and have historically been used to generate energy in nuclear fission reactions. When these salts turn to liquids, the ions form myriad fleeting coordination complexes that are tricky to measure using standard spectroscopic techniques, Sedova says. Sedova and Bryantsev say computer simulations can help chemists glean new details about the material’s properties with methods that complement data obtained from challenging and costly lab experiments, but current computational models are still too simplistic. Exascale computers will make it possible to create more complete models of molten salts to better interpret these spectra, they say.

David Bross, a chemist at Argonne National Laboratory, thinks that having more complete models of chemical systems also could help researchers design more efficient catalysts. Up to this point, limited computing power has restricted the number and kinds of variables that chemists can calculate when simulating reaction kinetics and thermodynamics. “But if you can calculate everything, you may discover side reactions that are important. You may discover things that you weren’t able to see on the first pass,” he says. His team is developing new workflows so that the software available for exascale computer users helps researchers get a more complete picture of reactions. These insights could help researchers home in on what factors matter most for designing catalysts, which could also highlight gaps in theory where more experimental data are needed, he says.

Challenges ahead

Frontier’s first users will likely be scientists who already have experience working on petascale supercomputers. Some programs will be ready to run on exascale computers on day 1, but there could be a steep learning curve for scientists running brand-new programs on first-of-their-kind machines.

Also, chemists will be competing with scientists from other fields to run experiments on Frontier, and later Aurora at Argonne National Laboratory and El Capitan at Lawrence Livermore National Laboratory. Anyone in the world can submit a proposal to DOE programs to conduct research on these exascale machines, but priority will likely go to projects that can demonstrate the need for exascale capabilities, as opposed to Summit or other petascale systems. Despite the hardware differences between the three DOE computers, the decision of which system to use may ultimately come down to which one has the shortest queue, Hardy says.

If we don’t have a direct connection to experiment, then we’re just playing very large video games.
Bronson Messer, director of science, Oak Ridge Leadership Computing Facility

Even with these machines’ impressive computing power, the simulations and models that run on exascale computers won’t be useful without experimental data to help validate and refine the computational data. “If we don’t have a direct connection to experiment, then we’re just playing very large video games,” Messer says. All those experimental data will also need to be stored in formats that can be accessed by these machines, which will themselves produce massive data sets. “At some point, it becomes impractical for people to download the data, analyze them, visualize them, and save them somewhere,” Tajkhorshid says. The NAMD team has been thinking about ways that scientists can visualize molecular dynamics simulations generated on supercomputers remotely from their office computers. These researchers also want to find more efficient ways to compress data for storage while researchers work on follow-up experiments and finish writing manuscripts on the data.

But most importantly, exascale computers simply cannot replace human chemical intuition. Heeding the proverb “garbage in, garbage out” becomes even more important when considering the amount of data that scientists working on these systems will have to sift through, as well as how little time they may have to work on these machines that could be in high demand. Chemists may need to be more thoughtful about how they design their computational experiments to avoid going down virtual rabbit holes that yield unreproducible results with no bearing on reality. Avoiding that mismatch requires combining computational theory, data from experiments, and the chemical know-how to spot the gaps.

Building Frontier, especially given the logistic challenges of the COVID-19 pandemic, was a feat unto itself, Messer says. But Frontier’s success will come when it is a crucial part of a breakthrough discovery, he says. “There has to be science outcomes that cannot have been achieved without us actually having the power of this computer,” Messer says. As he walks among the rows of cabinets, Messer speaks excitedly about what researchers might use the computer to do, capabilities that may emerge unexpectedly and unpredictably from projects that will be conceived only after the technology becomes available.

Just past the edge of the instrument, Messer points out a fourth quadrant of the room: a patch of white floor that sits obviously empty. This, he says, is where Frontier’s successor will go. Frontier is remarkable as it is, but its designers are already thinking of ways to improve it. Proposals for the next most powerful computer in the world are in the works.
