What is high-performance computing (HPC)?

(Image: server racks forming a supercomputer inside a data center. Credit: Getty Images)

IT today is all about interconnectivity, and that comes from a long history of combining computing resources in processing, networking, or storage. High-performance computing (HPC) takes that concept and supercharges it, aggregating computing power to achieve the maximum possible output.

Solving the world's big computational problems takes a lot of grunt, and HPC connects elements together into a parallel processing grid that can solve them more quickly. With the ability to run trillions of calculations per second, the cloud computing and AI age makes the perfect environment for HPC to flourish, providing results that are impractical or simply impossible with traditional computing.

"Generally, HPCs are huge computers in a large data center, but sometimes cheaper versions have been cobbled together out of repurposed gaming systems or even a network of distributed systems using consumer devices," says Eleanor Watson, IEEE member, AI ethics engineer, and AI Faculty at Singularity University.

It works by pitting specialized algorithms against a problem, breaking a dataset or processing task into smaller units dynamically, and sending them throughout an HPC network where processors can work on discrete parts of the overall problem. This is parallel computing, in which the constituent parts are worked out simultaneously and then put together at the end for a complete answer.
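To make that pattern concrete, here's a minimal sketch in Python (not any particular HPC framework): it splits a dataset into chunks, lets separate worker processes handle them simultaneously, and combines the partial answers at the end. The chunk count and the squaring workload are arbitrary choices for illustration.

```python
# A toy illustration of the divide, compute, and reassemble pattern behind
# parallel computing. On a real HPC system the "workers" are separate nodes
# linked by a fast network; here they are just local processes.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Work on one discrete piece of the overall problem."""
    return sum(x * x for x in chunk)

def split(data, n_chunks):
    """Break the dataset into roughly equal units."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = split(data, n_chunks=8)

    # Each worker processes its chunk simultaneously...
    with ProcessPoolExecutor(max_workers=8) as pool:
        partials = list(pool.map(partial_sum, chunks))

    # ...and the partial answers are put together at the end.
    print(sum(partials))
```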

But it's about much more than just processor clusters. HPC also depends on the fastest networking technologies available, so data transport is as fast and efficient as the processing. And the sheer size of the datasets being broken into pieces and worked on means exponentially larger memory and storage needs across the architecture.

What are the use cases for HPC?

Imagine trying to keep track of the human population using a pad and pen – you couldn't write down changes as fast as they happen, let alone compute results or metrics.

The only possible way to compute such large and rapidly changing measures is to create a digital model, and certain digital models are so large they're too complex even for standalone computers or servers. That's where HPC steps in.

As Eleanor Watson puts it: "HPC allows scientists and researchers to tackle computationally intensive problems that are impractical or impossible to solve on traditional desktop computers."

Intensive problems often need simulations or analysis so we can isolate the elements that make an impact and measure how much of an impact they make. Assumptions or guesswork don't cut it, and calculations on even sophisticated workstations can't match the level of detail and speed achievable through HPC. Leveraging HPC at an enterprise level has delivered better outcomes, higher efficiency, and higher profits everywhere from aerospace and automotive to pharmaceuticals, urban planning, and beyond.

In one example, HPC was pivotal when medical science was racing to find a vaccine for COVID-19. Supercomputers were used to predict the spread, model the structure of the virus, and develop the vaccines that had the best chance of working.

The Oak Ridge National Laboratory's Summit supercomputer played a critical role in identifying which treatment compounds might work by screening billions of virus molecule models.

But COVID modeling isn't the first example of HPC in medicine. Storing the genomic profiles of thousands of people lets HPC systems compare them and identify genetic markers for diseases, helping scientists design personalized treatments.

Google DeepMind has used HPC since 2018 in a solution called AlphaFold, which predicts protein structures. Understanding the way proteins fold gives researchers more information than ever before for designing drugs our bodies will respond to.

Modeling climate change accurately isn't just one of the most complex problems we have, it's one of the most urgent. The European Centre for Medium-Range Weather Forecasts uses HPC to build simulations for better forecasting, giving scientists important data about climate change over time and as much warning as possible about severe weather events.

Beyond prediction modeling, HPC can combat climate change by helping design better wind turbines and solar panels, from the best configurations in which to build and situate them to more efficient energy storage and transport systems.

The building blocks of HPC

In basic terms, HPC does the same job as your desktop PC, albeit on a vastly larger scale and with components that aren't always physically connected.

The first piece is the processors, taking parts of a problem and doing the calculations in parallel; their work is communicated between them and reassembled by the architecture into a cohesive result.
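On a real cluster, that communication and reassembly is typically handled by a message-passing layer such as MPI. The hedged sketch below uses the mpi4py package, assuming an MPI runtime is installed and the script is launched with something like mpirun -n 4 python example.py; the file name and the simple summing workload are illustrative only.

```python
# A minimal sketch of how work is spread across processors and reassembled,
# using MPI via the mpi4py package. Assumes an MPI runtime is available and
# the script is launched with, e.g.: mpirun -n 4 python example.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's ID
size = comm.Get_size()      # total number of processes

# The root process splits the problem into one piece per processor.
if rank == 0:
    data = list(range(1_000_000))
    step = len(data) // size
    # Any leftover items are ignored here to keep the sketch simple.
    pieces = [data[i * step:(i + 1) * step] for i in range(size)]
else:
    pieces = None

# Each processor receives its piece and works on it in parallel...
my_piece = comm.scatter(pieces, root=0)
my_result = sum(my_piece)

# ...and the results are gathered back and reassembled on the root.
results = comm.gather(my_result, root=0)
if rank == 0:
    print(sum(results))
```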

Second is storage, where not just the application code but the incoming data needs to live. No single magnetic tape or solid-state drive (SSD) technology is big enough to handle HPC, so multiple storage media (disks, tape, flash, and so on) must be networked together.

That means the technologies inside each drive (the mechanisms that store and retrieve bits) as well as the networks that connect them together into a larger virtual system have to be fast enough to handle the volumes of data HPC demands.
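As a simplified, hypothetical illustration of that idea, the sketch below stripes a large file across several mount points in fixed-size blocks, roughly the way a parallel file system spreads data over many drives; the paths and block size are placeholder assumptions, not any real file system's layout.

```python
# Illustrative striping sketch: a dataset too big for any single device is cut
# into fixed-size blocks and written across several storage locations in
# round-robin fashion. The mount points below are placeholders.
from pathlib import Path

STRIPE_SIZE = 4 * 1024 * 1024          # 4 MB blocks
DEVICES = [Path("/mnt/stripe0"), Path("/mnt/stripe1"), Path("/mnt/stripe2")]

def stripe_file(source: Path) -> None:
    """Write successive blocks of `source` to alternating devices."""
    with source.open("rb") as f:
        index = 0
        while block := f.read(STRIPE_SIZE):
            target = DEVICES[index % len(DEVICES)] / f"{source.name}.{index:06d}"
            target.write_bytes(block)
            index += 1
```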

Keeping all of that fed, in turn, means bandwidth and transport protocols robust enough to parcel out and collect packets of information fast enough for processors to do their work. Large datasets are hard enough to deal with on their own, but where constant changes have to be taken into account in real time, the network has to perform at the highest level not just to ingest the initial dataset but to receive a steady stream of updates.

The applications that modeled the COVID-19 genome or that monitor solar weather activity are computationally huge in both resources and data footprint, and combined with the data they use they can rarely fit on a single disk or server.

That then leads to the final piece of the puzzle – the purpose-built systems that orchestrate everything. Taking into account the power needs, network availability, storage space and input/output architectures, there's a field of specialized applications that manage HPC deployments, directing data to where it needs to go and allocating resources in the most efficient way.
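As a rough sketch of the decision such orchestration software makes (and not a stand-in for a real workload manager such as Slurm or PBS), the toy scheduler below places each job on the first node with enough free CPUs and memory; the node sizes and job names are invented for illustration.

```python
# A toy illustration of what an HPC workload manager does: place each job on a
# node that has the CPU and memory it asks for. Real schedulers also weigh
# priorities, queues, power budgets, and network topology.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cpus: int
    free_mem_gb: int

@dataclass
class Job:
    name: str
    cpus: int
    mem_gb: int

def schedule(jobs: list[Job], nodes: list[Node]) -> dict[str, str]:
    """Greedily assign each job to the first node with enough free resources."""
    placement = {}
    for job in jobs:
        for node in nodes:
            if node.free_cpus >= job.cpus and node.free_mem_gb >= job.mem_gb:
                node.free_cpus -= job.cpus
                node.free_mem_gb -= job.mem_gb
                placement[job.name] = node.name
                break
        else:
            placement[job.name] = "queued"   # wait until resources free up
    return placement

nodes = [Node("node01", 64, 256), Node("node02", 64, 256)]
jobs = [Job("climate-sim", 48, 192), Job("genome-scan", 32, 128), Job("cfd-run", 16, 64)]
print(schedule(jobs, nodes))
```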

Advances in networking mean all the above can be completely distributed. A good example is the AI Research Resource (AIRR) from UK Research and Innovation, a £300m program to develop HPC for AI research.

"HPC systems are often arranged in clusters, with many individual computer nodes networked together," Watson says. "One example is AIRR, which will include supercomputers at the Universities of Cambridge and Bristol. The Cambridge-based 'Dawn' supercomputer will feature over 1000 high-end Intel GPUs, while the Bristol 'Isambard-AI' system will have over 5000 GPUs."

The biggest HPC providers

The global market for high-performance computing was worth a little over US$50bn in 2023 and is expected to exceed $100bn by 2032, per Fortune Business Insights.

The most active providers are, as you'd expect, the largest manufacturers in tech. With a background in software as a service (SaaS) and cloud computing, Hewlett Packard Enterprise is one of the top providers of supercomputer products like the HPE Cray XD2000 and HPE Cray EX2500.

Dell Technologies is also throwing more weight behind HPC, having backed the Dawn supercomputer at Cambridge University, and is working on expanding its platform further in the coming years.

At the other end of the scale, Nvidia is best known for hardware, manufacturing the next-generation chipsets it hopes will provide the bedrock of the AI revolution. Together with its in-house software architectures, its new GPUs are going to make crunching the data across HPC deployments even faster.

Microsoft is better known for software, but the company's Azure cloud architecture brings the hardware and computation together to offer scalable HPC products. In recent years, it’s created supercomputers for training AI models, like the one it built for OpenAI to train GPT-4.

The new frontier

While HPC is becoming a more entrenched paradigm, data and hardware scientists are already building its next evolution, exascale computing.

Referring to any system that can perform at least one exaflop – a billion billion (10^18) floating point operations per second – exascale computing will let us model and analyze systems that even current-generation HPC struggles with.
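For a sense of scale, here's a back-of-the-envelope comparison, assuming (purely for illustration) a desktop CPU that sustains around 100 billion floating point operations per second:

```python
# Back-of-the-envelope comparison: how long would a fast desktop take to match
# one second of exascale work? The desktop figure is an illustrative assumption.
EXAFLOP = 10**18                # one exaflop: a billion billion operations per second
desktop_flops = 100 * 10**9     # assume ~100 gigaflops sustained on a desktop CPU

seconds = EXAFLOP / desktop_flops
print(f"{seconds:,.0f} seconds ≈ {seconds / 86_400 / 365:,.1f} years")
# roughly 10,000,000 seconds, or about a third of a year of non-stop desktop
# computation, to equal a single second on an exascale machine
```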

Exascale computing is also going to provide the stepping stone to the next generation of HPC, with Antonio Córcoles, Principal Research Scientist at IBM Quantum, explaining how it will lead to the world of quantum computing.

"The next generation of high-performance computing will bring exascale computing using CPUs and GPUs and the integration of quantum processing units [QPU] together to solve different parts of complex problems that are best suited for their respective strengths and capabilities. It'll open up new, large, and powerful computational capabilities and our vision is that QPUs naturally become another element of and critical part of any HPC system, including exascale computing.

Drew Turney
Freelance journalist

Drew Turney is a freelance journalist who has been working in the industry for more than 25 years. He has written on a range of topics including technology, film, science, and publishing.

At ITPro, Drew has written on the topics of smart manufacturing, cyber security certifications, computing degrees, data analytics, and mixed reality technologies. 

Since 1995, Drew has written for publications including MacWorld, PCMag, io9, Variety, Empire, GQ, and the Daily Telegraph. In all, he has contributed to more than 150 titles. He is an experienced interviewer, features writer, and media reviewer with a strong background in scientific knowledge.