Google Cloud doubles down on AI Hypercomputer amid sweeping compute upgrades
Chips to tackle next-generation AI and an open approach are part of Google Cloud’s comprehensive AI offering
Google Cloud has announced a widespread expansion across its cloud infrastructure, promising a varied and powerful approach to AI training, inference, and data processing.
At Google Cloud Next 2024, its annual conference held at Mandalay Bay in Las Vegas, Google Cloud has laid out a range of new advancements for its ‘AI Hypercomputer’ architecture, aimed at helping customers unlock the full potential of AI models.
With a need to serve the ever-growing demand of its customers, the AI Hypercomputer brings together Google Cloud’s TPUs and GPUs as well as its AI software to offer a wide portfolio of generative AI training options.
One of the central pillars of the AI Hypercomputer architecture is Google Cloud’s tensor processing units (TPUs) – circuits specifically tailored for neural networks and AI acceleration – and the company has announced that the latest iteration, TPU v5p, is now generally available.
First announced in December 2023, the TPU v5p can, Google Cloud claims, train large language models three times as fast as the preceding generation. Each TPU v5p pod contains 8,960 chips and delivers a 300% improvement in memory bandwidth per chip.
Google Cloud’s A3 virtual machines (VMs), announced in May 2023, will be joined by a new ‘A3 mega’ VM. Featuring an array of Nvidia’s H100 GPUs in each VM, A3 mega will offer twice the GPU-to-GPU networking bandwidth, making it a strong fit for running and training the largest AI workloads on the market.
A new service called Hyperdisk ML will help enterprises leverage block storage to improve data access for AI and machine learning (ML) purposes. It follows the 2023 announcement of Google Cloud Hyperdisk, a block storage service that helps businesses connect durable storage devices to individual VM instances.
Hyperdisk ML was made with AI in mind, with the ability to cache data across thousands of servers for inference if necessary. In a briefing, Google Cloud stated Hyperdisk ML can load models up to 12 times faster than alternative solutions.
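As a rough illustration of the workflow Hyperdisk ML targets, a model-weights volume could be created once and then attached read-only to many serving VMs. This is a hypothetical sketch, not a command sequence from Google Cloud: the disk and VM names, size, and zone are placeholders, and the exact flag values should be checked against the current gcloud documentation.

```shell
# Hypothetical sketch: create a Hyperdisk ML volume to hold model weights.
# Resource names, size, and zone are placeholders.
gcloud compute disks create llm-weights \
    --type=hyperdisk-ml \
    --size=500GB \
    --zone=us-central1-a \
    --access-mode=READ_ONLY_MANY

# Attach the same volume read-only to an inference VM; repeating this for
# each serving VM lets them all read the cached model data from one disk.
gcloud compute instances attach-disk inference-vm-1 \
    --disk=llm-weights \
    --mode=ro \
    --zone=us-central1-a
```

The read-only, many-attach pattern is what allows a single stored copy of a model to feed a large serving fleet, which is where the claimed load-time advantage would matter most.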
In its sprint for greater power to support workloads, Google Cloud will also soon play host to Nvidia’s Blackwell family of chips. This pushes Google Cloud VMs to the frontier of performance, as the Nvidia HGX B200 is capable of more than tenfold improvements over Nvidia’s Hopper chips for AI use cases.
The GB200 NVL72 brings even more power to the table, combining 36 Grace CPUs and 72 Blackwell GPUs for power that could handle the trillion-parameter LLMs of the near future. With liquid cooling and its high-density, low-latency design, the data center solution can deliver 25 times the performance of an Nvidia H100 while using the same power and less water.
This puts it more in line with green data centers, a key selling point for Google as it balances AI energy demand with its own sustainability goals.
AWS also recently announced it is bringing Blackwell to its platform, pitting the two hyperscalers against each other in terms of performance potential. The key differentiator here is that the partnership between Google Cloud and Nvidia runs parallel to the former’s wider investment in competitive hardware for business use cases.
“Our strategy here is really centered around designing and optimizing at a system level, across compute, storage, networking, hardware, and software to meet the unique needs of each and every workload,” said Mark Lohmeyer, VP/GM Compute & ML Infrastructure at Google Cloud.
Leaning into open, general purpose compute
Against the backdrop of specific AI needs, Google Cloud has also taken the decision to ramp up its line of general-purpose cloud compute offerings. With this in mind, the firm has announced a general-purpose CPU for data centers named Google Axion.
Axion is Google Cloud’s first Arm-based CPU and is already being used to power Google services such as Bigtable, BigQuery, and Google Earth Engine. The firm said Axion is capable of performance improvements of as much as 50% compared to current-generation x86 instances and a 30% improvement over comparable Arm-based instances.
The chip is also expected to deliver up to 60% better energy efficiency than comparable x86 instances, which businesses could exploit to lower running costs and hit environmental targets more easily.
Built with an open foundation, the aim is for Axion to be as interoperable as possible so that customers don’t need to rewrite any code to get the best out of the CPU.
Google Cloud has announced two new VMs, C4 and N4, which will be the first in the industry to boast Intel’s 5th Gen Xeon processors. Google Cloud stated that N4 excels at workloads that don’t need to be run flat-out for long periods of time, giving the examples of microservices, virtual desktops, and data analytics.
In contrast, C4 is intended to shore up mission-critical workloads, boasting a 25% performance improvement over the C3 VM and ‘hitless’ updates to ensure critical services can run uninterrupted.
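In practice, the split between the two families comes down to which machine type a workload is launched on. The sketch below is hypothetical: the instance names and zone are placeholders, and the specific machine-type sizes shown are assumptions that should be confirmed against Google Cloud’s current machine-type listings.

```shell
# Hypothetical sketch: launching the two new general-purpose VM families.
# Instance names and zone are placeholders.

# C4 for sustained, mission-critical workloads such as databases
gcloud compute instances create prod-db-vm \
    --machine-type=c4-standard-8 \
    --zone=us-central1-a

# N4 for bursty workloads such as microservices or virtual desktops
gcloud compute instances create dev-svc-vm \
    --machine-type=n4-standard-4 \
    --zone=us-central1-a
```

Beyond the machine-type name, the two commands are identical, which reflects Google Cloud’s positioning of C4 and N4 as a performance/flexibility choice within the same general-purpose tier rather than as different products.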
Across its new infrastructure offerings, Google Cloud has emphasized the importance of meeting customer demands for performance without compromising on price or energy efficiency. In casting a wide net, the hyperscaler has shown concrete improvements across all compute demands rather than simply doubling down on the most high-end workloads.
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.