Google Cloud announces major computing boost with Ironwood chip, new hypercomputer upgrades
The hyperscaler’s latest TPU has been built with the agentic AI era in mind


Google Cloud has announced a new chip built for AI inference, which it says is more powerful and better suited to AI workloads than its predecessors while also being far more energy efficient.
Ironwood, Google Cloud’s 7th generation tensor processing unit (TPU), offers five times more computing power than the firm’s previous-generation Trillium chips.
Each Ironwood pod is capable of 42.5 exaflops of performance, which Google Cloud noted is 24 times the performance of El Capitan, the world’s fastest supercomputer.
Ironwood will be available in two configurations: a 256-chip model and a 9,216-chip model. Each chip is liquid-cooled, equipped with 192GB of memory, and capable of 7.2Tbits/sec of bandwidth.
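To put those figures in per-chip terms, the short sketch below works through the arithmetic. It assumes the 42.5 exaflops figure refers to the larger 9,216-chip configuration, which the article does not state explicitly, and it uses decimal units throughout. The function names are illustrative only.

```python
# Figures quoted in the article.
POD_EXAFLOPS = 42.5        # per-pod performance
CHIPS_PER_POD = 9_216      # ASSUMPTION: the pod figure is for the larger configuration
MEMORY_PER_CHIP_GB = 192   # memory per chip
EL_CAPITAN_RATIO = 24      # "24 times" El Capitan, per Google Cloud


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Split the pod figure evenly across chips (1 exaflop = 1,000 petaflops)."""
    return pod_exaflops * 1_000 / chips


def pod_memory_petabytes(chips: int, gb_per_chip: int) -> float:
    """Total memory across the pod in decimal petabytes (1 PB = 1,000,000 GB)."""
    return chips * gb_per_chip / 1_000_000


def implied_baseline_exaflops(pod_exaflops: float, ratio: float) -> float:
    """Performance of the comparison system implied by the quoted ratio."""
    return pod_exaflops / ratio


if __name__ == "__main__":
    print(f"Per chip: {per_chip_petaflops(POD_EXAFLOPS, CHIPS_PER_POD):.2f} petaflops")
    print(f"Pod memory: {pod_memory_petabytes(CHIPS_PER_POD, MEMORY_PER_CHIP_GB):.2f} PB")
    print(f"Implied El Capitan figure: "
          f"{implied_baseline_exaflops(POD_EXAFLOPS, EL_CAPITAN_RATIO):.2f} exaflops")
```

Under that assumption, each chip works out to roughly 4.6 petaflops and a full pod to about 1.77PB of memory.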
The announcement comes as Google Cloud works to meet growing demand for AI inference across its cloud ecosystem. It cited the demanding workloads of frontier AI deployment - including mixture of experts and advanced reasoning models - as driving enterprise demand for intense data computation with little to no latency.
Google Cloud also said the rise of AI agents, which work autonomously and proactively within an organization’s cloud environment, will greatly drive up inference demand.
For example, a security agent running 24/7 will take in and send out data continuously as it scans for malicious activity, putting sustained strain on cloud infrastructure.
The announcement was made on the opening day of Google Cloud Next 2025, the firm’s annual conference at which AI is once again taking center stage.
As generative AI demand has risen and models have become more sophisticated, the energy draw on data centers has become an area of concern.
In a briefing with assembled media, Amin Vahdat, VP & GM of ML, Systems, and Cloud AI at Google Cloud, stated that the past eight years have seen demand to train and serve AI models increase by a factor of 100 million.
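For a sense of scale, a 100-million-fold increase over eight years can be converted into an average yearly multiplier. The sketch below assumes steady compound growth, which is a simplification and not something Vahdat claimed. The function name is illustrative.

```python
def annual_growth_factor(total_factor: float, years: int) -> float:
    """Average yearly multiplier, ASSUMING constant compound growth."""
    return total_factor ** (1 / years)


if __name__ == "__main__":
    # 100 million is 10**8, so over eight years the eighth root is about 10.
    factor = annual_growth_factor(100_000_000, 8)
    print(f"Average growth: about {factor:.1f}x per year")
```

In other words, the quoted figure is equivalent to demand growing roughly tenfold every year across the period.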
With this in mind, Google Cloud developed Ironwood to be its most energy-efficient TPU ever, offering 29.3 teraflops per watt, a nearly 30x improvement over Google’s first Cloud TPU from 2018.
Google Cloud uses its TPUs to train and serve its own Gemini models, and credits this for its competitive cost-to-performance ratio. It noted that Gemini 2.0 Flash can provide 24 times more inference capability per workload than GPT-4o and five times more than DeepSeek-R1, a model known for its extremely low cost.
Hypercomputer expansion
Beyond the hardware announcements, Google Cloud has also unveiled sweeping advancements for 'AI hypercomputer', its cloud architecture for compute, networking, storage, and software.
Chief among these is Pathways, the firm’s internal machine learning (ML) runtime developed by Google DeepMind, which will now be made available to Google Cloud customers.
Pathways connects multiple Ironwood pods so that they work in tandem, letting enterprises run many tens or even hundreds of thousands of Ironwood chips as one massive compute cluster.
Rather than simply connecting the pods together, Pathways can dynamically scale the different stages of AI inference across different chips to maintain low-latency responses and high data throughput.
Google Cloud stated this has transformative potential for the future operations of frontier generative AI models in the cloud.
The firm also announced new capabilities for Google Kubernetes Engine (GKE) aimed at supplying customers with the right resources for AI inference. GKE Inference Gateway provides model-aware load balancing and scaling, which Google Cloud said can reduce serving costs by up to 30% and tail latency by 60%.
To assist customers further, the firm is also releasing GKE Inference Recommendations in preview, which can configure Kubernetes and infrastructure based on a customer’s AI model choice and performance targets.
In addition to its own TPUs, Google Cloud customers can access Nvidia hardware via the cloud giant’s A4 and A4X virtual machine (VM) offerings.
Google Cloud announced general availability of Nvidia's highly capable B200 via its A4 VMs at Nvidia GTC 2025 last month, and customers will now also be able to access Nvidia's GB200 chips via the A4X in preview.
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.