Google Cloud Unveils Ironwood TPU With 42.5 Exaflops Per Pod

Google Cloud has revealed its seventh-generation tensor processing unit (TPU) at its annual Next conference in Las Vegas, marking a significant escalation in the AI compute infrastructure race.
The new Ironwood TPU delivers 42.5 exaflops of compute per pod, with each pod containing more than 9,000 chips. Google describes this as a “more than 10x improvement” over its previous TPU generation. The announcement comes as hyperscalers continue to invest heavily in specialised AI hardware to support growing demand for large language model training and inference.
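As a rough sanity check on those headline figures, the per-chip share of a pod's compute can be worked out directly. The sketch below uses only the two numbers quoted above (42.5 exaflops per pod and more than 9,000 chips). It assumes compute is spread evenly across chips, and the function and constant names are illustrative, not Google's. Because 9,000 is a lower bound on the chip count, the result is an upper bound on the per-chip average.

```python
# Figures quoted in the article.
POD_EXAFLOPS = 42.5          # compute per pod
MIN_CHIPS_PER_POD = 9_000    # "more than 9,000", so this is a lower bound


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Average compute per chip in petaflops (1 exaflop = 1,000 petaflops).

    Assumes the pod's compute is divided evenly across its chips.
    """
    if chips <= 0:
        raise ValueError("chips must be positive")
    return pod_exaflops * 1_000 / chips


# Dividing by the minimum chip count gives the largest possible average.
upper_bound = per_chip_petaflops(POD_EXAFLOPS, MIN_CHIPS_PER_POD)
print(f"At most about {upper_bound:.2f} petaflops per chip")
# prints "At most about 4.72 petaflops per chip"
```

The article does not state the numeric precision behind the 42.5 exaflops figure, so the result is only meaningful for comparison with other figures quoted at the same precision.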
“These regions are connected by more than two million miles of terrestrial and subsea cables, and have more than 200 points of presence across 200+ countries and territories, creating a truly global and resilient foundation for the AI-powered future,” said Thomas Kurian, CEO of Google Cloud, in a blog post ahead of the conference.
The Ironwood TPU announcement is complemented by several other infrastructure innovations that will impact data centre architecture and operations.
Google AI Hypercomputer introduces cluster management tools
Alongside the Ironwood hardware, Google introduced Cluster Director, a new capability that enables organisations to deploy and manage large numbers of accelerators as a unified compute unit. This tool addresses the growing complexity of managing distributed AI systems that span multiple racks and pods.
The TPU announcement is part of Google’s broader AI Hypercomputer strategy, which includes a comprehensive suite of hardware and software optimised for AI workloads. Google claims its system delivers “more intelligence – or useful AI output – at a consistently low price,” with Gemini 2.0 Flash reportedly achieving “24x higher intelligence per dollar compared to GPT-4o and 5x higher than DeepSeek-R1.”
Google has also expanded its GPU portfolio through its partnership with Nvidia.
“We’ve significantly enhanced our GPU portfolio with the availability of A4 and A4X VMs powered by Nvidia’s groundbreaking B200 and GB200 Blackwell GPUs. We were the first cloud provider to offer both of these cutting-edge options,” Thomas notes.
Google also announced it will offer Nvidia’s next-generation Vera Rubin GPUs, which provide “up to 15 exaflops of FP4 inference performance per rack.”
Storage architecture innovations address AI performance bottlenecks
Google has also introduced several storage innovations designed to address data throughput challenges in AI workloads, which have become a significant concern for data centre designers as compute capacity increases.
Hyperdisk Exapools offers “the highest aggregate performance and capacity block storage of any hyperscaler, with up to exabytes of capacity and terabytes per second of performance per AI cluster,” according to Google.
The company also unveiled Anywhere Cache, which “intelligently keeps data close to accelerators, reducing storage latency by up to 70% and significantly accelerating training times.” This technology addresses one of the key bottlenecks in AI system design – moving data efficiently between storage and compute resources.
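How much a 70% cut in storage latency shortens training depends on how much of each training step is spent waiting on storage, a figure Google's statement does not give. The sketch below is a simple Amdahl-style estimate under two stated assumptions: that stall time scales linearly with storage latency, and that the rest of the step is unaffected. The 30% stall fraction in the example is hypothetical and chosen purely for illustration, as are the function and parameter names.

```python
def training_speedup(stall_fraction: float, latency_reduction: float) -> float:
    """Estimate overall speedup when only storage-stall time shrinks.

    stall_fraction:    share of step time spent waiting on storage (0 to 1)
    latency_reduction: fractional cut in storage latency (0.7 means 70%)

    Assumes stall time scales linearly with storage latency and that
    compute time is unchanged (an Amdahl's-law style simplification).
    """
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    if not 0.0 <= latency_reduction < 1.0:
        raise ValueError("latency_reduction must be in [0, 1)")
    # New step time = untouched compute share + shrunken stall share.
    new_time = (1.0 - stall_fraction) + stall_fraction * (1.0 - latency_reduction)
    return 1.0 / new_time


# Hypothetical workload: 30% of each step stalls on storage.
speedup = training_speedup(stall_fraction=0.30, latency_reduction=0.70)
print(f"Estimated speedup: {speedup:.2f}x")
# prints "Estimated speedup: 1.27x"
```

Under these assumptions, a workload that rarely waits on storage gains little from the cache, while a heavily storage-bound one gains far more, which is why “up to” claims of this kind are best read against a specific workload profile.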
A third storage innovation, Rapid Storage, was described as Google’s “first zonal object storage solution, offering an impressive 5x lower latency for random reads and writes compared to the fastest comparable cloud alternative.”
Cloud Wide Area Network opens Google's global infrastructure
Google has announced Cloud Wide Area Network (Cloud WAN), making its global private network available to enterprise customers.
“It makes Google's global private network available to all Google Cloud customers. Cloud WAN is a fully managed, reliable, and secure enterprise backbone to transform enterprise wide area network architectures,” said Thomas. The service reportedly delivers “up to 40% improvement in network performance, while simultaneously reducing total cost of ownership by up to 40%.”
This offering leverages Google's extensive network infrastructure, which spans “more than two million miles of terrestrial and subsea cables” with “more than 200 points of presence across 200+ countries and territories.”
For organisations requiring on-premises AI capabilities due to regulatory, sovereignty, or data volume considerations, Google also announced that its Distributed Cloud offering will now bring Google’s models to on-premises environments.
To serve these customers, Google has partnered with Nvidia to run Gemini on Nvidia Blackwell systems, with Dell as a hardware partner. This enables on-premises deployment in regulated environments, potentially addressing adoption barriers in the healthcare, finance and government sectors.
“Nvidia and Google Distributed Cloud provide a secure AI platform, bringing Gemini models to enterprise data centres and regulated industries,” explains Justin Boitano, VP, Enterprise AI Software, Nvidia. “With Nvidia Blackwell infrastructure and confidential computing, Google Distributed Cloud enhances privacy and security, and delivers industry-leading performance on DGX B200 and HGX B200 systems, available from Dell.”
“We have partnered with Nvidia to bring Gemini to Nvidia Blackwell systems, with Dell as a key partner, so it can be used locally in air-gapped and connected environments,” said Thomas.
The announcement noted that Google Distributed Cloud is now authorised for U.S. Government Secret and Top Secret levels, potentially opening new deployment options for high-security data centres.
Data Centre Magazine is a BizClik brand

