AWS Unveils Next-Gen Data Centres for AI Computing Demands
Data centre operators worldwide are re-engineering their facilities to handle the power and cooling demands of AI workloads. Graphics processing units (GPUs) used for AI consume up to four times more power than traditional processors, creating challenges for data centre power and cooling infrastructure designed for standard computing loads.
Now, Amazon Web Services (AWS), the cloud computing division of e-commerce firm Amazon, has revealed specifications for a new generation of data centres designed to handle increased power demands from these AI workloads.
The announcement, made at the company’s re:Invent conference in Las Vegas, outlines changes to power distribution, cooling systems and hardware configurations that AWS states will provide a 12% increase in compute power per site. The changes come as cloud providers race to secure GPUs and modify infrastructure to support growing customer demand for generative AI applications.
The news follows AWS's first public disclosure of its power usage effectiveness (PUE) metrics – reporting that its global data centre fleet achieved a PUE of 1.15 in 2023, with its most efficient site achieving a PUE of 1.04.
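For context, PUE is simply the ratio of a facility's total energy draw to the energy actually delivered to IT equipment, so a PUE of 1.0 would mean zero cooling and distribution overhead. A minimal sketch of the calculation, using hypothetical energy figures chosen to illustrate the 1.15 result:

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    return total_facility_kwh / it_equipment_kwh

# Hypothetical figures: a site drawing 115 MWh overall while its IT
# equipment consumes 100 MWh has a PUE of 1.15 -- i.e. 15 kWh of
# cooling/distribution overhead for every 100 kWh of computing.
print(round(pue(115_000, 100_000), 2))  # 1.15
```

By this measure, AWS's reported best site at 1.04 carries only 4% overhead beyond the power its servers consume.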
AWS advances liquid cooling for AI processors
A centrepiece of the redesign is a hybrid cooling system that combines traditional air cooling with direct-to-chip liquid cooling, a technique where coolant is brought into direct contact with processors to remove heat more effectively. The system was developed with semiconductor manufacturers to support high-density AI accelerators, including AWS's own Trainium2 chips and Nvidia's GB200 NVL72 systems.
The cooling design accommodates both liquid-cooled AI processors and air-cooled network and storage infrastructure. This flexibility allows AWS to maintain its range of more than 750 Amazon Elastic Compute Cloud (EC2) instance types, which provide different combinations of processor, storage and networking options.
“These data centre capabilities represent an important step forward with increased energy efficiency and flexible support for emerging workloads,” says Prasad Kalyanaraman, VP of Infrastructure Services at AWS. “They are designed to be modular, so that we are able to retrofit our existing infrastructure for liquid cooling and energy efficiency to power generative AI applications and lower our carbon footprint.”
Power distribution and control systems modernised
The infrastructure changes include a redesigned power distribution system that reduces potential failure points by 20%. AWS has developed new power delivery equipment to enable a sixfold increase in rack power density over two years, with capacity for further expansion.
The company has simplified its electrical systems to achieve what it claims is 99.9999% infrastructure availability. This modification reduces the number of racks that can be affected by electrical issues by 89%.
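That "six nines" figure translates into a very small annual downtime budget. A quick back-of-the-envelope check (not an AWS calculation, just standard availability arithmetic):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (ignoring leap years)

def downtime_seconds_per_year(availability_pct: float) -> float:
    """Annual downtime allowance implied by an availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

# 99.9999% availability allows roughly 31.5 seconds of downtime per year,
# versus about 5.3 minutes for the more common "five nines" (99.999%).
print(round(downtime_seconds_per_year(99.9999), 1))  # 31.5
```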
AWS has also implemented new control systems across its electrical and mechanical equipment to standardise monitoring and operational sequences. The company uses internally developed telemetry tools for real-time diagnostics and troubleshooting.
Industry partners validate AWS data centre strategy
Nvidia has collaborated with AWS on the cooling system design. “Data centres must evolve to meet AI’s transformative demands,” says Ian Buck, Vice President of Hyperscale and HPC at Nvidia. “By enabling advanced liquid cooling solutions, AI infrastructure can be efficiently cooled while minimising energy use.”
AI research company Anthropic cites the infrastructure improvements as a factor in selecting AWS as its primary cloud provider. “Having access to secure, performant, and energy-efficient infrastructure is crucial to our success,” says James Bradbury, Distinguished Engineer of Compute at Anthropic.
Environmental improvements target carbon reduction
The redesign incorporates environmental modifications including a cooling system that AWS reports will reduce mechanical energy consumption by 46% during peak cooling conditions. The company has specified concrete with 35% lower embodied carbon compared to industry standards.
- AWS data centres achieved a global PUE of 1.15 in 2023, with best performing site at 1.04
- New cooling system design reduces mechanical energy consumption by 46% during peak conditions
- Power distribution redesign cuts potential failure points by 20%
AWS will deploy renewable diesel in backup generators, a fuel it states can reduce greenhouse gas emissions by 90% compared to conventional diesel over its lifecycle. The company has begun using renewable diesel in existing European and American facilities.
The company states its infrastructure is 4.1 times more energy efficient than on-premises data centres. AWS reached its goal of matching all electricity consumption with renewable energy in 2023, seven years ahead of schedule.
Construction of new data centres incorporating these changes will begin in early 2025 in the United States. The infrastructure will be deployed across AWS’s network of 34 regions and 108 availability zones.
“AWS's continuous infrastructure advancements allow us to concentrate on innovating new services that help our customers make more informed financial decisions rather than the undifferentiated heavy lifting of running data centres,” says Alex Lintner, CEO of Technology, Software Solutions and Innovation at Experian, the consumer credit reporting company.
Data Centre Magazine is a BizClik brand