NVIDIA DGX SuperPOD: The Next-Generation AI Supercomputer

At its GTC event, NVIDIA unveiled its new AI supercomputer, powered by its new Blackwell superchips, to accelerate and redefine enterprise AI efforts.

As key industries worldwide utilise AI for analysing large amounts of data, power efficiency remains crucial to maintaining compute-intensive workloads.

With this in mind, Nvidia has revealed its next-generation AI supercomputer. The NVIDIA DGX SuperPOD features a new, highly efficient liquid-cooled rack-scale architecture. It is built on the company’s superchips, which include its new Blackwell GPUs and are expected to deliver much faster performance for large language model (LLM) workloads.

Having announced multiple new partnerships at GTC this week, Nvidia hopes that its innovations will help businesses and cloud providers with critical computing abilities to succeed in a rapidly changing technology landscape governed by AI.

Helping data centres reduce downtime and costs

The supercomputer is also designed to process trillion-parameter models with constant uptime, allowing generative AI (Gen AI) training to scale. It is essentially a data-centre-scale AI supercomputer that integrates with high-performance storage to meet the demands of Gen AI workloads.

It features intelligent predictive management capabilities that continuously monitor thousands of data points across both hardware and software. It can identify areas of concern, plan maintenance, adjust compute resources accordingly and schedule hardware replacements to avoid downtime.

As a result, the supercomputer can help to remove inefficiencies, saving time, energy and computing costs. It is built, cabled and tested to dramatically speed up deployment at customers’ data centres.

It is clear why Nvidia continues to dominate the AI chip market, with its chips being used across a wide range of business applications to advance Gen AI capabilities. In fact, the company’s data centre business has generated the majority of its total revenue in recent months. Total revenue surpassed expectations at US$22.10bn, of which data centre revenue accounted for US$18.4bn, an increase of 409%.

Blackwell: Advancing AI workloads 

With Blackwell expected to achieve higher performance levels, the company is looking to further dominate the industry. The Blackwell GPUs contain 208 billion transistors and can enable AI models that scale up to 10 trillion parameters. 

“Blackwell GPUs are the engine to power this new industrial revolution. Working with the most dynamic companies in the world, we will realise the promise of AI for every industry,” Nvidia CEO Jensen Huang said during his keynote address at GTC on Monday (18th March 2024).

Blackwell will be incorporated into Nvidia’s liquid-cooled GB200 Grace Blackwell Superchip, which connects two of the new B200 GPUs to a Grace CPU. The new chips are expected to be made available later in 2024, with the company stating that leading technology companies such as AWS, Dell Technologies, Google, Meta, Microsoft, OpenAI and Tesla are planning to use the GPUs to power their own services.

Nvidia is also planning to partner with the likes of Schneider Electric and Lenovo to efficiently develop and deploy new AI use cases that seek to drive innovation and growth.


Data Centre Magazine is a BizClik brand
