NVIDIA GTC: Targeting Data Centre Scale in the AI Token Era

NVIDIA CEO, Jensen Huang | Credit: NVIDIA
In a keynote at NVIDIA GTC, Jensen Huang showcased Vera Rubin, Groq and DLSS 5 among other products, and said he expects US$1tn worth of orders by 2027

NVIDIA CEO Jensen Huang took to the stage at NVIDIA's annual tech conference, GTC, to set out a future shaped by data centres built for AI inference at scale.

NVIDIA, which built the first computer designed for deep learning, now powers leading endpoints and defines how infrastructure supports the future of the ‘token king’.

In his keynote, Jensen positioned every company as a future operator of a token factory, where effectiveness links directly to revenue.

NVIDIA has accrued US$500bn in orders for Blackwell, a figure expected to rise to US$1tn by 2027. Demand on that scale places data centres at the heart of the expansion.

Accelerated computing and data centre efficiency

Jensen highlighted how accelerated computing improves data centre performance, as updated software lowers computing costs while sustaining performance gains across infrastructure lifecycles.

He added that NVIDIA must “continue to nurture and continue to update software over its life”, ensuring customers move beyond a “first time pop” and instead see a “continuous cost reduction of accelerated computing over time”.


Jensen noted how a large install base allows optimisation to reach millions of systems at once.

“This combination of dynamics is what makes NVIDIA architecture expand its reach, accelerating its growth, at the same time driving down computing costs. Which ultimately encourages new growth.”

This means efficiency gains come not just from hardware refresh cycles but also from ongoing software updates across existing estates, which is a key consideration for data centre operators.

Contextualising the 'Big bang of AI'

Jensen connected today’s data centre scale to NVIDIA’s origins in graphics.

CUDA, a software platform that allows developers to use GPUs to accelerate workloads such as AI inference and training, is central to NVIDIA's accelerated computing empire.

“This is the house that GeForce made,” he said, adding that “GeForce brought CUDA to the world”, enabling developers to use GPUs beyond graphics.

This shift allowed researchers to see that “GPU could be their friend in accelerating deep learning”. Jensen stated, “It started the big bang of AI.”

NVIDIA's headquarters drives AI innovation, shaping the future of digital transformation with technologies like Blackwell and CUDA

He continued, “Ten years ago we thought that AI would revolutionise computer graphics.

“Just as GeForce brought AI to the world, AI is now going to go back and revolutionise how computer graphics is done altogether.”

This revolution feeds directly into data centre workloads, where graphics and AI converge. NVIDIA's DLSS 5 combines 3D graphics with AI, an approach called “neural rendering”, which creates new processing demands.

"Crown jewels"

Workloads increasingly sit inside data centres rather than on standalone systems, requiring high throughput infrastructure capable of handling real-time processing.

NVIDIA supports this with its CUDA-X libraries, which Huang called “crown jewels”.

With more than 70 new libraries introduced at GTC, these tools allow developers to deploy accelerated computing across industries. Jensen highlighted cuDNN, a deep neural network library, as one that “completely revolutionised artificial intelligence”.


AI inference and next generation data centres

Jensen also described a rapid change in AI capability and its impact on infrastructure. “Something happened in the last two years. Particularly in the last year. We have been working with the AI-natives for a long time and last year it just skyrocketed.”

With more than US$150bn invested in startups, he called the level of investment the “largest in human history”.

He added: “the incredible value they are delivering already is quite tangible,” made possible “because we reinvented computing.”

Inside NVIDIA, this transformation is visible in software development. “100% of NVIDIA is using a combination of Claude Code, Codex and Cursor.

“There is not one software engineer today who is not assisted by one or many AI agents helping them code.” The agentic AI that Jensen discussed increases demand for inference capacity in data centres. Inference is the process by which trained models produce outputs from new data.

NVIDIA Vera CPU designed for the agentic age | Credit: NVIDIA

“An AI that can perceive became an AI that can generate.

“An AI that can generate became an AI that can reason.

“An AI that can reason now became an AI that can actually do work.”

Vera Rubin for next-gen data centres

As AI systems move into active roles, inference expands sharply. To support this, NVIDIA has introduced platforms such as Vera Rubin, which are designed for next-generation data centre infrastructure.

Vera Rubin is built on sixth-generation NVLink, a high-speed interconnect that links GPUs, and uses liquid-cooled systems to support efficiency at scale.

Its integration with Groq LPUs, which are processors designed for low-latency AI inference, reduces response times and supports high-value token generation.

NVIDIA Rubin delivers one-tenth the cost per million tokens compared to NVIDIA Blackwell for highly interactive, deep reasoning agentic AI | Credit: NVIDIA


These systems operate at rack scale, meaning hundreds of chips function together within a single data centre rack.

Now in production, the Vera CPU supports high-throughput AI processing and is expected to drive a multi-billion dollar business.

For data centre operators, this means infrastructure needs to keep up with constant large-scale inference demand while still keeping costs under control over time.
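To make the cost-per-token framing concrete, here is a minimal sketch of how cost per million tokens could be estimated from a system's hourly running cost and its sustained token throughput. The function name and every figure are hypothetical, chosen only to illustrate the arithmetic; they are not NVIDIA's numbers or methodology.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Return the USD cost of generating one million tokens.

    Assumes a fixed hourly running cost and a sustained throughput,
    which is a simplification of real data centre economics.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical figures: same hourly cost, ten times the throughput.
baseline = cost_per_million_tokens(hourly_cost_usd=100.0, tokens_per_second=10_000)
improved = cost_per_million_tokens(hourly_cost_usd=100.0, tokens_per_second=100_000)

print(f"baseline: ${baseline:.2f} per million tokens")
print(f"improved: ${improved:.2f} per million tokens")
print(f"ratio: {improved / baseline:.2f}")
```

Under these assumptions, a tenfold throughput gain at the same hourly cost cuts the cost per million tokens to one-tenth, which is why throughput and cost tend to be discussed together.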
