Nvidia Targets Data Centre Bottlenecks with PC-based AI GPUs

The requirement for substantial computing power in AI development has created processing bottlenecks in data centres, prompting technology leader Nvidia to release new tools for local computing environments.
Shifting towards local processing represents a change in direction for the data centre sector, where centralised infrastructure has dominated AI development. This centralisation has raised concerns about data privacy and processing costs among enterprise clients.
Historically, personal computers have not provided sufficient processing capability to run complex AI models locally. This limitation has affected development teams and programmers working on AI projects, who have had to rely on cloud services for testing and deployment.
Nvidia NIM microservices target enterprise computing requirements
To support developers working with its new GeForce RTX 50 Series GPUs, Nvidia has released NIM microservices, a suite of pre-packaged AI models optimised for PCs. These tools allow developers to integrate AI capabilities into applications without managing the complexity of model optimisation.
- GeForce RTX 5090 and 5080 GPUs deliver up to 3,352 trillion AI operations per second (TOPS) for local processing
- Nvidia says the FLUX.1 model, which previously required 23GB of video memory, can now run in 10GB, enabling wider GPU compatibility
The company says it has established an integration partnership with Microsoft to enable NIM microservices in Windows Subsystem for Linux. This is designed to provide compatibility between desktop and data centre environments.
Deploying AI models successfully requires technical expertise. Models from software repositories such as Hugging Face require modification and optimisation before deployment on personal computers.
These modifications include quantisation, a process that stores model weights at lower numerical precision to reduce model size, and integration with existing software tools.
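As a rough illustration of how quantisation shrinks a model, the sketch below maps a handful of floating-point weights to 4-bit signed integers using a single shared scale. It is a minimal teaching example, not Nvidia's method: it uses simple symmetric integer quantisation rather than a floating-point format such as FP4, and the function names and sample weights are hypothetical.

```python
def quantise_symmetric(weights, bits=4):
    """Map floats to signed integers of the given bit width using one shared scale.

    Illustrative only: real toolchains use more elaborate schemes.
    """
    if not weights:
        return [], 1.0
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # one float kept per group
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantise(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


def storage_bytes(n_values, bits):
    """Bytes needed to pack n_values codes of the given width (rounded up)."""
    return (n_values * bits + 7) // 8


# Hypothetical sample weights, chosen only to show the round trip.
weights = [0.7, -0.3, 0.12, 0.0, -0.7]
codes, scale = quantise_symmetric(weights, bits=4)
restored = dequantise(codes, scale)

print("codes:", codes)
print("restored:", [round(r, 3) for r in restored])
print("bytes at 16-bit:", storage_bytes(len(weights), 16))
print("bytes at 4-bit:", storage_bytes(len(weights), 4))
```

The restored values are close to, but not identical with, the originals. That loss of precision is the trade-off accepted in exchange for the smaller memory footprint.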
NIM microservices provide pre-optimised models for deployment, so developers can add AI capabilities to applications without carrying out these optimisation steps themselves.
Black Forest Labs demonstrates Nvidia processing improvements
Software developer Black Forest Labs has documented performance improvements with its FLUX.1 [dev] model. On previous-generation hardware, the model required 23 gigabytes of video memory and 15 seconds to generate an image.
Nvidia says the RTX 5090 GPU reduces generation time to five seconds while requiring 10 gigabytes of memory. The improvement derives from FP4 compression, which stores model values in a 4-bit floating-point format and is integrated in NIM microservices.
The compression technology enables AI models to operate on a broader range of graphics cards, increasing access to local processing capabilities. This development reduces dependency on cloud computing services for artificial intelligence development and deployment.
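To put the quoted FLUX.1 figures in proportion, the short calculation below works out the relative memory saving and speed-up. It uses only the numbers reported above (23GB to 10GB, 15 seconds to five seconds); the helper names are illustrative.

```python
def reduction_percent(before, after):
    """Percentage drop from 'before' to 'after'."""
    return (before - after) / before * 100


def speedup(before_seconds, after_seconds):
    """How many times faster the new run is."""
    return before_seconds / after_seconds


memory_saving = reduction_percent(23, 10)  # gigabytes of video memory
time_speedup = speedup(15, 5)              # seconds per image

print(f"Memory reduction: {memory_saving:.1f}%")     # Memory reduction: 56.5%
print(f"Generation speed-up: {time_speedup:.1f}x")   # Generation speed-up: 3.0x
```

On these figures, the model needs a little under half its previous memory and generates an image three times faster.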
Nvidia introduces reference implementations for AI development
The hardware manufacturer has also published AI Blueprints, a set of reference implementations that demonstrate potential applications. These include a document conversion system that converts PDF files into audio content using seven AI models.
Reference implementations demonstrate complex AI workflows operating on personal computers rather than cloud services. This approach aims to reduce development time by providing working examples of local processing systems.
The technology also builds on developments introduced in 2018, when Nvidia first incorporated dedicated AI processing cores in consumer graphics cards. The fifth generation of Tensor Cores processes multiple AI models simultaneously, supporting applications from real-time rendering to automated assistance systems.
The manufacturer will release NIM microservices with compatibility for GeForce RTX 50 Series, GeForce RTX 4090 and 4080, and RTX 6000 and 5000 graphics processors. The company plans to expand hardware support in subsequent releases.
“These GPUs were built to accelerate the latest Gen AI workloads, delivering up to 3,352 AI trillion operations per second (TOPS), enabling incredible experiences for AI enthusiasts, gamers, creators and developers,” says Jesse Clayton, Product Manager at Nvidia.
Technology Magazine is a BizClik brand

