The rapid growth of AI – along with other modern technologies – is feeding an insatiable demand for data centres worldwide. According to research by Statista, the total amount of data created, captured, copied and consumed globally in 2020 was 64.2 zettabytes. By 2025, global data creation is projected to grow to more than 180 zettabytes. (For an idea of just how mind-bogglingly large that number is, one zettabyte is equal to a trillion gigabytes.)
The need of organisations to meet the increased power requirements of high-performance computing has spurred several innovations in the field of data centre design and technology.
One pioneering solution at the forefront of this transformation is Kove:SDM™, Kove's Software-Defined Memory solution that enables enterprises to maximise the performance of their people and infrastructure.
Kove:SDM™ is a breakthrough technology that allows individual servers to draw from a common memory pool, on a scale far beyond what a single physical server can hold. Crucially, this means each job receives exactly the memory it needs.
Kove’s Founder and CEO John Overton describes this process as ‘memory virtualisation’, and says it is playing a crucial role in advancing computing capabilities – a feat that for decades has proven too tough a nut for conventional approaches to crack.
“Memory is the last component to have been virtualised, made generic and commodified in the way that every other facet of computing has been,” Overton explains. He adds that the company spent five years on “hardcore R&D”, before engaging in a disciplined, decade-long development effort to address the memory wall through a software-only approach.
“We took a look around and saw that nobody had any idea how to do it,” he says. “We tried everything that everybody else had done – and failed at repeatedly – but then we cracked the code. Now, here we are today. We've been doing extremely high-end computing for a long time and have worked our tails off to get here.”
Kove was founded 20 years ago with a mission to “think differently about what could be done with computing”, says Overton, who adds that today, the company is home to “passionate software engineers and technologists committed to delivering the products and personalised services that enable every enterprise to reach its full potential”.
AI workloads driving the need for better data centre performance
Central to Kove’s offering is memory virtualisation, a technology that is badly needed by a power-hungry, tech-driven world. According to research by Boston Consulting Group, US data centres used around 130 terawatt hours (TWh) of power in 2022, a figure set to increase to a gargantuan 390 TWh in 2030 – the equivalent of almost 10% of the US’s total electricity consumption in 2023.
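To put the Boston Consulting Group figures in perspective, the short sketch below works out the growth they imply. The variable and function names are illustrative, and the assumption of smooth compound growth between 2022 and 2030 is ours rather than BCG's.

```python
# Figures quoted from the Boston Consulting Group research cited above.
START_YEAR, START_TWH = 2022, 130.0
END_YEAR, END_TWH = 2030, 390.0


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that turns `start` into `end` over `years`.

    Assumes steady year-on-year growth, which is a simplification.
    """
    return (end / start) ** (1 / years) - 1


factor = END_TWH / START_TWH
rate = implied_annual_growth(START_TWH, END_TWH, END_YEAR - START_YEAR)

print(f"Growth factor: {factor:.1f}x")
print(f"Implied annual growth: {rate:.1%}")
```

On these assumptions, US data centre power use triples in eight years, which works out to growth of a little under 15% a year.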
Driving this ever-increasing need for more juice is the mass-scale adoption of AI, and the role that data centres will play here cannot be overstated.
AI-capable chips draw considerably more power than their non-AI counterparts, and this is having a dramatic impact on the data centre landscape. According to Overton, memory has emerged as a “pivotal component” in tackling the challenges posed by escalating data centre workloads.
“AI has been coming for a long time,” he says. “But what people often don't realise is if you don't solve the memory problem, you don't get to go where AI can take you.”
Overton equates Kove’s focus on performance in modern computing with a focus on cost. “If you have a high level of performance you can adjust that performance to control costs, and with smaller costs comes modular control.” In the world of data centres, Overton says, such a scenario is “beautiful”.
“I think it's going to create a form of edge computing that we’ve not seen before,” an animated Overton says, “because of all the places you can't afford cost inefficiency – where you can't amortise it across the infrastructure – is on the edge.”
He adds: “Data centre people have been struggling to reduce costs for some time. Commodification has been one answer, but even that cannot work anymore.
“Now, people have needs that exceed the limits of what a data centre can physically hold, and the only way to do it is either to be more clever on the software stack or to control the memory and power surfaces.”
Crucially, establishing control over the memory surface removes the limit of how much memory can be housed by a given host server.
“Once you control the memory surface you remove these limits,” says Overton. “So instead of figuring out how to cram computations into a physical platform you can focus on creating a broader and more flexible physical platform that can be deployed on-demand to support arriving needs that cannot otherwise be addressed.”
And this is key, because today’s large language models require enormous amounts of memory. This means that – without solutions like Kove:SDM™ – data centres have little choice but to resort to parallelisation, whereby multiple servers are networked in a cluster. Although this can work well, the power demands are colossal and it can lead to resource stranding, where memory or compute sits idle because it is tied to a particular server. One of the world’s largest supercomputers, Aurora – based at the Argonne National Laboratory – is reported to consume up to 60 megawatts of power at its peak: the equivalent of more than 42,000 US homes, each using a conservative 1,000 kWh per month.
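The household comparison can be checked with a few lines of arithmetic. This is a rough sketch only: it assumes a 30-day month and that the machine runs at its reported peak draw around the clock, neither of which is stated in the source, and the names used are illustrative.

```python
PEAK_POWER_MW = 60.0           # Aurora's reported peak draw
HOME_KWH_PER_MONTH = 1_000.0   # per-home consumption figure used in the article
HOURS_PER_MONTH = 30 * 24      # assumption: a 30-day month


def homes_equivalent(power_mw: float, home_kwh_per_month: float,
                     hours_per_month: int = HOURS_PER_MONTH) -> float:
    """Number of homes whose monthly energy use matches a constant power draw."""
    monthly_kwh = power_mw * 1_000 * hours_per_month  # MW -> kW, then kW x h = kWh
    return monthly_kwh / home_kwh_per_month


homes = homes_equivalent(PEAK_POWER_MW, HOME_KWH_PER_MONTH)
print(f"{homes:,.0f} homes")
```

Under those assumptions the result is 43,200 homes, consistent with the "more than 42,000" quoted above.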
“What if instead you could move memory around to boost computation efficiency by a factor of 100?” asks Overton – rhetorically, because Kove:SDM™ allows memory to be shifted to where it’s needed, instead of having to shoehorn data into a platform.
He adds: “You can take what we do, internalise it, and run things at a significant discount, so that it’s even cheaper than the cloud. All of a sudden, data centres have huge potential.
“Data centre people need to understand there's a new way of thinking around data centres. We need to change the received wisdom that data centres are dead and the cloud is alive.”
Achieving better performance
Having data centres that are alive, kicking and potent will be crucial if businesses are to handle growing workloads while delivering exceptional user experiences.
The traditional way to enhance data centre performance is through hardware upgrades: more powerful servers, greater storage capacity and faster networking solutions. All provide a temporary boost in capabilities, but it’s an approach that can chew through time and money at an alarming rate, especially for large-scale data centres. The initial outlay can strain budgets and impact overall financial health.
Then there are the implementation delays. Ordering, receiving and installing new hardware can take weeks or even months – downtime that results in costly productivity hits in a market where margins are constantly being eroded by cost inflation, especially for energy.
“Buying increasingly expensive equipment is not a sustainable long-term solution,” says Overton. “The need for memory expands quadratically for large-scale AI. So what do you do? Do you keep buying increasingly expensive hardware? We've got 50 years of experience to show that approach doesn't work.”
He adds: “Every aspect of computing – computation, storage, hardware and software – has been virtualised, except for memory … until now.
“But when you look at the cost, latency, speed, performance and environmental challenges associated with today’s data centres, memory is at the heart of many of these challenges.”
Although Compute Express Link (CXL) has promised to improve the way servers deploy memory, it is not without its issues. The technology – an open standard for high-speed, high-capacity data centre connections – is not yet widely available, and broad adoption may take years to materialise. It is also built around hardware assumptions, whereas Kove is software that runs on any hardware.
Because CXL is a nascent technology, compatible hardware and software are still scarce, making it difficult to leverage any of its future capabilities. Relying on CXL for performance is a long-game strategy that will require all-new hardware, which is a problem when the need for more memory is immediate – and pressing. Kove’s design, by contrast, runs on existing hardware and software.
Kove:SDM™ is a more viable solution than CXL
“The difference between CXL and Kove:SDM™ is you can run Kove right now, and on your existing hardware,” Overton says. “This is a huge benefit because nobody's going to replace everything in one fell swoop. CXL is a promise, and even if it stays on schedule it is still several years away.”
He adds: “You can go with CXL, and go down the road of replacing all your hardware. Or you can add Kove, and get better performance and better energy savings now. Kove will support CXL hardware if/when/how/as that becomes available.
“Memory and CPU costs are typically between 65% and 85% of IT investment, yet most servers only achieve between 20% and 40% utilisation.” AI developers frequently target only 25-30% utilisation to avoid swapping to NVMe. With Kove, that constraint goes away: developers can target all local memory and provision additional memory on demand through rules – of any size, unconstrained by the memory physically installed in the server.
Overton points out that Kove:SDM™ provides CPUs with ‘local’ memory performance, even when that memory lives in a data centre. This, he explains, delivers high CPU utilisation rates – “even CPU saturation” – when using remote memory.
Such unlimited dynamic-memory sizing means any-size computation can run completely in memory – and that, says Overton, is “seismic”.
He adds: “Because Kove:SDM™ virtualised memory allows individual servers to draw from a common memory pool, they receive exactly the amount of memory needed, even amounts far larger than can be contained on a physical server.
“When a job is completed, memory returns to the pool and becomes available for use by other servers and jobs, increasing memory utilisation.
“Better utilisation of data centre resources can significantly enhance overall performance, by maximising the efficiency of existing infrastructure and reducing the need for hardware upgrades.”
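The pooling behaviour Overton describes can be pictured with a toy model. The sketch below is purely illustrative: `MemoryPool`, its methods, the server sizes and the job names are all hypothetical and are not Kove's API or implementation. It only shows the bookkeeping idea of jobs drawing exactly what they need from a shared pool and handing it back when they finish.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryPool:
    """Toy model of a shared memory pool (hypothetical, not Kove's API)."""
    capacity_gb: int
    allocations: dict[str, int] = field(default_factory=dict)

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.allocations.values())

    @property
    def utilisation(self) -> float:
        return 1 - self.free_gb / self.capacity_gb

    def allocate(self, job: str, size_gb: int) -> bool:
        """Give a job exactly the memory it asks for, if the pool has room."""
        if job in self.allocations or size_gb > self.free_gb:
            return False
        self.allocations[job] = size_gb
        return True

    def release(self, job: str) -> None:
        """Return a finished job's memory to the pool."""
        self.allocations.pop(job, None)


# Assumption: four servers with 256 GB each, pooled into 1,024 GB.
pool = MemoryPool(capacity_gb=4 * 256)

# A 600 GB job is larger than any single 256 GB server but fits in the pool.
big_job_ok = pool.allocate("genome-assembly", 600)
small_job_ok = pool.allocate("web-cache", 100)
too_big_ok = pool.allocate("llm-training", 400)   # refused: only 324 GB free
utilisation_busy = pool.utilisation

pool.release("genome-assembly")                   # job done, memory returns
retry_ok = pool.allocate("llm-training", 400)     # succeeds on the second try
```

In this model the 600 GB job could never run on a single 256 GB server, and the memory it frees is immediately reusable by the next job rather than sitting stranded on one machine.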
Which brings Overton to a crucial subject: sustainability. Kove:SDM™ means fewer servers are required, which reduces cooling needs and energy consumption, yet still delivers greater processing power. In short, it’s something of a sustainability Holy Grail.
Providing a real-world example of this carbon-crunching solution at work, Overton mentions Red Hat and Supermicro, who conducted extensive memory utilisation testing using Kove:SDM™ on Red Hat OpenShift. He says that Kove:SDM™ consistently delivered power savings of “between 12% and 54%, with zero code changes”.
“I think it's the future of computing,” says Overton. “You cannot deliver on the promises of AI without understanding the memory surface. Talk to anyone doing big-pharma gene-sequencing, or people training LLMs. The biggest limiting factor is always memory.”
He adds: “Because of the speed of light and the distance of cables, the current solution is to bond the timing and the speed of memory to compute. You buy that in a size that you need, and then you multiply that in parallel as many times as you can. And when you take that approach, you need to buy power systems big enough to power cities.
“At Kove, we’re looking at software infrastructure that uses less power. It’s not about more tightly coupled chip design but rather the commodification of chip design. When you understand the memory surface that Kove:SDM™ provides then you can change everything.”
In a world of rapid change, most of which is technology-driven, Overton points out how important it is that progress is not hobbled by the limitations of statically defined hardware.
“Our solution allows organisations’ data scientists to be more productive and to increase the performance of whatever they're trying to accomplish, all while reducing energy consumption.”
He adds that in a world full of volatility and uncertainty, anything that ups performance levels while controlling energy costs is a competitive advantage.
“You could run circles around your competitors, and be more environmentally responsible while doing so. Who would not want to cut data centre energy usage by 50% while being more productive? It’s a real win-win.”
**************
DataCentre Magazine is a BizClik brand