Data centres consume between 1% and 3% of total worldwide electricity. While the compute, storage, and networking infrastructure uses most of the electricity consumed in a data centre, a significant portion goes to the cooling systems needed to keep those systems running optimally and within their design temperature envelopes.
With the latest generation of CPUs consuming close to 300 watts, and GPUs drawing upwards of 500 watts, air cooling is approaching its limits. Liquid cooling therefore offers a cost-effective way to keep the highest-performing servers operating while reducing ongoing operating expenditure (OPEX).
From a physics standpoint, liquid is far more effective than air at removing heat from a surface, because it is denser and has a higher heat capacity. Think of cooling a bottle of wine quickly: immersed in a bucket of ice and water, the wine cools faster than it would in a home freezer, because far more molecules are in contact with the bottle in liquid than in air.
Comparing Liquid Cooling Options
Supercomputers have been immersed in liquid for more efficient cooling since as far back as 1985. Fast forward to today, and the ability to cool individual CPUs and GPUs with liquid has made its way into many data centre environments.
While there are several methods of cooling CPUs and GPUs with liquid, the following are the most popular.
- Direct to Chip (D2C) – Chilled liquid is pumped through a cold plate mounted directly on the CPU. The warmer liquid then flows to a second cold plate (in dual-socket systems) or back to a unit that chills it again. This method is the most prevalent today and removes heat from systems effectively. Fan speeds can then be reduced, further decreasing the power the data centre spends on cooling servers.
- Immersive – Full, unmodified systems can be entirely immersed in a non-conductive (dielectric) liquid, typically by lowering the whole system into a filled tank. Because the liquid surrounds the CPUs and GPUs, heat is drawn away; the warmer liquid rises and is then cooled externally to the tank. Immersion cooling is arguably the most efficient way to cool modern high-performance servers.
- Rear Door Heat Exchanger (RDHx) – A rear door can also cool the servers' hot exhaust air before it enters the data centre (hot aisle). These doors contain both fans and chilled liquid; after the hot air transfers its heat to the liquid, the liquid must be cooled externally through a heat exchanger. Although this method does not apply liquid to the system directly, liquid remains critical to the data centre's cooling infrastructure.
How Liquid Cooling Reduces OPEX
While data centre costs vary based on several factors, liquid cooling, specifically direct-to-chip (D2C), reduces OPEX in two ways. First, it reduces the power each server uses thanks to lower fan speeds; second, it can reduce, or remove entirely, the need for Computer Room Air Conditioners (CRACs).
A quick comparison of cost metrics for a rack of eight GPU servers with direct-to-chip liquid cooling versus traditional air cooling demonstrates the cost efficiency of D2C solutions.
Calculations can be made to determine the cost savings of implementing a liquid cooling solution. First, add up the total power the entire cluster (or rack) of servers is expected to draw. Liquid-cooled servers run their fans at much lower speeds, so much of the fan power can be subtracted from this total.
The power usage effectiveness (PUE) of the data centre – the ratio of total facility power to IT equipment power – is also critical to this calculation. A liquid-cooled solution will have a much lower PUE, usually in the range of 1.10 to 1.15, while air-cooled data centres have a PUE of around 1.5 at best.
Using these values, and the cost of electricity for the relevant rate plan and location, the savings over time of a liquid-cooled rack versus an air-cooled one can be compared. For a single rack of eight systems, each with two CPUs and eight GPUs, quick calculations put the saving over three years in the £32,000 to £36,000 range.
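The calculation described above can be sketched in a few lines. All inputs here – per-system power, the fan-power saving, the PUE values, and the electricity tariff – are assumed figures chosen for illustration, not measured data; real savings depend entirely on the actual rack and rate plan.

```python
# Illustrative OPEX comparison for one rack, following the steps above.
# All inputs (per-system power, fan saving, PUE values, tariff) are
# assumptions for demonstration purposes.

HOURS_PER_YEAR = 24 * 365  # 8,760

def annual_cost(it_power_kw, pue, price_per_kwh):
    """Annual electricity cost: IT load scaled up by facility PUE."""
    return it_power_kw * pue * HOURS_PER_YEAR * price_per_kwh

# Assumed rack: 8 systems, each with 2 CPUs and 8 GPUs (~6.5 kW each).
air_it_kw = 8 * 6.5              # 52 kW with fans at full speed
liquid_it_kw = air_it_kw * 0.95  # assume ~5% fan-power saving with D2C

price = 0.055  # assumed industrial tariff, GBP per kWh

air_cost = annual_cost(air_it_kw, pue=1.5, price_per_kwh=price)
liquid_cost = annual_cost(liquid_it_kw, pue=1.12, price_per_kwh=price)

savings_3yr = 3 * (air_cost - liquid_cost)
print(f"Three-year saving: £{savings_3yr:,.0f}")
```

With these assumed inputs the result lands in the £32,000 to £36,000 range quoted above; most of the saving comes from the PUE difference rather than the fan power itself.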
Making the Right Choice
Choosing the most appropriate liquid cooling technology – D2C, immersive or RDHx – depends on many factors. One is whether this is a retrofit situation, for example installing new, hotter servers into an existing rack; an RDHx system would be the best choice for that environment. If new servers generate more than 1 kW of heat each, then D2C would be the best option, particularly when a server contains dual CPUs and multiple GPUs.
Immersive cooling may be the best choice for environments where systems are expected to run very hot and where even two-phase D2C cooling may not be sufficient.
When a new physical data centre is being designed, for example, the architects and building managers need to plan for the future and install options for the next generation of servers, which will undoubtedly generate more heat.
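The selection logic above can be summarised as a small decision sketch. The thresholds and category names here are illustrative readings of the guidance in this section, not a formal sizing rule.

```python
# Hedged sketch of the technology-selection guidance above.
# Thresholds and labels are illustrative assumptions, not a standard.

def choose_cooling(retrofit: bool, per_server_heat_kw: float,
                   d2c_sufficient: bool = True) -> str:
    if retrofit:
        return "RDHx"            # reuse existing racks; cool exhaust air
    if not d2c_sufficient:
        return "Immersion"       # very hot systems, beyond D2C capacity
    if per_server_heat_kw > 1.0:
        return "Direct-to-chip"  # dual-CPU, multi-GPU servers
    return "Air"                 # conventional cooling still adequate

print(choose_cooling(retrofit=True, per_server_heat_kw=2.0))   # RDHx
print(choose_cooling(retrofit=False, per_server_heat_kw=2.0))  # Direct-to-chip
```

In practice these factors interact (budget, facility plumbing, fluid handling), so the function is a mnemonic for the decision order rather than a complete model.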
Whatever the preferred solution, liquid cooling will ultimately become a critical technology to keep the latest and future generations of CPUs and GPUs within operating thermal envelopes.
While liquid cooling may be seen as justifiable only for very high-end clusters today, upcoming CPUs and GPUs will undoubtedly require such solutions to reach their full performance.