How do you optimise the operating costs of a data centre?
Infrastructure is the biggest single cost in making a data centre ‘rack ready’, at an estimated one third of the whole-life cost, but the ongoing operating costs (energy, maintenance and replacement of components) make up the other two thirds. Managing these expenses is therefore critical to controlling costs throughout the operating life of the facility. When you consider that the capital cost of a 100,000 square foot N+1 data centre can exceed US$300mn, the operating costs and lifespan of each individual element become key to maintaining an efficient and cost-effective data centre.
Typically, maintenance and replacement regimes rely upon manufacturers' recommendations combined with general building services and engineering standards, which provide an indicative economic life expectancy. However, most new data centres and refurbishments are designed to be highly resilient, future-proofed and capable of much larger workloads. At the same time, new technologies and methodologies are used to meet performance requirements and tackle efficiency, and some of these have less historical data on which to base accurate lifecycle guidance.
For example, what if the data centre in question started its life operating at 50% of the planned maximum workload? How would the individual elements of the infrastructure (the switchgear, UPS, redundant chillers, generators and so on) perform? As an analogy, would you replace your car after two years with only 10,000 miles on the clock, or wait until you got to 30,000? And what if changing early from a diesel to an electric vehicle reduced your total cost of ownership?
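The ‘replace early or wait’ question can be put into numbers. The sketch below is a minimal, undiscounted comparison and not a recommended costing method: the Option fields, both function names and every figure in it are hypothetical placeholders chosen for illustration, and a real business case would also need to consider discounting, residual value, downtime risk and disposal costs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Option:
    """One replacement strategy. All figures used with it here are hypothetical."""
    name: str
    upfront_cost: float        # capital outlay at the start of the period
    annual_energy_cost: float  # running cost per year
    annual_maintenance: float  # servicing cost per year

    @property
    def annual_cost(self) -> float:
        return self.annual_energy_cost + self.annual_maintenance


def total_cost_of_ownership(option: Option, years: int) -> float:
    """Undiscounted cost of owning and running the asset over `years`."""
    return option.upfront_cost + option.annual_cost * years


def break_even_years(keep: Option, replace: Option) -> Optional[float]:
    """Years until replacing becomes cheaper than keeping, or None if it never does."""
    annual_saving = keep.annual_cost - replace.annual_cost
    if annual_saving <= 0:
        return None  # the replacement costs as much or more to run
    extra_capital = replace.upfront_cost - keep.upfront_cost
    return max(extra_capital / annual_saving, 0.0)


# Hypothetical example: keep an existing chiller or replace it early.
keep = Option("keep existing chiller", 0, 120_000, 30_000)
replace = Option("replace early", 400_000, 70_000, 15_000)

for option in (keep, replace):
    print(option.name, total_cost_of_ownership(option, years=10))
print("break-even (years):", break_even_years(keep, replace))
```

With these placeholder figures the early replacement pays for itself partway through a ten-year horizon; with a smaller running-cost saving it may never do so, which is exactly the judgement the car analogy is pointing at.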
So, how do you know when to replace components or critical infrastructure? Data centres are designed with a life expectancy, and the associated components and M&E infrastructure are likely to have an assumed operating lifespan of, say, between 10 and 30 years. However, this is based upon the designers' and manufacturers' assumptions and can often rely on maintenance life expectancy data. This is really useful, but it is adapted from guidance such as CIBSE Guide M, so how can we be sure it provides the appropriate inputs for data centres? We can't, although we do know that it provides a ‘safe’ set of assumptions as a fallback. As experienced data centre professionals, we also know that blindly following these data tables leads either to the removal and disposal of perfectly good working components or to unexpected costs to maintain performance and operation. It is therefore important that we combine different sources of data to make robust lifecycle plans and to give ourselves the tools to determine the best course of action, which often involves both technical and commercial considerations.
Operational management tooling
Tools such as data centre infrastructure management (DCIM) and digital twin packages, although they have been around for a number of years, are now well placed to tackle operating costs and should be viewed as enablers for optimising the data centre. Software packages that allow operators to combine the ‘generic’ guides with service history, condition reports and actual utilisation provide a much richer set of data for making informed decisions, often with the added benefit of scenario modelling to support the business case for investments.
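To illustrate what combining these sources might look like, here is a deliberately simple sketch. It is not how any particular DCIM or digital twin product works, and the weighting is not taken from CIBSE Guide M or any other guide: the AssetRecord fields, the linear wear assumption and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssetRecord:
    name: str
    guide_life_years: float  # indicative life from a generic guide (placeholder value)
    age_years: float         # time in service
    avg_utilisation: float   # mean load as a fraction of rated capacity, 0 to 1
    condition_score: float   # latest inspection result, 0 (failed) to 1 (as new)


def estimated_remaining_life(asset: AssetRecord) -> float:
    """Illustrative estimate only, combining guide life, utilisation and condition."""
    # Assumption: wear accrues in proportion to utilisation, so an asset run at
    # half load for 10 years has 'used' 5 years of guide life. Real equipment
    # rarely ages this linearly.
    effective_age = asset.age_years * asset.avg_utilisation
    remaining_by_usage = max(asset.guide_life_years - effective_age, 0.0)
    # Assumption: a poor condition report caps the estimate, however light the usage.
    remaining_by_condition = asset.guide_life_years * asset.condition_score
    return min(remaining_by_usage, remaining_by_condition)


# Hypothetical assets: a lightly loaded UPS and a generator in poor condition.
ups = AssetRecord("UPS A", guide_life_years=20, age_years=10,
                  avg_utilisation=0.5, condition_score=0.8)
generator = AssetRecord("Generator 1", guide_life_years=25, age_years=20,
                        avg_utilisation=0.9, condition_score=0.1)

for asset in (ups, generator):
    guide_only = max(asset.guide_life_years - asset.age_years, 0.0)
    print(asset.name, "guide only:", guide_only,
          "combined:", estimated_remaining_life(asset))
```

Under these assumptions the guide-only figure would retire the lightly loaded UPS earlier than its usage suggests, while the condition report flags the generator for attention sooner than its age alone would. Running the same calculation over different utilisation forecasts is a crude form of the scenario modelling mentioned above.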
Encouragingly, recent advances in mechanical and electrical components and infrastructure mean the operator has to make fewer compromises or trade-offs when selecting a given technology or approach. For example:
Highly efficient cooling technologies are now a given and have continued to improve, with more efficient heat transfer processes that are less capital and/or operationally intensive. Battery technologies continue to make strides, with lithium-ion batteries for UPS systems having twice the operational life and lower maintenance requirements. Active control systems can dynamically optimise the infrastructure in line with demand whilst maintaining operator service levels.
It is fair to say that, against the backdrop of a 24x7x365 dynamic environment, an industry skills shortage and a sector which embraces innovation, AI can also play a part. In 2020 we launched KAI (Keysource Artificial Intelligence), a new facilities maintenance service which utilises the latest machine learning/artificial intelligence technology to help organisations transform their edge operational management strategy. It has been designed to enable companies to make data-led decisions around key issues such as reducing capital expenditure and unplanned downtime or gaining greater control over their edge infrastructure.
Whilst this revolutionary approach using proprietary AI can enable businesses to remove the guesswork from whether and when to replace components, and to reduce unnecessary operational spend on perfectly usable items, it is one of a number of solutions to an ongoing industry issue.
So, in conclusion, how do you design and maintain a data centre without industry-specific guidelines?
We know from experience that constant monitoring, such as that provided by KAI, and an accurate view of the historical demand and load, coupled with adherence to a regular maintenance schedule, provide some very useful insights into the capabilities of individual components. As an industry we should be able to define the operating lifespan of critical components such as the switchgear, UPS and generators. The unit of measure might not be as simple as miles driven or hours operated, but with all of our collective experience and accurate data-driven expectations, metrics can be developed, used in future design and maintenance schedules, and fed back to manufacturers to ensure the required performance is understood and delivered.
Or is it time for our industry to formalise the design and operating expectations of some of these critical components? Without developing an agreed and accurate set of performance data, we will struggle to meet some of the efficiency targets set by our clients and to match the net zero expectations that are being set across the industry.