The data centre is an essential piece of infrastructure in the modern enterprise. Data centres store, process and redirect the information that is critical to the operation of hospitals, governments, IT services, financial institutions and virtually every aspect of an organisation doing business in Industry 4.0. From cooling and power usage effectiveness (PUE) to disaster planning and security, managing these facilities is a complex and ever-evolving task.
“The data centre is seen as a mission-critical facility worldwide. The sensitivity of data storage and availability continues to increase as data centres support many mission critical verticals ranging from hospitals, to Internet-based services such as banking and financial, to healthcare and industrial markets,” asserts , a Senior Industry Analyst at Frost & Sullivan. “However, reliability is the single most critical factor that has driven an increase in data centre complexity, beginning with high power consuming IT equipment such as servers, to the equipment that comes under the purview of facilities management.”
This mission-critical mindset is becoming increasingly essential for data centre facilities managers and their teams. In a , Robert Wooley, SVP of Critical Environment Services, and Patrick Donovan, a Senior Research Analyst for the Data Center Science Center at wrote that “Managing and operating a mission critical facility is very different from managing a commercial office building or a factory. For most data centres, failure is not an option. Some liken it to ‘maintaining an airplane while flying it.’ Today, businesses are often either wholly dependent on their data centre or the data centre is the business.” This month, Data Centre Magazine looks at the current state of data centre facilities management, examining operational best practices and some of the companies that set the bar for excellence in the field.
Policies and protocols for every eventuality
A data centre is a complex, finely tuned instrument. Ensuring reliability, agility and resilience is a top priority for industry operators if they want their facilities to function efficiently and with as little disruption as possible. In order to best pursue this goal, data centre operators can benefit from clearly established policies and protocols.
“Policy and protocols must be drafted that govern related to the critical environment,” notes . The first of these is access control: facilities managers set clear guidelines and permissions for who may enter the physical (and digital) space, so that facilities management operatives know who is accessing the data centre, when they are doing so and for what reason.
Rimer adds that “Access control feeds into change management, which provides a clear communication channel for informing affected parties of planned (and unplanned) maintenance, upgrades, moves, and alterations.” This change management process keeps relevant parties informed of disruptions, as well as planned and unplanned maintenance. “The change management process identifies potential risks and related mitigation strategies, coordinates access, and provides an opportunity for stakeholders to ask questions to ensure bases are covered (e.g., placing the fire suppression system in bypass during an underfloor cable install),” he continues. “This level of communication helps to prevent mishaps, overbooking activity in a space, and stepping on each other’s toes. While this adds an administrative burden, any pain it creates is considerably less than the pain that would be felt from an incident that could have been avoided if protocol and process were in place.” He concludes by noting that critical environments require handling with great diligence and dedication if the facility is to avoid disruption.
Efficiency in all things
As data centre operators find themselves increasingly pressured to reduce waste and environmental impact, while also catering to an ever-expanding customer base, the margin between a well-managed data centre and one that could be accused of gross inefficiency is growing narrower by the day. Wooley and Donovan note that “[The] drive for energy efficiency is reducing capacity safety margins and system redundancy, increasing the importance of proactive maintenance and data centre infrastructure management.” As such, facilities managers need a fully transparent view of their operation, and must use that visibility to strive constantly for greater efficiency in areas like cooling, PUE and network architecture. They add: “Accurate and consistent tracking of all critical facility assets is the foundation of a good maintenance program. While a well maintained asset database provides the building blocks for effective maintenance, an inaccurate one will result in inefficiency or even equipment failures.”
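To make the PUE figure concrete: it is the ratio of the total energy a facility draws to the energy consumed by its IT equipment, so a value closer to 1.0 means less overhead from cooling, power distribution and lighting. The snippet below is a minimal sketch of that calculation; the function name and the meter readings are hypothetical, chosen only for illustration.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical monthly meter readings (illustrative numbers, not real data).
readings = {"total_facility_kwh": 150_000.0, "it_equipment_kwh": 100_000.0}

monthly_pue = pue(**readings)
# Energy spent on everything that is not IT load (cooling, losses, lighting).
overhead_kwh = readings["total_facility_kwh"] - readings["it_equipment_kwh"]

print(f"PUE: {monthly_pue:.2f}, non-IT overhead: {overhead_kwh:.0f} kWh")
```

Tracking this ratio over time, rather than as a one-off reading, is what lets a facilities team see whether changes to cooling or power distribution are actually paying off.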
One solution, paradoxically, is to schedule equipment replacements and upgrades before the equipment has broken down. The in the data centre industry means that, for the time being, the operational lifetime of a piece of equipment - be it a network switch, a cooling fan or an air handler - is longer than its effective lifecycle. In short, new technology is improving in efficiency so quickly that regularly upgrading data centre components yields a greater net efficiency than running old parts into the ground before buying new ones.
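The trade-off described above can be sketched as a simple cost comparison over a fixed planning horizon. Every figure here (purchase cost, annual energy use, electricity price, horizon) is a hypothetical placeholder, and the model deliberately ignores factors such as maintenance, downtime risk and disposal, so treat it as a back-of-the-envelope illustration rather than a procurement method.

```python
def total_cost(purchase_cost: float, annual_kwh: float,
               price_per_kwh: float, years: int) -> float:
    """Purchase price plus energy cost over the planning horizon."""
    return purchase_cost + annual_kwh * price_per_kwh * years


YEARS = 5      # hypothetical planning horizon
PRICE = 0.15   # hypothetical electricity price per kWh

# Option A: keep the existing, less efficient unit (no new purchase).
keep_old = total_cost(0.0, annual_kwh=40_000, price_per_kwh=PRICE, years=YEARS)

# Option B: replace it now with a more efficient unit.
replace_now = total_cost(8_000.0, annual_kwh=25_000, price_per_kwh=PRICE, years=YEARS)

saving = keep_old - replace_now
print(f"Keep old: {keep_old:.0f}, replace now: {replace_now:.0f}, saving: {saving:.0f}")
```

With these assumed numbers the early replacement comes out ahead; with a smaller efficiency gain or a higher purchase price it would not, which is why the comparison needs real asset data behind it.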
Remote monitoring during COVID-19
As the current pandemic wears on, many data centres are having to continue operations with reduced onsite teams, which has the potential to lead to oversights, system malfunctions and even loss of uptime. In these times, ensuring that facilities management teams have remote access to status reports from critical systems is essential.
In , Michael Fluegeman, Principal and Data Center Support Systems Manager at PlanNet Consulting, assessed the state of remote systems monitoring as the pandemic continues to disrupt industry players. “Newer equipment can push status, loading, and alarms to the building automation system and directly to PCs and smartphones when they are networked. Many devices provide too much information, which needs to be winnowed down to what is important. Getting remote status on older equipment can be more challenging; upgrades may be available, but it may be more cost effective to refresh the equipment at the early range of reliable life expectancy,” he explains, adding that, “if you are going to rely more heavily on remote monitoring, find out whether there is enough bandwidth to allow for remote access and whether effective security protocols are in place.”
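Winnowing an alarm feed down to what matters can be as simple as filtering on severity before anything reaches a phone or inbox. The following is a minimal sketch under stated assumptions: the `Alarm` record, the four-level severity scale and the device names are all hypothetical, and a real building automation system would expose its own data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    source: str
    severity: int  # hypothetical scale: 1 = informational ... 4 = critical
    message: str


def winnow(alarms, min_severity=3):
    """Keep alarms at or above min_severity, most severe first."""
    kept = [a for a in alarms if a.severity >= min_severity]
    return sorted(kept, key=lambda a: a.severity, reverse=True)


# Hypothetical feed pushed by networked equipment.
feed = [
    Alarm("CRAH-2", 3, "Supply air temperature high"),
    Alarm("PDU-7", 1, "Routine load report"),
    Alarm("UPS-A", 4, "Running on battery"),
    Alarm("Door-3", 2, "Door held open"),
]

important = winnow(feed)
for alarm in important:
    print(f"[{alarm.severity}] {alarm.source}: {alarm.message}")
```

The threshold is the design choice that matters: set it too low and remote staff are flooded, set it too high and an early warning is missed, so it is worth reviewing against past incidents.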
A culture of continuous improvement
Even at a time when remote work is more common than ever, and automation is minimising the need for human interaction with facilities, relatively few data centres are fully ‘lights out’ operations. “Humans are still required to install, maintain, and operate data centre facility systems,” note Wooley and Donovan. “Eliminating human error as the number one cause of system interruptions requires the hiring and development of competent, team-oriented people who embody the mission critical mentality.” Facilities management companies need to instil a culture of continuous improvement and education throughout their teams and executives if they are to continue operating without disruption. “Maximising availability and minimising human error in the critical systems environment depends, in large part, on well trained staff,” they note.
Unfortunately, the US and the UK are experiencing a well-documented skills shortage in the tech sector (the data centre industry is no exception), so companies may need to work on developing existing staff, while fighting even harder to attract new talent. “This has engendered a culture where businesses must fight to attract and retain the best talent, leading to inflated salaries, longer recruitment times, higher training costs and a rise in temporary staff as a short-term solution,” notes a recent by UK-based . The report goes on to add that Virtus itself is “a strong advocate of ongoing employee development - and we believe that is what is needed here. As the UK accelerates the adoption of digital technologies, all employees will require continuous training and retraining in order to build the skills needed by their organisations and apply them effectively, and this is also true in infrastructure management.”