Microsoft & CrowdStrike IT Outage Impact on Data Centres
As Fridays go, this one was not the start to a restful weekend for the tech giant Microsoft or the cybersecurity firm CrowdStrike, as workers across the world opened their laptops to Microsoft’s ‘blue screen of death’.
CrowdStrike admitted that an update to its antivirus software, which is designed to protect Microsoft Windows devices from malicious attacks, had malfunctioned. Microsoft promised it would "work around the clock" to provide "ongoing updates and support", while CrowdStrike CEO George Kurtz apologised.
"CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts... The issue has been identified, isolated and a fix has been deployed," he said.
So how did the data centre industry fare in the chaos?
CrowdStrike's reputation in cybersecurity tarnished by unexpected outage impacting Microsoft Azure
CrowdStrike was featured in our Top 10: Cybersecurity Companies in Data Centre, noted for its advanced cloud-native platform that protects critical areas of enterprise risk, such as endpoints, cloud workloads, identity and data. Its CrowdStrike Falcon platform protects workloads, such as containers, running in all environments, from public and private clouds to on-premises and hybrid data centres.
CrowdStrike Falcon can combat malware and respond to attacks by maximising customer visibility into real-time and historical endpoint security events.
Yet one accident on a quiet Friday morning propelled the company into a PR disaster.
One of the first data centre platforms to feel the heat from the outage was Microsoft Azure, which offers services to on-premises data centres, allowing businesses to build and run hybrid applications. This ensures a consistent Azure experience across private and public clouds, supporting services like IaaS, PaaS and SaaS.
“A backend cluster management workflow deployed a configuration change causing backend access to be blocked between a subset of Azure Storage clusters and compute resources in the Central US region,” Microsoft said in a statement. “This resulted in the compute resources automatically restarting when connectivity was lost to virtual disks.”
The company also said that mitigation had been confirmed for all Azure Storage clusters.
“A small subset of services is still experiencing residual impact. Impacted customers will be continuing to communicate through the Azure service health portal,” the statement concluded.
Seclore warns of single source vulnerability for data centres
For Justin Endres, CRO of Seclore, the issue largely comes down to Microsoft Windows systems and their impact on data centres.
“Google’s compute engine and Azure reported outages which is why we saw banks, airlines and so on all taken offline. North America saw only a fraction of what Asia experienced. CRWD runs at high privilege so the impact is significant. Recovery will be measured in weeks, not hours, given many of the impacted systems will need to be rebuilt manually.”
Justin is clear that where technology diversity is low in the enterprise, single points of failure can create these issues. Single source vulnerability in data centres refers to the risks of relying heavily on a single provider or technology for critical services.
“For all the ‘single throat to choke’ advocates, there won’t be enough neck to choke at CRWD and no one is going to feel better,” he explained. “The single source approach firms are taking, especially with MSFT, is a dangerous one. This incident is an example of that. Single OS/EDR/Cloud instances, just like single source supply chains, are fragile.”
As the outage primarily impacted Windows systems, Brandon Hart, CTO of Everything Blockchain Inc., knew that this would lead to significant issues in data centres.
“Data centres face significant downtime, impacting service availability and reliability due to the BSOD issues,” he said. “The reliance on CrowdStrike for endpoint protection may lead to increased vulnerabilities until all systems are patched, and data centres might need to reallocate resources to manage these outages, affecting other planned activities.”
Data Centre Magazine is a BizClik brand