Holiday outages highlight fragility of data systems

By Joanna England

January 06, 2021


The recent outages and security breaches experienced by Google, Slack and Microsoft highlight the fragile nature of connectivity in a digitalised econom...

Cybercrime has increased exponentially since the COVID-19 pandemic accelerated the digital transformation of businesses.

Now, more than ever before, we are a planet reliant on technology. But according to findings by the University of Maryland, technology platforms are under increasing strain, with hackers attacking systems every 39 seconds, an average of 2,244 times a day.

Google’s services, some of which have more than a billion users, failed in mid-December, affecting most of its flagship products. The internet technology giant has yet to reveal why the outage occurred, although such failures statistically boil down to two causes: human error or cyberattacks.

Functionality has since been restored for most users, who lost access to Gmail, YouTube and other products. Google tools went down in the US, the UK and across Europe, but began functioning again an hour later.

“We will continue to work toward restoring service for the remaining affected users,” the company wrote in a post on its service status page.

Disruptions

Outages are not uncommon. However, Google’s most recent incident is striking because of its pervasiveness.

According to reports, the company’s search function was fully operational, and third-party ads were unaffected. DownDetector, the website that collates user-reported errors on websites, mobile networks, and other platforms, logged tens of thousands of complaints in minutes, including reports that Google’s office tools such as Drive and Meet, Google Maps, and Google smart home products such as Nest were not functional.

The outage also caused Google Home, which controls smart devices like Chromecast and smart speakers, to stop working. Customers who rely on smart services to manage functions like home lighting and heating complained bitterly on Twitter that they’d been plunged into darkness.

“I’m sitting here in the dark in my toddler’s room because the light is controlled by @Google Home. Rethinking... a lot right now,” tweeted one Home Service user.

Endemic

In November, Amazon’s cloud-computing division experienced a failure that affected the ability of customers to access approximately 24 services. The outage affected streaming hardware maker Roku, software seller Adobe and digital photo service Flickr.

The new year also got off to a poor start for the messaging app Slack, which experienced a three-hour service outage on January 4th. The workplace messaging platform’s failure prevented some users from sending messages or connecting to the service.

After three hours of downtime, Slack said in a statement on its status page that it was still working to resolve the issues completely and that some customers “may experience degraded performance”. The statement also included an apology: “Customers may have trouble loading channels or connecting to Slack at this time. Our team is investigating, and we will follow up with more information as soon as we have it. We apologise for any disruption caused.”

Slack has also yet to reveal why its services failed.

Microsoft breach

While service disruption is an inconvenience and results in complaints from users, outages caused by confirmed criminal activity rather than just human error are the biggest concern. For example, the Marriott-Starwood data breach, revealed in 2018 and dating back to 2014, compromised the information of a staggering 500mn consumers.

Microsoft experienced a security breach on December 21st, 2020, when hackers gained access to the corporation’s source code.

The New York Times reported that the SolarWinds hackers had taken advantage of a Microsoft loophole to hack the email system used by the U.S. Treasury Department’s senior leadership. The cybercriminals infiltrated Microsoft Office 365 to create an encrypted “token” that fooled the Treasury’s system into believing the hackers were legitimate users.

The Microsoft Security Response Centre wrote in a blog post, “We detected unusual activity with a small number of internal accounts and upon review, we discovered one account had been used to view source code in a number of source code repositories.”

Microsoft later said the compromised account didn’t have permissions to modify any code or engineering systems. An investigation, it said, also confirmed no changes were made. Microsoft added that it had investigated and remediated the internal accounts with unusual activity.

“We do not rely on the secrecy of source code for the security of products, and our threat models assume that attackers have knowledge of source code,” the corporation’s blog post stated. “So viewing source code isn’t tied to elevation of risk.”

At-risk

Microsoft’s admission came a week after CrowdStrike said suspected Russian intelligence service hackers unsuccessfully attempted to hack the endpoint security firm via a Microsoft reseller’s Azure account. The reseller’s Azure account was used for managing CrowdStrike’s Microsoft Office licenses, and the hackers failed in their attempt to read the company’s email, CrowdStrike said.

The US Cybersecurity and Infrastructure Security Agency (CISA) said the hackers had added authentication tokens and credentials to privileged Microsoft Active Directory domain accounts as a “persistence and escalation mechanism”. CISA said the tokens enable access to both on-premises and hosted resources.

Solutions

With such high-level data breaches and outages, experts predict data visibility and data protection management will be key growth areas in 2021.

According to the cybersecurity firm Forcepoint, working securely has never been so challenging. “To work securely, regardless of location, enterprises will need to introduce real-time user activity monitoring. Cloud-native solutions with a deep understanding of users’ behaviour will deliver permanent solutions, rather than stopgaps,” the company states.

Forcepoint notes that outages caused by human error are usually the result of risky behaviours that any employee might engage in, “like making errors, stockpiling data, or finding workarounds to achieve their goals.”

Nick Savvides, Forcepoint’s APAC senior director of strategic business, believes that the digital transitions of 2020 have altered the ‘traditional’ security perimeter. He told Data Centre News, “Organisations need to understand the cybersecurity implications of these changes to keep themselves and their data safe over the coming months.”

Savvides adds, “Understanding the emerging challenges and creating cybersecurity technologies which can address them, while also remaining ‘invisible’ to the end user and simple for the practitioner to implement, will be key to ensuring the ongoing security of people and data alike.”
