Sep 16, 2021

It’s time to get a grip on data proliferation

Ezat Dayeh
Ezat Dayeh, Senior Systems Engineering Manager, Western Europe at Cohesity, on why understanding and managing your data is a mission-critical concern

Vendors in the technology industry can be guilty of constantly looking to the next big thing and forgetting that it takes a while for the latest technologies to catch on at a business level.

That’s certainly true of technologies like artificial intelligence (AI), Internet of Things (IoT) and machine learning (ML), which are now gaining traction and being implemented in production IT environments as businesses look for ways to improve efficiencies and better serve customer needs. An example of that is Enterprise Resource Planning (ERP) software, which can incorporate IoT data, AI and ML to help with all manner of business processes from monthly financial closure to streamlining production lines. 

However, to take advantage of IoT, AI and ML, data has to be in top class condition. The Covid-19 pandemic may have thrown even the most diligent data managers off course. Now is the time to get a grip on data, and reap the benefits that accelerated digitisation can bring. 

The pandemic’s triple track effect 

The need to equip people to work from home at short notice led to a growth in local data stores early in the pandemic. Those who have not already started to get a grip on data proliferation should do so as a matter of urgency. The need derives from three key effects of the pandemic: an increase in the use of cloud, growth in both the number and speed of digitisation projects, and a rise in customers’ willingness and desire to use online methods for purchasing and communications.

Research from McKinsey makes the pace of change abundantly clear. It found that many aspects of digitisation happened 20 to 25 times faster during the pandemic than firms would have expected in more usual times. The increase in the use of advanced technologies in operations was 25 times faster, as was the increase in the use of advanced technologies for business decision-making. Moving assets to the cloud was found to be 24 times faster.

McKinsey also found that customer interactions have become more digitised than ever, with customer demand for online purchasing or services increasing 27 times faster than firms would expect under more usual circumstances, and customer needs or expectations changing 24 times faster.

Together, these factors create the ideal conditions. More customers online means more data, which means more food for your AI and ML systems to digest and turn into powerful information. Accelerated growth of both cloud and digitisation projects means the nuts and bolts are being put in place.

Barriers to success

Data has exploded in volume and been scattered across a myriad of locations from multiple public cloud environments and data centres to remote offices and the edge, often with little global oversight. At each of these locations, data is isolated in specialised infrastructure for functions such as backup, disaster recovery, audit/compliance, network storage, archiving, dev/test and analytics, often from multiple vendors. 

To make matters worse, there are silos within silos. For example, a single backup solution can require several dedicated infrastructure components, such as backup software, master and media servers, target storage, deduplication appliances and cloud gateways, each of which may hold a copy of a given data source. Moreover, each infrastructure component may come from a different vendor with its own user interface and support contract. It is not unusual to find four or more separate configurations simply to perform backup for different data sources.

These infrastructure silos have a knock-on impact on operational efficiency. There is typically no sharing of data between functions, so storage tends to be overprovisioned for each silo rather than pooled. Likewise, multiple copies of the same data are propagated between silos, taking up unnecessary storage space: according to IDC, 60% of storage budgets go towards storing copies of data alone.

An action plan 

On the flipside, those organisations with great data management in place may already be gaining more insights into customer behaviour than before the pandemic thanks to improved data management systems and increased volume of data. They’ll be ahead of the rest when it comes to implementing communication and customer retention strategies. For everyone else, the legacy laggards, an action plan is needed, built firmly around maximising the use of data. 

The first step will be to take stock. Find out what has changed since before the pandemic struck: where your data is stored, its quality and relevance, whether there is duplication, who has access to what, whether strategies are in place for collection and retention, and whether policies are applied and, if so, whether they are the right ones. All of this information can be used to inform what happens next.
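As one illustration of the duplication check in such an audit, the short Python sketch below groups files by a hash of their contents and estimates how much space the redundant copies occupy. It is a minimal example under stated assumptions, not a description of any vendor's tooling: the function names (file_digest, find_duplicates, wasted_bytes) are hypothetical, and it only covers files reachable on a single file system, not copies held in backup appliances or cloud services.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by content hash.

    Returns only the groups that contain two or more files,
    i.e. files whose contents are byte-for-byte identical.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so the same file is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


def wasted_bytes(duplicates):
    """Bytes taken up by redundant copies (all but one file per group)."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Calling find_duplicates on a shared folder and passing the result to wasted_bytes gives a rough first figure for how much storage is taken up by copies, which can then inform the retention and policy questions above.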

Running alongside this data audit, it will be important to think about where you want to be in a year’s time, and in five years’ time. Work out what you want from your data and how AI, ML and, if it is relevant to your operation, IoT can help get you there. Define the data and cloud-based needs you have to reach these goals.

Armed with an understanding of where you are and where you want to be, it is time to set out the route map to your goal. This is where data silos get eliminated, the choices between cloud and on-premises storage become clear, and you reduce the likelihood of duplicate data. For some, the route map might require attention to the organisation as well as to its technology. Where workgroups have traditionally handled their own datasets, and may have been free to develop this approach during the pandemic, they’ll need to become more attuned to a centralised approach.

Data is a critical element of innovation, but in practice, few organisations are using their data as a strategic asset. Many IT teams struggle simply to meet basic SLAs for protection and availability, let alone leverage their data for competitive advantage. The challenges brought about by the pandemic have provided those on the front foot with gains and opportunities; for everyone else, it’s been one fire drill after another.

Working through all of this will be easier for some organisations than others, but the fact is that data-led analytics and innovation that makes the most of AI, ML, IoT and other techniques can’t work optimally unless there is access to the highest quality data. Many organisations have been given a gift of data during the pandemic. Now is the time to use that gift wisely.
