The exponential growth of data could become a serious stumbling block for enterprises, government organisations, and data centre operators. The amount of data generated around the world every day is estimated at around 2.5 quintillion bytes (that’s 25 followed by 17 zeroes) - roughly 2.5 million terabytes, or hundreds of millions of HD copies of Dances With Wolves starring Kevin Costner. That figure is increasing at an alarming rate each year, with the total data created, copied, stored, analysed, and consumed in 2025 estimated to hit 181 zettabytes. You don’t want to know how many millennia worth of Kevin Costner movies that equates to.
To complicate matters further, all that data is becoming increasingly valuable. Go back a decade, and the vast majority of information gathered by sensors, programs, and the like was discarded. Now, data means money. The more data you have, the more you can analyse it to track and predict trends, schedule maintenance, be more efficient, train machine learning engines, and predict consumer behaviour.
“Just look at professional sports. It’s a hot-bed for next-generation technology and data at the moment. As each season passes, hours upon hours of digital content is created. From tracking player statistics and performance analytics, to video footage capturing each piece of the action in every single game, from multiple camera angles, in stadiums across the globe. That’s a lot of data and it all needs to be stored somewhere,” explains Davide Villa, Director of Business Development EMEAI at Western Digital, a California-based firm that manufactures hard disk drives.
The problem with figuring out where and how to store data is that the right course of action is hugely dependent on what you actually plan on doing with that data. In the world of sports, Villa explains, “in order to deliver data-rich files minute by minute, while continuing to capture live-action, data management teams must decide where to store each type of data, depending on how quickly and how often they need to access it.” Storing data that needs to be accessed often and quickly on a reel of tape creates backlog and inefficiency; storing everything on a disk drive or flash array can eat up an eye-watering amount of electricity, regardless of whether you’re accessing that data or not.
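The placement decision Villa describes, weighing how quickly and how often data is needed, can be sketched in a few lines of Python. This is only an illustration: the `Dataset` class, the `choose_tier` function, and every threshold below are hypothetical and do not come from Villa or Western Digital.

```python
from dataclasses import dataclass

# Illustrative thresholds only; none of these numbers come from the article.
FLASH_MAX_WAIT_S = 1.0       # needs a near-instant response
FLASH_MIN_READS = 100        # read often enough to justify flash
TAPE_MIN_WAIT_S = 60 * 60    # can tolerate an hour or more of retrieval time
TAPE_MAX_READS = 1           # touched about once a month or less


@dataclass
class Dataset:
    name: str
    reads_per_month: float   # how often the data is accessed
    max_wait_seconds: float  # how long a requester can wait for it


def choose_tier(ds: Dataset) -> str:
    """Pick a storage tier from access frequency and tolerable latency."""
    # Rarely read and not urgent: the slow, cheap medium is acceptable.
    if ds.max_wait_seconds >= TAPE_MIN_WAIT_S and ds.reads_per_month <= TAPE_MAX_READS:
        return "tape"
    # Read constantly and needed immediately: pay for the fastest medium.
    if ds.max_wait_seconds <= FLASH_MAX_WAIT_S and ds.reads_per_month >= FLASH_MIN_READS:
        return "flash"
    # Everything in between lands on disk.
    return "disk"


if __name__ == "__main__":
    examples = [
        Dataset("live replay clips", reads_per_month=5000, max_wait_seconds=0.5),
        Dataset("last season's footage", reads_per_month=10, max_wait_seconds=5),
        Dataset("compliance records", reads_per_month=0.1, max_wait_seconds=86400),
    ]
    for ds in examples:
        print(f"{ds.name}: {choose_tier(ds)}")
```

A real policy would also weigh cost per terabyte, power draw, and retention rules, but the shape of the decision is the same: the colder the data, the slower and cheaper the medium it can live on.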
“This isn’t just a problem for the sports industry of course. Experts estimate that global data streams are growing around 30% annually, potentially generating 175 ZB by 2025,” Villa continues. “While not all of that data needs to be analysed right away, storing it is essential, and that’s where cold storage comes into play.”
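To get a feel for what growth of around 30% a year means, the compounding can be worked through in a short Python sketch. It assumes the rate stays constant, which the estimate quoted above does not claim, and the function names are purely illustrative.

```python
import math


def project_volume(start_zb: float, annual_growth: float, years: int) -> float:
    """Compound a starting volume (in zettabytes) forward by a fixed annual rate."""
    return start_zb * (1 + annual_growth) ** years


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for the volume to double at a fixed annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # Hypothetical extrapolation: 175 ZB growing 30% a year for three more years.
    print(f"{project_volume(175, 0.30, 3):.1f} ZB")
    print(f"doubles every {doubling_time_years(0.30):.1f} years")
```

At a steady 30%, the volume doubles in a little over two and a half years, which is why storage capacity and cost come under pressure so quickly.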
What is Cold Storage?
In data storage, “cold storage” isn’t a literal reference to temperature. It refers instead to methods of retaining “data that is not actively in use,” Villa explains. Depending on the medium used, cold storage can cut the cost of retaining huge sets of typically unstructured data, to differing degrees, at the expense of recovery time. The upshot is that, if you have exabyte upon exabyte of financial records, benchmarking data, compliance information, or anything else you need to hang onto for a very long time but don’t need to use every day, cold storage can hold it more efficiently and, most importantly, more cheaply.
“The more data is stored, the more it costs,” says Villa. “As we reach the Zettabyte Age, cold storage is a segment of storage that is only increasing in demand. According to industry analysts, 60% or more of stored data can be archived or stored in cooler storage tiers until it’s needed. As the world generates and stores more archival data than ever before, cold storage is becoming the fastest growing segment in the industry. In fact, cloud providers are reinventing their architectures with accessible archives to keep pace and ensure effective management.”
Different Types of Cold Storage
There are different media and “tiers” of cold storage, each with its own benefits and drawbacks. Villa explains that “until recently, most secondary cold storage has been contained on either tape or hard disk drives, with hot data moving to solid-state drives.” Now, however, with archival data predicted to represent as much as 80% of all captured data worldwide by 2025, figuring out how to meet that demand, even with hyper-dense storage media like tape, “presents the next great storage challenge.”
Right now, though, if you ask David Barker, Founder and Chief Technology Officer at 4D Data Centres, “Tape is still the king of long term data storage, especially if reliability of that data years down the line is your top priority and immediate access isn’t required.” We’ve written before at Data Centre Magazine about the unexpected renaissance of tape storage - driven by its low cost, high storage volume, and the ease with which you can secure it against attacks.
“Tape often gets overlooked as an old technology which has been surpassed by non-volatile memory such as SSD or even ‘the cloud’; but if you want a medium that is very data dense (you can store a lot of data in a physically small space), relatively quick (but not instant) to retrieve data from, doesn’t require any power to maintain the data and is incredibly reliable in the right conditions – then tape is your answer,” says Barker. “You can recover data from tape that is 30 or 40 years old without too much trouble. Try finding a hard drive or NAND drive that will still work after 20 years reliably; even non-volatile memory can degrade and get ‘bitrot’ after being left unpowered for a few years.”
Despite the en masse migration to public cloud, Barker points out that, depending on the nature of your data, tape “can be a lot more cost effective than paying Amazon or Microsoft to store your data for years and it’s a lot more reliable than the USB hard drive someone keeps under their desk.”
However, because tape has read times measured in minutes or hours, as opposed to the seconds and milliseconds when using disk or solid state drives, it’s “an option only for very cold storage,” Villa explains. “Retrieving data from tapes can take several hours, while from HDD it takes milliseconds. HDDs are evolving to next-generation disk technologies and platforms to improve both the cost of ownership and the accessibility for active archive solutions. Recent advancements in HDD technology include new data placement technologies like zoning, higher areal densities, mechanical innovations, intelligent data storage, and new materials innovations.”
Written by Harry Menear