What is 'dark data' and how could it be impacting your cloud migration?


Multi-cloud solutions are no longer simply an alternative for enterprise data storage, but are rapidly becoming part of the new normal in the business world. By utilising public, on-premises, private and non-cloud infrastructure, organisations are able to keep better control of their data and overall business strategy.

However, more and more companies are beginning their migration to the cloud with no idea of the hidden pitfalls that could derail them further down the line. One of these is the growing issue of 'dark data', and the impact it can have on a company's cloud migration.

What is dark data?

Dark data is information that is collected, processed and stored, and then most likely never used again. Because it is hidden, IT departments have no idea whether it contains sensitive information or data that should have been deleted long ago, which makes it a ticking time bomb of potential issues.

"Similar to dark matter in physics, dark data often comprises most organizations' universe of information assets," Gartner describes. "Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value."

According to Veritas' estimates, almost half of the information stored on the secondary data stores used for migration to the cloud is 'dark': unlabelled data that could potentially land the organisation in hot water with government regulators.

The biggest problem with this kind of unknown data is that businesses are blind to what it includes. It's likely that customer and employee information forms part of it, and this leads to trouble with personally identifiable information (PII).

In an ever-changing digital landscape, enterprises must be vigilant about complying with rules and regulations around data storage and management.

Other issues with dark data

Dark data is considered one of the new types of risk that have sprung up from industry-wide changes in how data is handled, with Veritas describing it as a threat "below the waterline" that forms "Databergs". Veritas' findings also indicate that UK businesses are among the worst offenders, behind only Germany (66 per cent), Canada (64 per cent) and Australia (62 per cent).

Of course, some of this data could be valuable to the company, but no one can determine this while it is unlabelled and undiscoverable.

Aside from the legal issues it can bring, transferring dark data is also not cost effective. Companies that don't clean up their redundant data before beginning the migration process end up paying for storage they don't need, which wastes both time and money.

Too many businesses are tackling the problem of dark data or "Databergs" with additional storage, but this is an expensive and inefficient fix. Data must be better managed, classified and - if necessary - deleted from the get-go, ensuring a smooth transition to the ideal multi-cloud solution.
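As a rough illustration of what a first clean-up pass might look like, the Python sketch below walks a directory and lists files that have not been modified for a given number of days, along with how much storage they occupy. This is only a minimal sketch under stated assumptions: the function names `find_stale_files` and `summarise` are hypothetical, the one-year threshold is arbitrary, and last-modified time is just a crude proxy for data being 'dark'. It does not classify content or detect PII, which would need dedicated tooling.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days=365, now=None):
    """Return (path, size_bytes, age_days) for files under root that have
    not been modified within max_age_days.

    Assumption: modification time stands in for 'last used'. That is only
    a rough heuristic, not a definition of dark data.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // SECONDS_PER_DAY)
                stale.append((str(path), info.st_size, age_days))
    return stale


def summarise(stale):
    """Return (file_count, total_bytes) for a list from find_stale_files."""
    total_bytes = sum(size for _path, size, _age in stale)
    return len(stale), total_bytes
```

Running `summarise(find_stale_files("/path/to/share"))` against a file share (the path here is a placeholder) gives a quick sense of how much storage is tied up in untouched files, which could then be reviewed, classified or deleted before migration rather than simply copied to the cloud.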

Caroline Preece

Caroline has been writing about technology for more than a decade, switching between consumer smart home news and reviews and in-depth B2B industry coverage. In addition to her work for IT Pro and Cloud Pro, she has contributed to a number of titles including Expert Reviews, TechRadar, The Week and many more. She is currently the smart home editor across Future Publishing's homes titles.

You can get in touch with Caroline via email at caroline.preece@futurenet.com.