Moving to the Azure cloud: unpacking dark data

Moving to the Azure cloud?

Today, more and more businesses are moving to the cloud – to automate, to take advantage of AI and scalable storage, and to reduce costs compared with legacy infrastructure. In fact, in 2021, an estimated 19.2% of large organizations made the move to the cloud. Microsoft Azure is close to leading that shift, with 60% market adoption.

Organizations often focus on selected applications during a cloud transition. However, existing data might actually present the bigger challenge. A majority of organizations use less than 50% of the data they own, and at the same time have no clear overview of that data. This unused, unclassified, and unlabeled data is known as “dark data”, because it remains in the shade until enough time is allocated to sort, label, and classify it.

Moving to the Azure Cloud is Like Moving House

We believe there is merit in comparing a move to the Azure cloud with moving house. You decide where to move, you choose your new infrastructure, and you get everything ready to move in. Then, you pack up your old belongings and move them with you. The problem is you likely already have plenty of boxes lying around. Think about your attic, your basement, and storage: things from earlier relocations. You might have lost all knowledge of what’s in there. The same holds true when your organization’s applications and data must move house. But this time you also have to deal with ‘boxes’ of data left unlabeled by people who have left the organization, data left unused for a long time, and data left behind by obsolete applications. Moving this and other poorly understood data may create bigger issues in the future.

  • Data is accumulating faster than ever before. You’ll have more of it tomorrow. Therefore, now is the best time to go through your data and categorize it.
  • Proper governance of data is impossible without knowing its contents first. Older data collected before the GDPR came into force is still there. Compliance and Risk officers and CISOs dread this unknown data and fear it may not comply with current regulations.
  • It can be difficult to pass regulatory compliance audits with dark data around. If you can’t open a ‘box’ of data to show auditors what’s inside, you can’t prove you’re compliant.
  • You’re also not allowed to simply delete data. Industries and governments must comply with laws and regulations on archiving and maintaining open data.
  • When you know what data you have, you can plan deliberately and make controlled decisions on cold/warm/hot storage to optimize both costs and access. Moving data that is still dark may lead to irreversible data loss, or at least to expensive repairs in the future.
  • Locating and accessing data requires the kind of information best captured in classifications and labels. Historical data analysis needs this metadata too.
  • Dark data leaves organizations vulnerable, because it makes designing and taking security precautions extra hard.
  • Sometimes you can or must delete information. However, you can only do so if you know its contents beforehand, can determine what regulatory compliance requires, and can judge whether the data might be valuable for future analytics.
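The storage point in the list above can be made concrete with a minimal sketch. It assumes each item carries a label (or none, if it is still dark) and a last-accessed date. The function name `suggest_tier`, the tier names, and the 30-day and 180-day thresholds are hypothetical choices for illustration, not settings taken from Azure or from any product.

```python
from datetime import date
from typing import Optional


def suggest_tier(
    label: Optional[str],
    last_accessed: date,
    today: date,
    warm_after_days: int = 30,    # hypothetical threshold
    cold_after_days: int = 180,   # hypothetical threshold
) -> str:
    """Suggest a storage tier for one item of data.

    Data without a label is treated as dark: it is flagged for review
    instead of being assigned a tier, since its value and retention
    requirements are unknown.
    """
    if label is None:
        return "needs-review"

    age_in_days = (today - last_accessed).days
    if age_in_days < warm_after_days:
        return "hot"
    if age_in_days < cold_after_days:
        return "warm"
    return "cold"
```

The point of the sketch is the first branch: a tiering rule can only be applied once the data has a label, which is why classification has to come before the move.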

How can you optimize access to this data? When one of our clients, the Drents Overijsselse Delta Waterschappen, looked at archiving and storing its past project documentation in the cloud, it found the necessary manual labeling a daunting task. The massive time investment needed is very similar for other organizations making a cloud transition. Manually reviewing data is simply too labor-intensive for most organizations to undertake within a feasible timeframe.

Unpacking Data with Synerscope’s Ixivault

With Synerscope, you can achieve the data clarity you need. Built on weakly supervised AI, our solutions are designed to perform where standard AI approaches would fail. Synerscope’s Ixivault is deployed in your Azure tenant, with no backend of its own. This means that all data stays inside your tenant, which is a big plus for security, governance, and compliance. Our frictionless implementation then allows you to open up, categorize, and label dark data, combining machine learning with manual review to speed up the full process by an average of 70%.

Ixivault analyzes your full data pool of structured and unstructured data, creating categories based on data similarities, pulling keywords and distinctive terms, and generating images of those data stacks – which your domain expert can then sit down to quickly label. Most importantly, Ixivault has built-in learning capabilities, meaning that it gets better at categorizing and labeling your specific data as you use it.
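To give a feel for what grouping by similarity and pulling distinctive terms can look like, here is a minimal, generic sketch. It is not Ixivault’s method: it assumes plain-text documents, uses simple TF-IDF weights with cosine similarity, and groups documents greedily. All names (`SAMPLE_DOCS`, `group_documents`, `distinctive_terms`) and the 0.2 threshold are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical sample documents, for illustration only.
SAMPLE_DOCS = [
    "pump station maintenance report pump inspection",
    "pump station inspection schedule",
    "invoice payment supplier invoice",
    "supplier payment overdue invoice",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (count * log(n / df))."""
    token_lists = [tokenize(doc) for doc in docs]
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    n = len(docs)
    return [
        {term: count * math.log(n / doc_freq[term])
         for term, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_documents(vectors, threshold=0.2):
    """Greedy grouping: a document joins the first group whose first
    member is similar enough, otherwise it starts a new group."""
    groups = []
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine(vector, vectors[group[0]]) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])
    return groups


def distinctive_terms(vectors, group, k=3):
    """Return the k terms with the highest summed weight within a group."""
    totals = Counter()
    for index in group:
        totals.update(vectors[index])
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


vectors = tfidf_vectors(SAMPLE_DOCS)
groups = group_documents(vectors)
for group in groups:
    print(group, distinctive_terms(vectors, group))
```

A domain expert would then only need to look at each group and its top terms to assign a label, rather than opening every document individually.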

All this makes Ixivault the perfect tool to help you move – by unpacking boxes of data as you move them to the cloud. You can then choose appropriate storage, governance, and access controls, whether or not you need to keep the data. For the first time, you can have a near edge-to-edge overview of all your data, with options to zoom in to very granular levels, so you can make the best choice about what to do next with this newly discovered data. New information about your data can both make you money and save you money.

If you need help with unboxing your dark data as you move, contact us for more information about how Synerscope can help. You may also purchase the Ixivault app directly from Microsoft’s Azure Marketplace.