5 Steps for Minimizing Dark Data Risk

Dark data may be your most elusive asset, but it can also be your most costly if you don't protect it.

Cameron Over, Partner, National Cyber & Privacy Lead, CrossCountry Consulting

June 22, 2023

What is something that comprises more than half of companies' data repositories, but most aren't even aware they have? It's dark data, information companies unknowingly gather that is not integral to day-to-day business interactions and therefore often sits in the background. While that data is seemingly unnecessary to most companies, it's invaluable to cybercriminals.

What Is Dark Data?

At a time when many companies are focused on collecting, analyzing, and acting on data they receive from customers, it's not surprising that latent (or dark) data is accumulating far beyond what they planned to store, protect, and potentially purge. For example, when you consider that Netflix spent nearly $10 million a month in 2019 to store its data in the cloud, you can see how much dark data storage might be costing a company.

Gartner equates dark data to dark matter in physics. Dark data extends beyond any published sensitive data elements. It can include personal information from customers or past employees, but might also include nontraditional data such as systems backups, log files, configuration files, sensitive internal procedures, email backups or "spools," scanned document repositories, and human resources information. These are all dark data sources that attackers may want to sell or use.

And while some regulations aim to protect information that might be considered dark data, such as the US Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR) in Europe, many companies continue to store this data long after they are required to do so.

How to Protect Dark Data

Every company needs to prioritize its customer and employee data, so how can you protect something you don't even know you have? Further, how do you prioritize this among the other cyber vulnerabilities in your organization? Here are five steps you can take:

  1. Increase visibility of data: Start by building a data inventory to map the information you know about. Next, perform threat modeling to identify security needs, locate threats and vulnerabilities, assess severity, and prioritize solutions. Together, these steps show what data you have and how it may be at risk, and they let you quantify threats so that you can better prioritize remediation.

  2. Think like the adversary: Leverage offensive testing (such as using ethical hackers and professional security testers) to try to breach defenses like an attacker would. This will help you find and address vulnerabilities.

  3. Counter the adversary: Once you have a complete view of your data footprint and threat model, apply or reinforce security controls in target areas (for example, endpoint detection and response, logging and monitoring, content interception and inspection for Web traffic, and patching). Consider steps 1–3 as a continuous improvement cycle of data discovery.

  4. Shrink the battlespace: Delete sensitive personal data that is no longer necessary. Minimize data collected and design code-level controls to support data retention periods. This limits the proliferation of sensitive data throughout your environment.

  5. Avoid technology infatuation: Data loss prevention (DLP) tools help avoid accidents, but they should not be considered a catch-all for data security. Most DLP technologies are weak and can lull organizations into a false sense of security. Like all things cyber and privacy, data protection is about getting people, process, and technology working in balance and harmony. Reinforce carefully chosen tools with crisp processes (detailed and well-documented playbooks and blueprints), workflows, and runbooks, and make sure they're managed and led by thoughtful people with real expertise.
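As a rough illustration of steps 1 and 4, the sketch below walks a directory tree, builds a small inventory of files that look like the dark data types mentioned earlier (log files, backups, configuration files, email spools), and flags entries that have outlived a retention period. It is a minimal sketch under stated assumptions, not a complete discovery tool: the suffix-to-category mapping, the retention periods, and the names `build_inventory` and `purge_candidates` are all hypothetical, and real retention periods would come from your legal and policy requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Hypothetical mapping of file suffixes to dark data categories.
DARK_DATA_SUFFIXES = {
    ".log": "log file",
    ".bak": "system backup",
    ".conf": "configuration file",
    ".cfg": "configuration file",
    ".pst": "email backup",
}

# Hypothetical retention periods; substitute values from your own policy.
RETENTION = {
    "log file": timedelta(days=90),
    "system backup": timedelta(days=365),
    "configuration file": timedelta(days=730),
    "email backup": timedelta(days=365),
}


@dataclass
class InventoryEntry:
    path: Path
    category: str
    last_modified: datetime
    past_retention: bool


def build_inventory(root, now=None):
    """Return an inventory of files under root that match a known category."""
    now = now or datetime.now(timezone.utc)
    entries = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        category = DARK_DATA_SUFFIXES.get(path.suffix.lower())
        if category is None:
            continue  # not a recognized dark data type in this sketch
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        expired = now - modified > RETENTION[category]
        entries.append(InventoryEntry(path, category, modified, expired))
    return entries


def purge_candidates(entries):
    """Return entries older than their retention period, for human review."""
    return [entry for entry in entries if entry.past_retention]
```

The sketch only reports candidates and deletes nothing, which leaves the purge decision with people and process rather than with the tool. Matching on file suffix is a deliberately crude stand-in for real data classification.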

Latent Data May Be Risky Data

There are well-known cases where large organizations found themselves the victims of dark data breaches. Data that was latent and secondary to their business models was suddenly extremely costly in terms of brand trust and legal fees.

Just because you don't see it or use it doesn't mean your data is not dangerous. Dark data should be a consideration for every organization. It should be accounted for, protected, and regularly purged (as applicable) to keep cybercriminals at bay. Dark data may be your most elusive asset, but it can also be your most costly if you don't protect it.

About the Author

Cameron Over

Partner, National Cyber & Privacy Lead, CrossCountry Consulting

Cameron Over leads CrossCountry’s National Cyber & Privacy Practice and its newly launched offensive team, Icebreaker, where she is responsible for the overall strategy, practice development, and client delivery. Cameron has over 25 years of experience in the fields of cybersecurity and privacy serving Department of Defense, private, and public organizations. Cameron leads a team of talented cybersecurity and privacy experts across strategy, risk management, cloud and application security, privacy and data protection, and offensive security.

Prior to joining CrossCountry, Cameron was a Senior Associate at Booz Allen Hamilton, where she led a large portfolio of cybersecurity programs for the Department of Defense, including critical infrastructure systems such as the defense industrial base, satellite communications infrastructure, and classified networks.

Cameron received her BS in Computer Science from the University of Mary Washington and is a Certified Information Systems Security Professional (CISSP).
