
Making Sure Lost Data Stays Lost

Retired hardware and forgotten cloud virtual machines are a trove of insecure confidential data. Here's how to ameliorate that weakness.

Stephen Lawton, Contributing Writer

May 12, 2023


The stories are both infamous and legendary. Surplus computing equipment purchased at auction contains thousands of files with private information, including employee health records, banking information, and other data covered by a multitude of state and local privacy and data laws. Long-forgotten virtual machines (VMs) with confidential data are compromised — and no one knows. Enterprise-class routers with topology data about corporate networks are sold on eBay. With so much confidential data made available to the public on a daily basis, what else are companies exposing to potential attackers?

The fact is a lot of data gets exposed regularly. Last month, for example, cybersecurity vendor ESET reported that 56% of decommissioned routers sold on the secondary market contained sensitive corporate material. This included such configuration data as router-to-router authentication keys, IPsec and VPN credentials and/or hashed passwords, credentials for connections to third-party networks, and connection details for some specific applications.

Cloud-based vulnerabilities that result in data leaks are usually the result of misconfigurations, says Greg Hatcher, a former instructor at the National Security Agency and now CEO and co-founder of White Knight Labs, a cybersecurity consultancy that specializes in offensive cyber operations. Sometimes the data is put at risk deliberately but naively, he notes, such as proprietary code finding its way into ChatGPT in the recent Samsung breach.

Confidential data, such as credentials and corporate secrets, are often stored in GitHub and other software repositories, Hatcher says. To find accounts that lack multifactor authentication (MFA), or to get around it once they hold valid credentials, attackers can use tools such as MFASweep, a PowerShell script that attempts to log in to various Microsoft services with a provided set of credentials and identifies whether MFA is enabled, and Evilginx, a man-in-the-middle attack framework used for phishing login credentials along with session cookies. These tools can find access vulnerabilities in a variety of systems and applications, bypassing existing security configurations.
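
Defenders can look for the same exposure before an attacker does. The following is a minimal sketch in Python of scanning text (for example, a file checked into a repository) for strings that look like credentials. The rule names and the three patterns are assumptions chosen for illustration; dedicated secret scanners use far larger rule sets and also inspect commit history.

```python
import re

# Hypothetical, deliberately small rule set for illustration only.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "password_assignment": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*['\"]?[^\s'\"]{4,}"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for lines that look like secrets."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Made-up sample content; none of these values are real credentials.
sample = "\n".join([
    "db_host = 'db.internal.example'",
    "password = 'hunter2hunter2'",
    "aws_key = 'AKIAABCDEFGHIJKLMNOP'",
])

print(find_secrets(sample))
```

A match is only a lead to review. Pattern-based scans produce false positives, and they miss secrets that do not follow a recognizable format.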

Having both a hardware and software asset inventory is essential, Hatcher says. The hardware inventory should include all devices because the security team needs to know exactly what hardware is on the network for maintenance and compliance reasons. Security teams can use a software asset inventory to protect their cloud environments, since they cannot access most cloud-based hardware. (The exception is a private cloud with company-owned hardware in the service provider's data center, which would fall under the hardware asset inventory as well.)

Even when applications are deleted from retired hard disks, the unattend.xml file in the Windows operating system on the disk still holds confidential data that can lead to breaches, Hatcher says.

"If I get my hands on that and that local admin password is reused throughout the enterprise environment, I now can get an initial foothold," he explains. "I already can move laterally throughout the environment."

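Before a disk leaves the building, a check for leftover answer files can be sketched along the following lines. This is an illustrative Python example, not Hatcher's method. It assumes the retired disk is mounted read-only at some directory, that answer files are named unattend.xml or autounattend.xml, and that stored passwords appear as elements whose names end in "Password" with a Value child. The sample XML and the Windows/Panther location are assumptions made for the demonstration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumed file names for Windows answer files.
ANSWER_FILE_NAMES = {"unattend.xml", "autounattend.xml"}


def local_name(tag):
    """Strip an XML namespace prefix such as '{urn:...}Value' -> 'Value'."""
    return tag.rsplit("}", 1)[-1]


def password_elements(xml_path):
    """Return names of *Password elements that carry a non-empty Value child."""
    hits = []
    for elem in ET.parse(xml_path).iter():
        if not local_name(elem.tag).endswith("Password"):
            continue
        for child in elem:
            if local_name(child.tag) == "Value" and (child.text or "").strip():
                hits.append(local_name(elem.tag))
    return hits


def scan_tree(root_dir):
    """Walk a mounted disk and map each answer file path to its password hits."""
    findings = {}
    for dirpath, _dirs, files in os.walk(root_dir):
        for name in files:
            if name.lower() not in ANSWER_FILE_NAMES:
                continue
            path = os.path.join(dirpath, name)
            try:
                hits = password_elements(path)
            except ET.ParseError:
                continue  # unreadable XML; a real tool would log this
            if hits:
                findings[path] = hits
    return findings


# Demonstration against a throwaway directory standing in for a mounted disk.
SAMPLE = """<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup">
      <UserAccounts>
        <AdministratorPassword>
          <Value>UABsAGEAYwBlAGgAbwBsAGQAZQByAA==</Value>
          <PlainText>false</PlainText>
        </AdministratorPassword>
      </UserAccounts>
    </component>
  </settings>
</unattend>"""

with tempfile.TemporaryDirectory() as mount_point:
    panther = os.path.join(mount_point, "Windows", "Panther")
    os.makedirs(panther)
    with open(os.path.join(panther, "unattend.xml"), "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)
    results = scan_tree(mount_point)

print(list(results.values()))
```

The sketch only reports that a password value is present. It does not decode or judge the strength of the stored value, and any hit should be treated as a reason to sanitize the disk and rotate the credential.
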
Sensitive Data Might Not Stay Hidden

Short of physically destroying disks, the next best option is overwriting the entire disk — but that option can sometimes be overcome as well.
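
As a rough illustration of what a full overwrite involves, the Python sketch below replaces every byte of a file with random data and then checks that a known marker is gone. It uses an ordinary temporary file as a stand-in for a disk image; the function name overwrite_in_place and the chunk size are assumptions made for the example. A pass like this reaches only the blocks the operating system can address, so it says nothing about areas that the storage device remaps or reserves internally.

```python
import os
import tempfile


def overwrite_in_place(path, chunk_size=1024 * 1024):
    """Overwrite every byte of `path` with random data; return bytes written."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            fh.write(os.urandom(n))
            remaining -= n
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to push the data to the device
    return size


# Stand-in for a disk image: a temp file filled with a recognizable marker.
fd, image_path = tempfile.mkstemp()
os.close(fd)
marker = b"local-admin-password"
with open(image_path, "wb") as fh:
    fh.write(marker * 1000)

written = overwrite_in_place(image_path)

with open(image_path, "rb") as fh:
    leftover = fh.read()
os.remove(image_path)

print("bytes overwritten:", written)
print("marker still present:", marker in leftover)
```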

Oren Koren, co-founder and chief product officer of Tel Aviv-based Veriti.ai, says service accounts are an often-ignored source of data that attackers can exploit, both on production servers and when databases on retired servers are left exposed. A compromised mail transfer agent, for example, can act as a man-in-the-middle, decrypting Simple Mail Transfer Protocol (SMTP) data as it is sent from production servers.

Similarly, other service accounts could be compromised if the attacker is able to determine the account's primary function and find which security components are turned off to meet that goal. An example would be turning off data analysis when super-low latency is required.

Just as service accounts can be compromised when left unattended, so can orphaned VMs. Hatcher says that in popular cloud environments, VMs are often not decommissioned.

"As a red teamer and a penetration tester, we love these things because if we get access to that, we can actually create persistence within the cloud environment by popping in [and] popping a beacon on one of those boxes that can talk back to our [command-and-control] server," he says. "Then we can kind of hold onto that access indefinitely."

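One way to surface candidates for decommissioning is to flag VMs that have no recorded owner or have been idle for a long time. The Python sketch below assumes a hypothetical inventory record with name, owner, and last_activity fields and an arbitrary 90-day idle threshold. In practice these values would come from a cloud provider's inventory or tagging data, which is not modeled here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class VmRecord:
    """Hypothetical inventory entry; field names are assumptions."""
    name: str
    owner: Optional[str]
    last_activity: date


def find_orphan_candidates(
    inventory: List[VmRecord], today: date, max_idle_days: int = 90
) -> List[str]:
    """Return names of VMs with no owner or no activity since the cutoff."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        vm.name
        for vm in inventory
        if vm.owner is None or vm.last_activity < cutoff
    ]


# Made-up inventory for illustration.
inventory = [
    VmRecord("build-agent-01", "ci-team", date(2023, 5, 1)),
    VmRecord("poc-db-2019", None, date(2023, 4, 28)),
    VmRecord("legacy-report", "finance", date(2022, 11, 3)),
]

print(find_orphan_candidates(inventory, today=date(2023, 5, 12)))
```

With this sample data the sketch flags poc-db-2019 (no owner) and legacy-report (idle past the cutoff). A flagged VM is a prompt for review, not proof that it is safe to delete.
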
One category that often gets short shrift is unstructured data. While rules are generally in place for structured data (online forms, network logs, Web server logs, or other quantitative data from relational databases), unstructured data can be problematic, says Mark Shainman, senior director of governance products at Securiti.ai. This is data from nonrelational databases, data lakes, email, call logs, Web logs, audio and video communications, streaming environments, and the many generic file formats used for spreadsheets, documents, and graphics.

"Once you understand where your sensitive data exists, you can put in place specific policies that protect that data," says Shainman.

Access Policies Can Remediate Vulnerabilities

Thinking through how data will be shared often surfaces potential vulnerabilities.

Says Shainman: "If I'm sharing data with a third party, do I put specific encryption or masking policies in place, so when that data is pushed downstream, they have the ability to leverage that data, but that sensitive data that exists within that environment is not exposed?"

Access intelligence is a group of policies that allow specific individuals to access data that exists within a platform. These policies control the ability to view and process data at the permission level of the document rather than, for example, cell by cell in a spreadsheet. The approach bolsters third-party risk management (TPRM) by allowing partners to access data approved for their consumption; data outside that permission, even if it is accessed, cannot be viewed or processed.
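
The combination of document-level permissions and masking can be sketched in a few lines of Python. Everything here is hypothetical: the POLICY and DOCUMENTS structures, the partner and document names, and the choice to replace masked fields with a fixed placeholder. It is meant only to show the shape of the check, not how any particular product implements it.

```python
# Hypothetical policy: which documents a partner may receive, and which
# fields must be masked before the data is pushed downstream.
POLICY = {
    "partner-a": {
        "documents": {"q1-forecast"},
        "masked_fields": {"ssn", "account_number"},
    },
}

# Made-up documents for illustration.
DOCUMENTS = {
    "q1-forecast": {
        "region": "EMEA",
        "ssn": "123-45-6789",
        "account_number": "000111222",
    },
    "payroll": {"employee": "J. Doe", "ssn": "987-65-4321"},
}


def share_with(partner, doc_id):
    """Return a masked copy of a document if the partner is permitted to see it."""
    rules = POLICY.get(partner)
    if rules is None or doc_id not in rules["documents"]:
        raise PermissionError(f"{partner} is not approved for {doc_id}")
    shared = dict(DOCUMENTS[doc_id])  # copy so the original stays untouched
    for field in rules["masked_fields"]:
        if field in shared:
            shared[field] = "****"
    return shared


print(share_with("partner-a", "q1-forecast"))

try:
    share_with("partner-a", "payroll")
except PermissionError as exc:
    print("denied:", exc)
```

The permission check runs before any data is copied, so a document outside the partner's approved set is never returned, masked or otherwise.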

Documents such as NIST's Special Publication 800-88, Guidelines for Media Sanitization, and the Enterprise Data Management (EDM) Council's security frameworks can help security pros define controls for identifying and remediating vulnerabilities related to decommissioning hardware and protecting data.

About the Author

Stephen Lawton

Contributing Writer

Stephen Lawton is a veteran journalist and cybersecurity subject matter expert who has been covering cybersecurity and business continuity for more than 30 years. He was named a Global Top 25 Data Expert for 2023 and a Global Top 20 Cybersecurity Expert for 2022. Stephen spent more than a decade with SC Magazine/SC Media/CyberRisk Alliance, where he served as editorial director of the content lab. Earlier he was chief editor for several national and regional award-winning publications, including MicroTimes and Digital News & Review. Stephen is the founder and senior consultant of the media and technology firm AFAB Consulting LLC. You can reach him at [email protected].
