200M Records of US Citizens Leaked in Unprotected Database
Researchers have not determined who owns the database, which was one of several large exposed instances disclosed this week.
Researchers discovered an unprotected database holding 800GB of personal user information, including 200 million detailed user records. The entirety of the database was wiped on March 3.
User records inside the database held what appeared to be profiles of US users, according to researchers with Lithuanian research group CyberNews. The exposed data contained individuals' full names and titles, email addresses, phone numbers, birthdates, credit ratings, home and mortgage real estate addresses, demographics, mortgage and tax records, and information about personal interests and investments, as well as political, charitable, and religious donations.
"We were shocked by the sheer scale of the data exposed: The combination of personal, demographic and real estate asset data was an absolute goldmine for cybercriminals," the CyberNews team says.
It seems much of the data in the database's "main folder" may have originated at the United States Census Bureau, researchers report. When they found the database on Shodan.io in late January, they reached out to the US Census Bureau as a potential owner and have not received a response. They watched the database for a couple of months, they say, but assume it was exposed for longer.
Finding the database was "not that difficult," the researchers say, but attackers would need some basic technical knowledge to understand what they were looking for. While someone could have accessed the database by mistake, the chances of this happening would be low.
In addition to the main folder of unsecured data, the database contained two more folders seemingly unrelated to the trove of personal records held in the main folder. These folders held the emergency call logs of a US-based fire department, as well as a list of some 74 bike stations that formerly belonged to a bike-share program. The bike-share stations are owned by Lyft.
While the two smaller folders did not hold personal data, the fire department call logs did have dates, times, locations, and other metadata dating back to 2010. "The presence of the folders that contained bike-sharing and fire department service call data was what confused us the most," they say. It's possible the data in these two folders was stolen or was used by several parties at the same time, the researchers hypothesize. They were unable to confirm this.
"The structure of the data led us to believe that the database belonged to a data marketing firm, or a credit or real estate company," the team says. For example, categories and sections were marked as codes in a fashion similar to the dictionaries used by data marketers. The codes themselves, they explain, were specific to the US Census Bureau or used in its classifications.
Information in this database, if accessed, could be "incredibly useful" to phishers, scammers, and other cybercriminals who could use the personal details within it to launch sophisticated phishing campaigns, spam attacks, and social engineering attempts.
This wasn't the only large misconfigured database found exposing sensitive information this week. Just yesterday, a misconfigured Elasticsearch database exposed more than 5 billion records related to data breaches between 2012 and 2019. A UK research firm collected detailed information on the breaches, including domain, source, contact email address, and password.
Earlier this week, vpnMentor researchers discovered an unprotected AWS S3 bucket holding 425GB of data, representing some 500,000 files related to MCA Wizard, a mobile app that acts as a tool for a Merchant Cash Advance. Data in the documents included credit reports, bank statements, contracts, legal documents, purchase orders, tax returns, and Social Security data.