PrivateGPT Tackles Sensitive Info in ChatGPT Prompts

To keep employees from entering private data into the AI, ChatGPT is blocked from ingesting more than 50 types of PII and other sensitive information.


Amid concerns that employees could be entering sensitive information into the ChatGPT artificial intelligence model, a data privacy vendor has launched a redaction tool aimed at reducing companies' risk of inadvertently exposing customer and employee data.

Private AI's new PrivateGPT platform integrates with OpenAI's high-profile chatbot, automatically redacting 50+ types of personally identifiable information (PII) in real time as users enter ChatGPT prompts.

PrivateGPT sits in the middle of the chat process, stripping out everything from health data and credit-card information to contact data, dates of birth, and Social Security numbers from user prompts, before sending them through to ChatGPT. When ChatGPT responds, PrivateGPT re-populates the PII within the answer, to make the experience more seamless for users, according to a statement this week from PrivateGPT creator Private AI.
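The redact-then-restore flow described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Private AI's implementation or API: the names `PATTERNS`, `redact`, `reidentify`, and `guarded_chat` are hypothetical, the three regular expressions are a crude stand-in for whatever detection the product actually uses, and `send_to_model` is an assumed callable standing in for the call to the chatbot.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical patterns for illustration only; a real tool would cover
# many more entity types and would not rely on simple regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> Tuple[str, Dict[str, str]]:
    """Replace each detected value with a placeholder and remember the mapping."""
    mapping: Dict[str, str] = {}

    for label, pattern in PATTERNS.items():

        def substitute(match: re.Match, label: str = label) -> str:
            value = match.group(0)
            # Reuse the placeholder if the same value was already seen.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(substitute, prompt)

    return prompt, mapping


def reidentify(response: str, mapping: Dict[str, str]) -> str:
    """Put the original values back wherever the placeholders appear."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response


def guarded_chat(prompt: str, send_to_model: Callable[[str], str]) -> str:
    """Redact the prompt, call the (assumed) model function, restore the reply."""
    redacted, mapping = redact(prompt)
    return reidentify(send_to_model(redacted), mapping)
```

In this sketch the third-party model only ever sees placeholders such as `[EMAIL_1]`; the mapping back to the real values stays on the caller's side, which is what lets the answer be restored before it is shown to the user.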

"Generative AI will only have a space within our organizations and societies if the right tools exist to make it safe to use," said Patricia Thaine, co-founder and CEO of Private AI, in a statement. "By sharing personal information with third-party organizations, [companies] lose control over how that data is stored and used, putting themselves at serious risk of compliance violations."

Privacy Risks & ChatGPT

Every time a user enters data into a prompt for ChatGPT, the information is ingested into the service's LLM data set, used to train the next generation of the algorithm. The concern is that the information could be retrieved at a later date if proper data security isn't in place for the service.

"The aspect of AI consuming all input as source material for others' queries presents a black box of uncertainty as to exactly how and where a company's data would end up and completely upends the tight data security at the heart of most all companies today," warns Roy Akerman, co-founder and CEO at Rezonate.

This risk of data exposure is not theoretical: In March, OpenAI acknowledged a bug that exposed users' chat histories, after screenshots of private chats started showing up on Reddit.

OpenAI has warned users to be selective when using ChatGPT: "We are not able to delete specific prompts from your history. Please don't share any sensitive information in your conversations," OpenAI's user guide notes.

Yet employees are still learning how to handle privacy when it comes to ChatGPT, even as the service sees a dizzying rate of adoption (it reached the milestone of 100 million users in record time, just two months after launch).

In a recent report, data security service Cyberhaven said it detected and blocked requests to input sensitive data, including confidential information, client data, source code, and regulated information, into ChatGPT from 4.2% of the 1.6 million workers at its client companies.

As a concrete example of the phenomenon, earlier in the month it came to light that Samsung engineers had made three significant leaks to ChatGPT: buggy source code from a semiconductor database, code for identifying defects in certain Samsung equipment, and the minutes of an internal meeting.

"The wide adoption of AI language models is becoming widely accepted as a means of accelerating delivery of code creation and analysis," says Akerman. "Yet data leakage is most often a by-product of that speed, efficiency, and quality. Developers worldwide are anxious to use these technologies, yet guidance from engineering management has yet to be put in place on the do's and don'ts of AI usage to ensure data privacy is respected and maintained."

About the Author

Tara Seals, Managing Editor, News, Dark Reading

Tara Seals has 20+ years of experience as a journalist, analyst and editor in the cybersecurity, communications and technology space. Prior to Dark Reading, Tara was Editor in Chief at Threatpost, and prior to that, the North American news lead for Infosecurity Magazine. She also spent 13 years working for Informa (formerly Virgo Publishing), as executive editor and editor-in-chief at publications focused on both the service provider and the enterprise arenas. A Texas native, she holds a B.A. from Columbia University, lives in Western Massachusetts with her family and is on a never-ending quest for good Mexican food in the Northeast.
