Large language models (LLMs) like ChatGPT are susceptible to tricky prompts and can potentially leak the data they were trained on.
A collaborative effort by researchers from Google DeepMind, UC Berkeley, the University of Washington, and others revealed that this method called “divergence attack”, can compromise user privacy.
The researchers believe that fixing specific vulnerabilities won’t be enough, adding that addressing the underlying loopholes is imperative for flawless security.
In its study, the researchers explored a phenomenon called “memorization”, where it was found that LLMs are capable of recalling and reproducing certain fragments of the data used to train them.
The researchers were working on “extractable memorization”, where they were exploring the potential of using specific queries to extract data.
The team experimented with different language models, including ChatGPT, LLaMA, and GPT-Neo, while generating billions of tokens. Then, they checked them for potential matches with the respective datasets used to train the systems.
Surprisingly, ChatGPT showed its memorization capabilities, which means that the model can remember user inputs and the information used to train them. With tricky prompts from other users, the generative AI can later reveal these details.
The Researchers Tailored “Divergence Attack” For ChatGPT
A unique technique, known as “divergence attack”, was tailored for ChatGPT by the researchers. In this case, they requested the model to repeat the word “poem’ infinitely. In the process, they observed that the model unexpectedly revealed their training data.
Likewise, the researchers requested ChatGPT to repeat the word “company” repeatedly, which prompted the AI to reveal the phone number and email address of a law firm in the US.
This data included detailed investment research reports on specific Python codes for machine learning tasks. The most alarming part of this finding was that the system memorized and revealed personal information of the trainers like phone numbers and email addresses.
Using only $200 worth of queries to ChatGPT (GPT-3.5- Turbo), we are able to extract over 10,000 unique verbatim memorized training examples. Our extrapolation to larger budgets suggests that dedicated adversaries could extract far more data.Researchers
The study explains that a comprehensive approach is needed to test AI models beyond the aspects users usually face to scrutinize the foundational base models like API interactions.
What Does The Vulnerability Mean For ChatGPT Users?
Within the first couple of months after its launch, ChatGPT gained a mammoth user base of more than 100 million. Although OpenAI expressed its commitment to secure user privacy, the recent study brings the risks to the forefront.
ChatGPT is susceptible to leaking user information on receiving specific prompts, and this puts its users’ information at risk.
Companies have already responded to concerns over data breaches, with Apple restricting its employees from using LLMs.
In a measure to boost data security, OpenAI added a feature that allows users to turn off their chat history. However, the system retains sensitive information for 30 days before it deletes it permanently.
Google researchers have issued a caution to users to refrain from using LLMs for applications where they need to reveal sensitive information without adequate security measures in place. While ChatGPT was initially introduced as a beneficial and safe AI, the latest report brings worrying concerns to the forefront.
>>> Read full article>>>
Copyright for syndicated content belongs to the linked Source : TechReport – https://techreport.com/news/chatgpt-vulnerability-exposes-user-information-can-potentially-leak-training-data/