* . *
  • About
  • Advertise
  • Privacy & Policy
  • Contact
Monday, July 14, 2025
Earth-News
  • Home
  • Business
  • Entertainment
    Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

    Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

    Review: At the Huntington, the New Hollywood String Quartet recalls legendary studio musicians – Los Angeles Times

    Review: At the Huntington, the New Hollywood String Quartet recalls legendary studio musicians – Los Angeles Times

    Kehoe repeals paid sick leave, allows several counties in the Ozarks to have entertainment districts in bill signings – KY3

    Kehoe repeals paid sick leave, allows several counties in the Ozarks to have entertainment districts in bill signings – KY3

    Emily Deschanel was scolded during “Bones” season 1 for being ‘late and unprepared’: ‘I was just beside myself’ – Yahoo

    Emily Deschanel was scolded during “Bones” season 1 for being ‘late and unprepared’: ‘I was just beside myself’ – Yahoo

    How you can see new movies early – Yahoo

    Unlock the Secret to Watching New Movies Before Everyone Else!

    Immersive sports and entertainment venue Cosm set to build its 5th location in Cleveland – WKYC

    Cosm Reveals Exciting Vision for Its 5th Immersive Sports and Entertainment Venue in Cleveland

  • General
  • Health
  • News

    Cracking the Code: Why China’s Economic Challenges Aren’t Shaking Markets, Unlike America’s” – Bloomberg

    Trump’s Narrow Window to Spread the Truth About Harris

    Trump’s Narrow Window to Spread the Truth About Harris

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Science
  • Sports
  • Technology
    Sentrycs’ Cyber Over RF technology integrated into Rafael’s combat-proven Drone Dome system – Defence Industry Europe

    Sentrycs’ Cyber Over RF Technology Boosts Rafael’s Battle-Tested Drone Dome System

    Nordic Air Defence raises $3 million to expand operations and advance drone defence technology – Defence Industry Europe

    Nordic Air Defence Lands $3 Million to Transform Drone Defense and Supercharge Operations

    China’s energy dominance in three charts – MIT Technology Review

    How China Is Powering Its Energy Dominance: A Visual Breakdown

    Meta Acquires AI Startup PlayAI to Enhance Voice Technology Capa – GuruFocus

    Meta Acquires AI Startup PlayAI to Revolutionize Voice Technology Capabilities

    Stallion Uranium Provides Update on Technology Data Acquisition Agreement – GlobeNewswire

    Stallion Uranium Announces Exciting Progress in Technology Data Acquisition Agreement

    2025 WE Local Prague Recap: Inspiring Women in Engineering and Technology – Society of Women Engineers

    2025 WE Local Prague Recap: Inspiring Women in Engineering and Technology – Society of Women Engineers

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
No Result
View All Result
  • Home
  • Business
  • Entertainment
    Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

    Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

    Review: At the Huntington, the New Hollywood String Quartet recalls legendary studio musicians – Los Angeles Times

    Review: At the Huntington, the New Hollywood String Quartet recalls legendary studio musicians – Los Angeles Times

    Kehoe repeals paid sick leave, allows several counties in the Ozarks to have entertainment districts in bill signings – KY3

    Kehoe repeals paid sick leave, allows several counties in the Ozarks to have entertainment districts in bill signings – KY3

    Emily Deschanel was scolded during “Bones” season 1 for being ‘late and unprepared’: ‘I was just beside myself’ – Yahoo

    Emily Deschanel was scolded during “Bones” season 1 for being ‘late and unprepared’: ‘I was just beside myself’ – Yahoo

    How you can see new movies early – Yahoo

    Unlock the Secret to Watching New Movies Before Everyone Else!

    Immersive sports and entertainment venue Cosm set to build its 5th location in Cleveland – WKYC

    Cosm Reveals Exciting Vision for Its 5th Immersive Sports and Entertainment Venue in Cleveland

  • General
  • Health
  • News

    Cracking the Code: Why China’s Economic Challenges Aren’t Shaking Markets, Unlike America’s” – Bloomberg

    Trump’s Narrow Window to Spread the Truth About Harris

    Trump’s Narrow Window to Spread the Truth About Harris

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    Israel-Gaza war live updates: Hamas leader Ismail Haniyeh assassinated in Iran, group says

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    PAP Boss to Niger Delta Youths, Stay Away from the Protest

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Court Restricts Protests In Lagos To Freedom, Peace Park

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Fans React to Jazz Jennings’ Inspiring Weight Loss Journey

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Science
  • Sports
  • Technology
    Sentrycs’ Cyber Over RF technology integrated into Rafael’s combat-proven Drone Dome system – Defence Industry Europe

    Sentrycs’ Cyber Over RF Technology Boosts Rafael’s Battle-Tested Drone Dome System

    Nordic Air Defence raises $3 million to expand operations and advance drone defence technology – Defence Industry Europe

    Nordic Air Defence Lands $3 Million to Transform Drone Defense and Supercharge Operations

    China’s energy dominance in three charts – MIT Technology Review

    How China Is Powering Its Energy Dominance: A Visual Breakdown

    Meta Acquires AI Startup PlayAI to Enhance Voice Technology Capa – GuruFocus

    Meta Acquires AI Startup PlayAI to Revolutionize Voice Technology Capabilities

    Stallion Uranium Provides Update on Technology Data Acquisition Agreement – GlobeNewswire

    Stallion Uranium Announces Exciting Progress in Technology Data Acquisition Agreement

    2025 WE Local Prague Recap: Inspiring Women in Engineering and Technology – Society of Women Engineers

    2025 WE Local Prague Recap: Inspiring Women in Engineering and Technology – Society of Women Engineers

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
No Result
View All Result
Earth-News
No Result
View All Result
Home Technology

Training /= chatting: ChatGPT and other LLMs don’t remember everything you say

May 29, 2024
in Technology
Share on FacebookShare on Twitter

29th May 2024

I’m beginning to suspect that one of the most common misconceptions about LLMs such as ChatGPT involves how “training” works.

A common complaint I see about these tools is that people don’t want to even try them out because they don’t want to contribute to their training data.

This is by no means an irrational position to take, but it does often correspond to an incorrect mental model about how these tools work.

Short version: ChatGPT and other similar tools do not directly learn from and memorize everything that you say to them.

This can be quite unintuitive: these tools imitate a human conversational partner, and humans constantly update their knowledge based on what you say to to them. Computers have much better memory than humans, so surely ChatGPT would remember every detail of everything you ever say to it. Isn’t that what “training” means?

That’s not how these tools work.

LLMs are stateless functions

From a computer science point of view, it’s best to think of LLMs as stateless function calls. Given this input text, what should come next?

In the case of a “conversation” with a chatbot such as ChatGPT or Claude or Google Gemini, that function input consists of the current conversation (everything said by both the human and the bot) up to that point, plus the user’s new prompt.

Every time you start a new chat conversation, you clear the slate. Each conversation is an entirely new sequence, carried out entirely independently of previous conversations from both yourself and other users.

Understanding this is key to working effectively with these models. Every time you hit “new chat” you are effectively wiping the short-term memory of the model, starting again from scratch.

This has a number of important consequences:

There is no point at all in “telling” a model something in order to improve its knowledge for future conversations. I’ve heard from people who have invested weeks of effort pasting new information into ChatGPT sessions to try and “train” a better bot. That’s a waste of time!
Understanding this helps explain why the “context length” of a model is so important. Different LLMs have different context lengths, expressed in terms of “tokens”—a token is about 3/4s of a word. This is the number that tells you how much of a conversation the bot can consider at any one time. If your conversation goes past this point the model will “forget” details that occurred at the beginning of the conversation.
Sometimes it’s a good idea to start a fresh conversation in order to deliberately reset the model. If a model starts making obvious mistakes, or refuses to respond to a valid question for some weird reason that reset might get it back on the right track.
Tricks like Retrieval Augmented Generation and ChatGPT’s “memory” make sense only once you understand this fundamental limitation to how these models work.
If you’re excited about local models because you can be certain there’s no way they can train on your data, you’re mostly right: you can run them offline and audit your network traffic to be absolutely sure your data isn’t being uploaded to a server somewhere. But…
… if you’re excited about local models because you want something on your computer that you can chat to and it will learn from you and then better respond to your future prompts, that’s probably not going to work.

So what is “training” then?

When we talk about model training, we are talking about the process that was used to build these models in the first place.

As a big simplification, there are two phases to this. The first is to pile in several TBs of text—think all of Wikipedia, a scrape of a large portion of the web, books, newspapers, academic papers and more—and spend months of time and potentially millions of dollars in electricity crunching through that “pre-training” data identifying patterns in how the words relate to each other.

This gives you a model that can complete sentences, but not necessarily in a way that will delight and impress a human conversational partner. The second phase aims to fix that—this can incorporate instruction tuning or Reinforcement Learning from Human Feedback (RLHF) which has the goal of teaching the model to pick the best possible sequences of words to have productive conversations.

The end result of these phases is the model itself—an enormous (many GB) blob of floating point numbers that capture both the statistical relationships between the words and some version of “taste” in terms of how best to assemble new words to reply to a user’s prompts.

Once trained, the model remains static and unchanged—sometimes for months or even years.

Here’s a note from Jason D. Clinton, an engineer who works on Claude 3 at Anthropic:

The model is stored in a static file and loaded, continuously, across 10s of thousands of identical servers each of which serve each instance of the Claude model. The model file never changes and is immutable once loaded; every shard is loading the same model file running exactly the same software.

These models don’t change very often!

Reasons to worry anyway

A frustrating thing about this issue is that it isn’t actually possible to confidently state “don’t worry, ChatGPT doesn’t train on your input”.

Many LLM providers have terms and conditions that allow them to improve their models based on the way you are using them. Even when they have opt-out mechanisms these are often opted-in by default.

When OpenAI say “We may use Content to provide, maintain, develop, and improve our Services” it’s not at all clear what they mean by that!

Are they storing up everything anyone says to their models and dumping that into the training run for their next model versions every few months?

I don’t think it’s that simple: LLM providers don’t want random low-quality text or privacy-invading details making it into their training data. But they are notoriously secretive, so who knows for sure?

The opt-out mechanisms are also pretty confusing. OpenAI try to make it as clear as possible that they won’t train on any content submitted through their API (so you had better understand what an “API” is), but lots of people don’t believe them! I wrote about the AI trust crisis last year: the pattern where many people actively disbelieve model vendors and application developers (such as Dropbox and Slack) that claim they don’t train models on private data.

People also worry that those terms might change in the future. There are options to protect against that: if you’re spending enough money you can sign contracts with OpenAI and other vendors that freeze the terms and conditions.

If your mental model is that LLMs remember and train on all input, it’s much easier to assume that developers who claim they’ve disabled that ability may not be telling the truth. If you tell your human friend to disregard a juicy piece of gossip you’ve mistakenly passed on to them you know full well that they’re not going to forget it!

The other major concern is the same as with any cloud service: it’s reasonable to assume that your prompts are still logged for a period of time, for compliance and abuse reasons, and if that data is logged there’s always a chance of exposure thanks to an accidental security breach.

What about “memory” features?

To make things even more confusing, some LLM tools are introducing features that attempt to work around this limitation.

ChatGPT recently added a memory feature where it can “remember” small details and use them in follow-up conversations.

As with so many LLM features this is a relatively simple prompting trick: during a conversation the bot can call a mechanism to record a short note—your name, is a preference you have expressed—which will then be invisibly included in the chat context passed in future conversations.

You can review (and modify) the list of remembered fragments at any time, and ChatGPT shows a visible UI element any time it adds to its memory.

Bad policy based on bad mental models

One of the most worrying results of this common misconception concerns people who make policy decisions for how LLM tools should be used.

Does your company ban all use of LLMs because they don’t want their private data leaked to the model providers?

They’re not 100% wrong—see reasons to worry anyway—but if they are acting based on the idea that everything said to a model is instantly memorized and could be used in responses to other users they’re acting on faulty information.

Even more concerning is what happens with lawmakers. How many politicians around the world are debating and voting on legislation involving these models based on a science fiction idea of what they are and how they work?

If people believe ChatGPT is a machine that instantly memorizes and learns from everything anyone says to it there is a very real risk they will support measures that address invented as opposed to genuine risks involving this technology.

>>> Read full article>>>
Copyright for syndicated content belongs to the linked Source : Hacker News – https://simonwillison.net/2024/May/29/training-not-chatting/

Tags: chattingtechnologytraining
Previous Post

Three.js Shading Language

Next Post

ProjectPro (YC IK12) Is Hiring

Spatio-Temporal Geographic Networks for Value Co-Creation and Technology Transfer in China with Patent Data – Nature

Unlocking Innovation: How Spatio-Temporal Geographic Networks Drive Value Co-Creation and Technology Transfer in China Using Patent Data

July 14, 2025
2025 MLB Draft tracker, results: Live updates, complete list of every pick, first-round analysis – CBS Sports

2025 MLB Draft tracker, results: Live updates, complete list of every pick, first-round analysis – CBS Sports

July 14, 2025
Canids as pollinators? Nectar foraging by Ethiopian wolves may contribute to the pollination of Kniphofia foliosa – ESA Journals

Could Ethiopian Wolves Be Unexpected Pollinators of Kniphofia foliosa?

July 14, 2025
Guest Opinion: Science is stronger with robust federal funding – Palo Alto Online

Why Strong Federal Funding is Essential for Advancing Science

July 14, 2025
Weight loss may ‘rejuvenate’ fat tissues, clearing away aged cells – Live Science

Weight Loss Could ‘Rejuvenate’ Fat Tissue by Clearing Out Old Cells

July 14, 2025
If your goal is to glow up, say goodbye to these 10 daily decisions – VegOut

10 Daily Habits to Ditch Now for a Stunning Glow-Up

July 14, 2025
‘We’ve never seen a team do this to PSG’ – how Chelsea won Club World Cup – BBC

Unbelievable Comeback: How Chelsea Shocked PSG to Clinch the Club World Cup!

July 14, 2025
India will become $10 trillion economy over next decade, GCCs to contribute $0.5 trillion – The Economic Times

India Poised to Become a $10 Trillion Economy Within a Decade, Powered by GCCs Driving $0.5 Trillion Growth

July 14, 2025
Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

Entertainment Business Master’s Grad Launched Nonprofit to Nurture Emerging Artists – Full Sail University

July 14, 2025
11 lessons for health tech startups from one of UpToDate’s creators – STAT

11 Essential Lessons for Health Tech Startups from a Leading Industry Innovator

July 14, 2025

Categories

Archives

July 2025
MTWTFSS
 123456
78910111213
14151617181920
21222324252627
28293031 
« Jun    
Earth-News.info

The Earth News is an independent English-language daily published Website from all around the World News

Browse by Category

  • Business (20,132)
  • Ecology (721)
  • Economy (743)
  • Entertainment (21,631)
  • General (15,893)
  • Health (9,781)
  • Lifestyle (751)
  • News (22,149)
  • People (745)
  • Politics (754)
  • Science (15,962)
  • Sports (21,242)
  • Technology (15,728)
  • World (727)

Recent News

Spatio-Temporal Geographic Networks for Value Co-Creation and Technology Transfer in China with Patent Data – Nature

Unlocking Innovation: How Spatio-Temporal Geographic Networks Drive Value Co-Creation and Technology Transfer in China Using Patent Data

July 14, 2025
2025 MLB Draft tracker, results: Live updates, complete list of every pick, first-round analysis – CBS Sports

2025 MLB Draft tracker, results: Live updates, complete list of every pick, first-round analysis – CBS Sports

July 14, 2025
  • About
  • Advertise
  • Privacy & Policy
  • Contact

© 2023 earth-news.info

No Result
View All Result

© 2023 earth-news.info

No Result
View All Result

© 2023 earth-news.info

Go to mobile version