Why Your AI Keeps Forgetting Things

AI memory looks like database lookup, but it isn't. Five failure modes - decay, contradiction, confidence, compression, and expiry - explain why AI forgets what you told it.

Updated 11 June 2026

By Rob · 11 June 2026 · 11 min read

If you have spent any time using ChatGPT, Claude, or Gemini over more than one session, you have probably noticed a strange thing. You tell the assistant something useful - your dog's name, that you live in Manchester, that you are allergic to nuts - and it remembers it brilliantly for an hour, sometimes for weeks, and then one Tuesday morning it cheerfully forgets. Worse, it sometimes remembers the wrong thing. You moved house six months ago and it is still confidently recommending pubs in your old postcode.

The instinct is to treat this as a search problem. Store everything, retrieve when needed. But the people building these systems are increasingly realising that this approach is the source of the problem, not the solution. Memory is not a filing cabinet. It is a kitchen drawer that needs the occasional clear-out.

What is AI memory, really?

When most people picture AI memory, they imagine something like a contacts app. The assistant writes down the fact, looks it up later, and reads it back. Simple, deterministic, like a phone number.

The reality is messier. AI memory is layered. There is the conversation right now (whatever is being typed back and forth), there is what the model was trained on months or years ago, and there is whatever store the application has bolted on to track things between sessions - which might be a database, a vector store, a plain text file, or all three.

The bolted-on layer is where the trouble lives. It is the part that has to decide what to keep, what to throw out, how to recall things, and what to do when two recalled things flatly disagree. None of that is what a database is designed for.

Why doesn't simple search work?

Imagine you keep absolutely everything anyone ever tells you in one drawer. Receipts, birthday cards, takeaway menus, the warranty for a fridge from 2008. Whenever you need to recall something, you tip the drawer onto the table and rummage.

This is a fair picture of "store everything, retrieve later" memory. The problem is not the storing - that bit is easy. The problem is what you find when you rummage. The fridge warranty is irrelevant. Half the takeaway menus belong to restaurants that closed years ago. The birthday card from your aunt is sweet but is not helpful for working out what to make for dinner.

AI systems that retrieve from this kind of drawer do exactly what you would do - they grab the most superficially relevant scraps and combine them. Sometimes the result is right. Often it isn't. Five failure modes recur often enough to deserve names: decay, contradiction, confidence, compression, and expiry.

How does decay actually help?

Human memory is not flat. The phone number you used yesterday is sharper than the phone number you used in 2014. This is decay - facts losing weight over time unless something refreshes them. It feels like a bug, but it is the feature that stops your head filling up with rubbish.

Good AI memory mimics this. A fact gets a freshness score that quietly drops over the weeks. When the system goes to recall something, fresh facts are louder than stale ones. If two facts disagree and the older one has decayed substantially, the newer one wins by default.
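One way to picture the freshness score is as a half-life. The sketch below is an illustration only, not how any particular product does it: the function name `freshness` and the 30-day half-life are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 30.0  # assumption: a fact loses half its weight every 30 days


def freshness(last_refreshed: datetime, now: datetime,
              half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Return a score in (0, 1]: 1.0 when just refreshed, halving every half-life."""
    age_days = max((now - last_refreshed).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


now = datetime(2026, 6, 11)
yesterday = freshness(now - timedelta(days=1), now)
last_year = freshness(now - timedelta(days=365), now)

# The fact mentioned yesterday is far "louder" than the one from a year ago.
print(f"yesterday: {yesterday:.3f}, a year ago: {last_year:.5f}")
```

Nothing is deleted here. The old fact simply scores so low that it stops winning when the system goes looking, and mentioning it again would reset `last_refreshed` and bring it back to full volume.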

The everyday version is the loft clear-out. You haul a box down, look at it, decide most of it can go. You did not delete the memories themselves - if you needed a specific one back, you could remember the box existed. You demoted them, which is what decay does mechanically. The fact still exists; it just no longer shouts when you go looking.

What happens when two memories disagree?

Contradiction is the failure mode that produces the strangest AI behaviour. You told the assistant in January that you have one cat. You told it in May that you have two. A naive retrieval grabs both and either picks one essentially at random or, even worse, returns both and lets the model pick which to believe.

The right behaviour is for the newer fact to supersede the older one, with a record that the older one existed in case it turns out to be the correct one after all. This is exactly how editing works in a notebook - you cross things out rather than rip the page, so you can change your mind back later.

Building this properly is harder than it sounds. "I have two cats now" is easy to spot as superseding "I have one cat". "I might get a second cat eventually" is not. Real systems use timestamps, similarity scores, and small models to judge whether two facts genuinely contradict or just sit alongside each other.
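The easy half of this, "cross out rather than rip the page", can be sketched in a few lines. This is a toy under a big assumption: that two facts conflict whenever they share the same `subject` key. The hard part described above, deciding whether two sentences genuinely contradict, is not attempted here. `Fact`, `FactLog`, and the `cats.count` key are hypothetical names for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str                            # hypothetical key, e.g. "cats.count"
    value: str
    stated_on: date
    superseded_by: Optional["Fact"] = None  # crossed out, not ripped out


class FactLog:
    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def current(self, subject: str) -> Optional[Fact]:
        """Return the one fact for this subject that has not been crossed out."""
        for fact in self.facts:
            if fact.subject == subject and fact.superseded_by is None:
                return fact
        return None

    def history(self, subject: str) -> list[Fact]:
        """Return every fact ever stored for this subject, crossed out or not."""
        return [f for f in self.facts if f.subject == subject]

    def add(self, new: Fact) -> None:
        # Assumption: same subject means the two facts conflict.
        existing = self.current(new.subject)
        if existing is not None:
            if new.stated_on >= existing.stated_on:
                existing.superseded_by = new   # newer statement wins
            else:
                new.superseded_by = existing   # late-arriving old statement loses
        self.facts.append(new)


log = FactLog()
log.add(Fact("cats.count", "1", date(2026, 1, 10)))
log.add(Fact("cats.count", "2", date(2026, 5, 3)))
print(log.current("cats.count").value)
print(len(log.history("cats.count")))
```

Because the January fact is still in `history`, the system can change its mind back later if the May statement turns out to be the mistake.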

Why does confidence matter for AI memory?

Not every fact is equally well-supported. "My partner is allergic to peanuts" is a hard fact that you have stated several times. "I think the new restaurant on the corner is Korean, maybe" is a guess you mentioned once.

A well-built memory system attaches a confidence score to each fact. Strong facts get retrieved with weight; weak facts are flagged as uncertain. When the assistant uses them, it can hedge appropriately - or, better, ask. The worst behaviour is for a low-confidence fact to be retrieved and then used with the same swagger as a high-confidence one.

This is the difference between a knowledgeable friend and a know-it-all. The knowledgeable friend says "I think" when they only think; the know-it-all does not. The thing that makes AI useful in everyday life is the same thing that makes a friend useful - knowing when to be sure and when to ask.
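As a rough sketch of how a confidence score might steer behaviour, the example below lets repetition raise a fact's confidence and then maps the score to one of three actions. The thresholds, the `reinforce` rule, and the starting values are all assumptions made up for illustration.

```python
from dataclasses import dataclass

ASSERT_THRESHOLD = 0.8  # assumption: at or above this, state the fact plainly
ASK_THRESHOLD = 0.4     # assumption: below this, ask rather than guess


@dataclass
class ScoredFact:
    text: str
    confidence: float  # 0.0 (pure guess) to 1.0 (stated firmly and often)


def reinforce(fact: ScoredFact, weight: float = 0.5) -> None:
    """Each repeat moves confidence part of the way towards 1.0."""
    fact.confidence += (1.0 - fact.confidence) * weight


def how_to_use(fact: ScoredFact) -> str:
    """Return 'assert', 'hedge', or 'ask' depending on how well-supported the fact is."""
    if fact.confidence >= ASSERT_THRESHOLD:
        return "assert"
    if fact.confidence >= ASK_THRESHOLD:
        return "hedge"
    return "ask"


allergy = ScoredFact("Partner is allergic to peanuts", confidence=0.7)
reinforce(allergy)  # the user has said it again
guess = ScoredFact("New restaurant on the corner may be Korean", confidence=0.3)

print(how_to_use(allergy), how_to_use(guess))
```

The point of the middle band is the "I think" behaviour: the fact is still used, but the assistant is told to flag it as uncertain rather than deliver it with swagger.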

How does compression keep things manageable?

If you store every single thing the user has ever said, three things happen. Storage costs grow forever. Retrieval gets slower as the haystack grows. And the signal-to-noise ratio drops, because most of what anyone has said is forgettable.

Compression is the answer. Instead of keeping seventeen separate notes that "the user prefers spicy food", you keep one. Instead of keeping every casual mention of the dog over three years, you keep a richer single record - the dog's name, breed, age, vet, food preferences - and discard the noise.

The kitchen-drawer analogy is the right one again. You do not throw out the spatulas; you tidy them into a tray. When you next need a spatula, you find it instantly rather than knocking over a corkscrew, a roll of cling film, and three loyalty cards on the way. The information is the same; the organisation is what makes it useful.
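A minimal sketch of the tidy-into-a-tray idea follows. It assumes the messy notes have already been boiled down to (entity, attribute, value) triples, which in a real system is the hard part and would probably need a small model. The dog's details are invented for the example.

```python
from collections import defaultdict


def compress(notes: list[tuple[str, str, str]]) -> dict[str, dict[str, str]]:
    """Fold (entity, attribute, value) notes into one record per entity.

    Notes are assumed to be in the order they were said, so a later note
    overwrites an earlier one for the same attribute.
    """
    records: dict[str, dict[str, str]] = defaultdict(dict)
    for entity, attribute, value in notes:
        records[entity][attribute] = value
    return dict(records)


notes = [
    ("user", "food", "prefers spicy food"),
    ("dog", "name", "Biscuit"),            # hypothetical details
    ("user", "food", "prefers spicy food"),
    ("dog", "breed", "border terrier"),
    ("dog", "food", "chicken kibble"),
    ("user", "food", "prefers spicy food"),
    ("dog", "food", "salmon kibble"),      # a later mention replaces the earlier one
]

records = compress(notes)
print(records)
```

Seven notes go in and two records come out: one line about spicy food instead of three, and a single richer record for the dog.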

When should memories expire?

Some facts have a clear sell-by date. "I am on holiday next week" is true until the following Monday and then quietly false forever. "My fridge needs servicing" is true until it has been serviced. "My eldest daughter starts secondary school in September" is correct for one specific autumn and never again.

Expiry is the rule that says these memories should die when their time is up, not linger as ghosts. Without it, an assistant happily reminds you about a trip you took last year, or recommends a fridge service you had done six months ago, or refers to your daughter as starting school when she is now in Year 9.

Most current AI memory systems are weak at this. They either keep everything forever or they delete things by age, which is too blunt. The mature design lets the system attach an expiry to facts when they are stored, so a trip booking auto-expires after the dates pass and a school year auto-expires the following summer.
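Attaching an expiry at storage time might look something like the sketch below. The `Memory` class and `expires_on` field are hypothetical names, and the example only covers date-based expiry. A fact like "my fridge needs servicing", which expires when an event happens rather than on a date, would need something more.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Memory:
    text: str
    expires_on: Optional[date] = None  # None means no sell-by date


def live_memories(memories: list[Memory], today: date) -> list[Memory]:
    """Keep memories with no expiry, or whose expiry date has not yet passed."""
    return [m for m in memories
            if m.expires_on is None or today <= m.expires_on]


memories = [
    Memory("Allergic to nuts"),
    Memory("On holiday next week", expires_on=date(2026, 6, 21)),
    Memory("Eldest daughter starts secondary school in September",
           expires_on=date(2026, 9, 30)),
]

for m in live_memories(memories, today=date(2026, 7, 1)):
    print(m.text)
```

Unlike deleting by age, this leaves the nut allergy alone however old it gets, while the holiday drops out as soon as its dates have passed.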

What does this mean for running local AI at home?

If you have followed our guide to running a local LLM on a Raspberry Pi 5 or experimented with local image generators, you have probably found that the "out of the box" memory experience is poor. The model forgets things you would expect it to remember and remembers things you wish it had forgotten.

This is not a sign your hardware is too small. It is the same memory-design problem affecting cloud assistants, just visible at the seams because there is no big corporate plumbing hiding it. Several open-source projects are now experimenting with the five rules above. A common shape is a small SQLite database holding plain-text notes alongside the model, with a tiny scoring script that handles decay, confidence, and expiry. The Towards Data Science write-up that prompted this article describes one such design in detail, and the broader research community has been moving in this direction for some time.

The practical upshot for home users is that local AI memory will get noticeably better over the next year as these designs mature. If you are running anything memory-heavy today - a personal assistant, a home automation helper, a research tool - it is worth keeping the install pattern flexible so you can swap in a better memory backend without rebuilding everything else.

Frequently asked questions

Q01 Why does ChatGPT sometimes forget things I told it last week?

Memory in consumer AI products is selective by design. The system stores some interactions and not others, applies its own decay logic, and is bounded by storage limits. From a user perspective the rules are not transparent, which is why the forgetting feels arbitrary. The product is making decisions about what is worth keeping; sometimes those decisions go against what you would have picked.

Q02 Can I see what an AI assistant is remembering about me?

On the major consumer products, yes. ChatGPT, Claude, and Gemini all have a memory or personalisation panel in settings where you can review, edit, or delete stored facts. It is worth a look every few months - you will often find a mix of accurate facts, stale ones, and occasional misinterpretations that need correcting or removing.

Q03 Is local AI memory more private than cloud AI memory?

Generally, yes. If the memory store lives only on your own device and nothing syncs it elsewhere, no third party holds it. The trade-off is that you are responsible for backup, security, and the quality of the memory design itself. Running locally is the right choice for sensitive material; running in the cloud is the right choice for convenience.

Q04 Will future AI assistants finally get memory right?

They will get it less wrong. There is no single "right" - what counts as useful memory depends on the user, the context, and the task. The improvement curve will look like the early days of search engines: many specialised approaches, gradual standardisation, then a sudden jump when one design clearly wins. We are mid-experiment.

Q05 What's the difference between AI memory and a vector database?

A vector database is one type of storage that can hold AI memory. It is good at "find me things semantically similar to this". It is not, on its own, a complete memory system. Decay, contradiction handling, confidence, compression, and expiry are all things you have to build around the vector database. Treating the database as the memory is the mistake.

Related guides

Context Rot: Why Long AI Sessions Get Worse

Local LLM on Raspberry Pi 5

ChatGPT vs Claude vs Gemini

When to Trust AI Answers