How Humans Learn: A Plain Tour of Memory

By Pritesh Yadav

Before we can build software that helps a person learn, we have to understand the machine we are trying to help: the human brain. Here is the surprising truth that shapes everything in this book — the brain is brilliant at storing things forever, but terrible at holding more than a tiny amount at once, and it forgets new things almost immediately unless we do something about it. Every clever technique later in this book is really just a workaround for these three facts. So let's take a calm, plain-English tour of how memory actually works.

3.1 Two memories, not one

Your memory is not one big container. It comes in two very different parts that work together.

Working memory is the small space where you actively think about whatever is in front of you right now: the sentence you are reading, the numbers you are adding. It has two cruel limits. First, it is tiny. George Miller famously suggested in 1956 that it holds about seven items, give or take two, but later work by Nelson Cowan (2001) argued that the real limit is closer to four separate items once you stop people from silently repeating them to themselves. Second, it empties within seconds once your attention wanders.

Long-term memory is the opposite: effectively unlimited and effectively permanent. It is where knowledge lives once it has truly stuck. Learning, in one sentence, is the act of moving information from the cramped working memory into the vast long-term store.

Analogy: Working memory is like a small kitchen counter: you can only lay out a few ingredients at a time, and someone wipes it clean every few seconds. Long-term memory is the giant pantry in the back: huge, organized, and durable. Cooking (thinking) happens on the counter, but everything is stored in the pantry.

3.2 Chunking: how experts cheat the four-item limit

If working memory only holds about four items, how does a chess master glance at a board and remember every piece? The trick is chunking — grouping small items into one bigger, meaningful unit. To you, the letters C-A-T are three items; once you read them as the word "CAT," they become a single chunk. The expert isn't holding more raw items; they pull whole patterns out of long-term memory at once.

Example: The phone number 4 1 5 5 5 5 0 1 7 2 is ten separate items, far more than working memory can comfortably hold. Written as 415-555-0172 it becomes three friendly chunks. Nothing changed except the grouping. Good teaching does exactly this regrouping for the learner.
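For readers who think in code, here is a minimal sketch of that regrouping. The function name `chunk_digits` and the 3-3-4 grouping are assumptions made for illustration; the point is only that the digits stay the same while the number of units to hold drops.

```python
def chunk_digits(digits: str, sizes: tuple[int, ...] = (3, 3, 4)) -> list[str]:
    """Split a digit string into consecutive groups of the given sizes.

    The default 3-3-4 grouping is an illustrative assumption that matches
    the phone-number example in the text.
    """
    if sum(sizes) != len(digits):
        raise ValueError("group sizes must add up to the number of digits")
    chunks = []
    start = 0
    for size in sizes:
        chunks.append(digits[start:start + size])
        start += size
    return chunks


raw = "4155550172"
chunks = chunk_digits(raw)
# Same digits, fewer units to hold in mind.
print(len(raw), "items ->", len(chunks), "chunks:", "-".join(chunks))
```

Run as written, this prints `10 items -> 3 chunks: 415-555-0172`.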

This is the single most important idea for anyone designing an AI tutor: every lesson has to fit through a roughly four-item doorway. Dump ten new ideas on someone at once and working memory overflows — the learner sweats, tries hard, and almost nothing reaches the pantry.
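One way a tutor might respect that doorway is to cap how many new ideas each step of a lesson introduces. The sketch below is only an assumption about how that could look: `split_into_steps` and `MAX_NEW_IDEAS` are hypothetical names, and slicing a list in order ignores which ideas naturally belong together, which a real lesson plan would have to consider.

```python
MAX_NEW_IDEAS = 4  # assumption: the rough working-memory limit used in this chapter


def split_into_steps(ideas: list[str], limit: int = MAX_NEW_IDEAS) -> list[list[str]]:
    """Break a flat list of new ideas into steps of at most `limit` ideas each."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [ideas[i:i + limit] for i in range(0, len(ideas), limit)]


# Ten new ideas are too many for one sitting, so spread them over several steps.
ideas = [f"idea {n}" for n in range(1, 11)]
steps = split_into_steps(ideas)
for number, step in enumerate(steps, start=1):
    print(f"Step {number}: {', '.join(step)}")
```

With ten ideas and a limit of four, this yields three steps holding four, four, and two ideas.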

3.3 The three steps every memory goes through

Getting something into long-term memory and back out again happens in three stages. Keep these words; the whole book leans on them.

  • Encoding — getting information in: paying attention to it and connecting it to things you already know.
  • Storage — keeping it filed away over time.
  • Retrieval — pulling it back out when you need it.

   NEW INFO
      |
      v
 [ WORKING MEMORY ]   <- tiny (~4 chunks), wiped fast
      |  encoding (attention + linking)
      v
 [ LONG-TERM MEMORY ] <- huge, durable
      ^
      |  retrieval (pulling it back out)
      |
   WHEN YOU NEED IT

Here is a quiet but powerful point we will return to: retrieval is not just a way of checking what you stored — the act of pulling something out actually strengthens it. Every time you successfully recall a fact, the path to it gets easier to find next time. That one finding flips ordinary studying on its head, and Chapter 5 is built entirely on it.

3.4 Schemas: the brain's filing system

Long-term memory does not store facts as a random pile. It organizes them into schemas — mental models, or organized patterns of knowledge about how something works. You have a "restaurant schema": you walk in already expecting a host, a menu, a bill at the end. You never have to relearn it.

Schemas are the secret to beating the four-item limit. New information that connects to an existing schema arrives as one chunk instead of many. This is why an expert learns new material in their field faster than a beginner: they already have rich schemas to hang it on. A good lesson — and a good AI tutor — always reaches for what the learner already knows, so new material lands as fewer, bigger chunks.

Tip: When you teach (or design a tutor), start every new idea by connecting it to something the learner already understands. "You know how a thermostat works? A feedback loop is the same idea…" That link turns a hard new chunk into an easy extension of an old one.

3.5 The forgetting curve: forgetting is the default

In the 1880s Hermann Ebbinghaus ran experiments on himself, memorizing lists of nonsense syllables like "WID" and "ZOF" and testing how much he kept after various delays. He found that forgetting is fast and predictable: in his data, more than half of the new material was lost within about an hour, and roughly two-thirds within a day. Then the loss slows down. Drawn on a graph, this makes a steep drop that flattens out: the forgetting curve.

 memory
 100% |*
      | *
      |   *
      |     *  . . . . . forgetting curve
      |        *  .
      |           *   .  . . . .
      |              *      .       .
   0% +-------------------------------> time
       1hr   1 day        1 week

The crucial second half of his discovery: every time you review and successfully recall the material, the curve resets and gets flatter — you forget more slowly each time. So forgetting is not a personal weakness or a sign you're "bad at learning." It is simply the normal behavior of human memory, and it has a cure.
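To make "flatter after each review" concrete, here is a toy model. It assumes retention decays as R = exp(-t / S), where S is a "stability" value in hours, and that each successful review multiplies S by a fixed factor. Both assumptions are simplifications chosen for illustration: this is not Ebbinghaus's fitted curve, and the starting stability of 24 hours and the growth factor of 2 are made-up numbers.

```python
import math


def retention(hours_elapsed: float, stability_hours: float) -> float:
    """Predicted fraction still remembered under the toy model R = exp(-t / S)."""
    return math.exp(-hours_elapsed / stability_hours)


def stability_after_reviews(initial_stability: float, reviews: int,
                            growth: float = 2.0) -> float:
    """Assume every successful review multiplies stability by `growth`."""
    return initial_stability * growth ** reviews


# Compare predicted retention one day after the latest review.
for reviews in range(4):
    s = stability_after_reviews(24.0, reviews)
    print(f"{reviews} reviews: {retention(24.0, s):.0%} predicted after one day")
```

Under these assumptions the predicted one-day retention climbs with every review, which is the same story the curve tells: the drop gets shallower each time.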

Analogy: A new memory is like a path worn through tall grass. Walk it once and the grass springs back overnight. Walk it again before it fully recovers, and the path stays clearer for longer. Walk it repeatedly over weeks and it becomes a permanent trail that barely fades. Each review is one more walk down the path.

This is exactly why an AI tutor cannot just teach a thing once and walk away. No matter how clear the lesson, most of it is gone by tomorrow. The forgetting curve tells the tutor when to step back in — right as a memory is about to fade.
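A tutor could turn "right as a memory is about to fade" into a number. The sketch below assumes a simplified exponential model, R = exp(-t / S), with a hypothetical stability S in hours, and solves it for the time at which predicted retention falls to a chosen target. The function name, the 80% target, and the 24-hour stability are all illustrative assumptions, not values from any particular study or product.

```python
import math


def hours_until_review(stability_hours: float, target_retention: float = 0.8) -> float:
    """Time at which R = exp(-t / S) drops to `target_retention`.

    Solving exp(-t / S) = target for t gives t = -S * ln(target).
    """
    if stability_hours <= 0:
        raise ValueError("stability must be positive")
    if not 0.0 < target_retention < 1.0:
        raise ValueError("target retention must be between 0 and 1")
    return -stability_hours * math.log(target_retention)


# A fresh memory (low stability) needs attention sooner than a well-reviewed one.
print(round(hours_until_review(24.0), 1))  # fresh memory
print(round(hours_until_review(96.0), 1))  # after a few reviews
```

The design choice worth noticing is that the wait scales with stability: a memory that has been reviewed more often can be left alone for longer before the tutor steps back in.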

3.6 Why understanding the brain is the foundation of everything

Three brain facts drive three families of technique you'll meet in the coming chapters. Here is the map.

Brain fact                              | The problem it causes                                   | The technique (preview)
----------------------------------------+---------------------------------------------------------+-------------------------------------------------------------
Working memory is tiny (~4 chunks)      | Showing too much at once overflows it; effort is wasted | Cognitive load: chunk, sequence, strip clutter (Chapter 4)
Retrieval strengthens memory            | Re-reading feels productive but builds little           | Retrieval practice: quiz, recall, explain back (Chapter 5)
Memory fades along the forgetting curve | Teach once and it's mostly gone tomorrow                | Spacing: review at growing intervals (Chapter 6)

Let's preview each briefly so the names feel familiar later.

Cognitive load — protecting the tiny workbench

Because working memory is so small, everything you ask a learner to think about competes for the same scarce space. Some of that effort is the real difficulty of the subject (algebra is just harder than addition). Some of it is wasted effort caused by bad presentation — clutter, jargon, confusing layout, hunting for the answer. The whole craft is to cut the wasted effort to near zero so the freed-up space goes to real learning. That's Chapter 4.

Retrieval — pulling out, not pushing in

Henry Roediger and Jeffrey Karpicke (2006) showed that students who tested themselves remembered more a week later than students who spent the same study time rereading the material, even though the rereaders felt more confident. Pulling information out is the thing that builds durable memory. That's Chapter 5.

Spacing — reviewing right before you'd forget

Spreading study out over time beats cramming the same minutes into one sitting. Reviewing each item just as it's about to fade strengthens it the most. That's the engine behind flashcard apps, and it's Chapter 6.
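Here is a minimal sketch of "growing intervals" as a schedule. It assumes the gap between reviews simply doubles each time, starting at one day. That rule is an illustration only: `review_dates` is a hypothetical helper, and real flashcard apps typically adjust each interval based on how well the learner answered.

```python
from datetime import date, timedelta


def review_dates(first_study: date, first_gap_days: int = 1,
                 growth: int = 2, reviews: int = 5) -> list[date]:
    """Return review dates whose gaps grow by `growth` after every review."""
    dates = []
    gap = first_gap_days
    current = first_study
    for _ in range(reviews):
        current = current + timedelta(days=gap)
        dates.append(current)
        gap *= growth  # each gap is longer than the last
    return dates


# Gaps of 1, 2, 4, 8 and 16 days after the first study session.
for d in review_dates(date(2024, 1, 1)):
    print(d.isoformat())
```

Five reviews under this rule cover a full month, while the total time spent reviewing stays about the same as five sessions crammed into one evening.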

Common mistake: Trusting how things feel. Rereading and highlighting feel smooth and familiar, so learners conclude "I know this." In one widely cited survey of college students, more than 80% reported rereading as a study strategy and over half ranked it as their top one, despite it being one of the weakest. That smooth feeling is just recognition of the surface, not real retrievable knowledge. The methods that actually work (recalling, spacing) feel harder and make you stumble, so people wrongly judge them as worse. Throughout this book, never confuse "feels easy" with "is learned."

One last gentle warning that ties it together: good learning often feels uncomfortable. A bit of struggle — trying to recall before being told, mixing problem types — slows you down in the moment but builds memory that lasts. Researchers call these "desirable difficulties." The catch is in the word desirable: a struggle only helps if the learner has enough background to push through it. Pile difficulty on someone missing the basics and you just get frustration. Calibrating that balance, learner by learner, is exactly where a well-built AI tutor can shine — and where the rest of this book is headed.

Key takeaways
  • Memory has two parts: a tiny, fast-emptying working memory (~4 chunks) and a vast, durable long-term memory. Learning is moving information from the first into the second.
  • Chunking and schemas (prior knowledge) let learners squeeze more through the narrow working-memory doorway — always connect new ideas to what's already known.
  • Memory works in three steps — encoding, storage, retrieval — and the act of retrieval itself strengthens what you know.
  • The forgetting curve means new material is mostly gone within a day unless reviewed; each successful review flattens the curve, so spaced review is a necessity, not a nicety.
  • These brain facts justify the three core techniques ahead: managing cognitive load, retrieval practice, and spacing — and warn us never to trust the comfortable feeling of fluency.
