Knowledge Graphs and Curriculum Generation

By Pritesh Yadav

Imagine you hand a smart but disorganized friend a giant pile of facts about Python programming and say, "Teach me to be a developer in four months." If they just read the pile to you in the order it happens to be stacked, you'll drown. You'll hit "decorators" before you know what a function is, or "databases" before you can write a loop. Good teaching is not just having the knowledge — it's knowing what depends on what, and laying it out in the right order, sized to where the learner is right now.

This chapter is about giving your AI tutor that sense of order. We'll cover two connected ideas: the knowledge graph (a map of a subject's concepts and which ones must come first) and curriculum generation (turning a learner's personal goal into a step-by-step path across that map). And because letting a language model draw the map unsupervised is risky, we'll see exactly where a human expert has to step in.

What a knowledge graph is

A knowledge graph is simply a map of a subject broken into small concepts, with arrows showing which concepts you need before you can learn another. Each small concept is often called a knowledge component — a single teachable idea, like "what a variable is" or "how a for-loop repeats." The arrows are prerequisites: an arrow from A to B means "you should understand A before B."

Analogy: Think of a video-game skill tree. You can't unlock the fireball spell until you've earned the basic-magic node that feeds into it. A knowledge graph is that skill tree for a subject — the tutor walks you up it, never offering a branch whose earlier nodes aren't lit yet.

Here's a tiny slice of a Python graph. Read the arrows as "comes before":

   Variables
      |
      v
   Data types ---> Lists & loops ---> Functions
                        |                |
                        v                v
                   Dictionaries     Decorators

This structure is not a nice-to-have. It is what makes "what should I learn next?" a principled decision instead of a random one. Real systems illustrate the point: the adaptive math platform ALEKS (short for Assessment and LEarning in Knowledge Spaces) maps middle-school math into roughly 1,000 linked concepts, and China's Squirrel AI famously broke the same subject into over 10,000 fine-grained "knowledge points" so it could pinpoint a learner's exact gap. Finer maps diagnose more precisely, but each extra node costs real expert effort to author and maintain.

Tip: Granularity is a design dial, not a free upgrade. Start coarse (a few dozen concepts) so a human can actually verify the whole map, then split nodes only where learners keep getting stuck and you need finer aiming.
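To make the map concrete, here is a minimal sketch of the Python slice drawn above as plain data. The dictionary layout and the names PREREQS and all_prerequisites are assumptions made for illustration, not a standard format; the concept names come from the diagram.

```python
# Each concept maps to the set of concepts that must come directly before it.
# The layout is an illustrative assumption; the concepts mirror the diagram.
PREREQS = {
    "Variables": set(),
    "Data types": {"Variables"},
    "Lists & loops": {"Data types"},
    "Functions": {"Lists & loops"},
    "Dictionaries": {"Lists & loops"},
    "Decorators": {"Functions"},
}


def all_prerequisites(concept, prereqs):
    """Return every concept needed before `concept`, directly or indirectly."""
    found = set()
    stack = list(prereqs[concept])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(prereqs[current])
    return found


print(sorted(all_prerequisites("Decorators", PREREQS)))
# ['Data types', 'Functions', 'Lists & loops', 'Variables']
```

Storing "what comes before me" on each node (rather than "what comes after me") keeps the backward walks used throughout this chapter short and direct.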

Why prerequisites are the heart of it

The single most useful thing a graph does is let the tutor find the root cause of a failure. Suppose a learner keeps getting fraction equations wrong. A naive tutor drills more fraction equations — and the learner keeps failing, getting demoralized. A graph-aware tutor follows the arrows backward, discovers the real weakness is plain adding fractions, and teaches that instead. You fix the foundation, and the upper floor stops collapsing.
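A rough sketch of that backward walk, under stated assumptions: the four-concept math graph, the mastered set, and the function name root_causes are all hypothetical, and a real tutor would work from a probabilistic estimate of mastery rather than a simple yes/no set.

```python
def root_causes(failing, prereqs, mastered):
    """Follow prerequisite arrows backward from a failing concept.

    Returns the deepest unmastered concepts: the ones whose own
    prerequisites are all mastered, so they can be taught right away.
    """
    causes = set()
    seen = set()
    stack = [failing]
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        gaps = [p for p in prereqs[concept] if p not in mastered]
        if gaps:
            stack.extend(gaps)       # keep digging toward the foundation
        elif concept != failing:
            causes.add(concept)      # unmastered, but ready to be taught
    # If no prerequisite is missing, the failing concept itself is the gap.
    return causes or {failing}


# Hypothetical slice of a math graph (concept -> direct prerequisites).
MATH = {
    "Adding whole numbers": set(),
    "Adding fractions": {"Adding whole numbers"},
    "Solving simple equations": {"Adding whole numbers"},
    "Fraction equations": {"Adding fractions", "Solving simple equations"},
}
mastered = {"Adding whole numbers", "Solving simple equations"}

print(root_causes("Fraction equations", MATH, mastered))
# {'Adding fractions'}
```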

This connects directly to ideas from earlier chapters. The right next step lives in the learner's Zone of Proximal Development — the sweet spot just beyond what they can already do alone, hard enough to grow but reachable with a little help. The graph tells you which steps are even reachable (all their prerequisites are met); the learner model tells you which ones are still needed. ALEKS gives this reachable-and-needed set a name: the outer fringe, the concepts a learner is "ready to learn" right now.
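The "ready to learn" set can be approximated in a few lines. This is only a sketch of the idea, not ALEKS's actual algorithm; the function name outer_fringe and the yes/no mastered set are simplifying assumptions.

```python
PREREQS = {
    "Variables": set(),
    "Data types": {"Variables"},
    "Lists & loops": {"Data types"},
    "Functions": {"Lists & loops"},
    "Dictionaries": {"Lists & loops"},
    "Decorators": {"Functions"},
}


def outer_fringe(prereqs, mastered):
    """Concepts not yet mastered whose prerequisites are all mastered."""
    return {
        concept
        for concept, needs in prereqs.items()
        if concept not in mastered and needs <= mastered  # subset test
    }


print(outer_fringe(PREREQS, {"Variables", "Data types"}))
# {'Lists & loops'}
```

Once "Lists & loops" is mastered too, the same call returns both "Functions" and "Dictionaries", and the tutor gets to choose between them.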

From a goal to a personal path

Now the second half: curriculum generation. A learner arrives with a goal in plain English — "become a Python web developer in four months, I already know a little HTML." Your job is to turn that into an ordered path. Here is the recipe:

  1. Pin down the destination. Translate the fuzzy goal into concrete, checkable objectives. Not "understand Python" but "can build and deploy a small web app with a database." Objectives must use observable, do-able verbs (build, write, debug), because the tutor needs a clear finish line to know when the learner has arrived.
  2. Find the start. A short diagnostic quiz marks which concepts the learner already owns, so you don't waste their four months re-teaching HTML.
  3. Select the needed concepts. Walk the graph backward from the goal's concepts, collecting every prerequisite that isn't already mastered. This is the learner's personal subset of the map.
  4. Order them safely. Sort that subset so every concept appears after all the things it depends on. (Computer scientists call this a topological sort; in plain terms, "never teach B before its arrow-parents A.")
  5. Pace it. Slice the ordered list into weeks that fit the four-month budget and respect the limits of working memory: a few new ideas at a time, not ten.

Example: Two learners give the identical goal "Python web developer in 4 months." One already codes in JavaScript; the other has never programmed. Same destination, same graph, but the diagnostic produces two very different paths. The first skips loops and functions and starts near web frameworks; the second begins at variables. The graph is shared; the curriculum is personal.
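Steps 3 to 5 of the recipe can be sketched with Python's standard graphlib module (Python 3.9+). Everything else here is an assumption for illustration: the function name build_curriculum, the per_week pacing knob, and the use of "Decorators" as a stand-in goal on the tiny graph from earlier. Fixed-size weekly chunks are a crude placeholder for real pacing.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

PREREQS = {
    "Variables": set(),
    "Data types": {"Variables"},
    "Lists & loops": {"Data types"},
    "Functions": {"Lists & loops"},
    "Dictionaries": {"Lists & loops"},
    "Decorators": {"Functions"},
}


def build_curriculum(goal_concepts, prereqs, mastered, per_week=2):
    # Step 3: walk backward from the goal, collecting unmastered concepts.
    needed = set()
    stack = [c for c in goal_concepts if c not in mastered]
    while stack:
        concept = stack.pop()
        if concept in needed:
            continue
        needed.add(concept)
        stack.extend(p for p in prereqs[concept] if p not in mastered)

    # Step 4: topological sort of just the needed subset.
    subgraph = {c: prereqs[c] & needed for c in needed}
    ordered = list(TopologicalSorter(subgraph).static_order())

    # Step 5: slice into weeks of at most `per_week` new concepts.
    return [ordered[i:i + per_week] for i in range(0, len(ordered), per_week)]


beginner = build_curriculum({"Decorators"}, PREREQS, mastered=set())
experienced = build_curriculum(
    {"Decorators"},
    PREREQS,
    mastered={"Variables", "Data types", "Lists & loops", "Functions"},
)
print(beginner)
# [['Variables', 'Data types'], ['Lists & loops', 'Functions'], ['Decorators']]
print(experienced)
# [['Decorators']]
```

Same goal, same graph, two different paths: only the mastered set changed. Note that "Dictionaries" never appears, because the stand-in goal does not depend on it.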

Sequencing: the rule that prevents chaos

Sequencing by prerequisites is the one rule you cannot break. If your generated path ever places a concept before something it depends on, every later lesson built on it wobbles. The topological sort guarantees a valid order as long as the graph contains no circular dependencies, but there's usually more than one valid order, and that's where good judgment (and a sprinkle of learning science) comes in. Once the foundations are solid, you can interleave related skills (mix problem types instead of drilling one to death) so the learner practices choosing the right approach, and you can schedule spaced reviews of shaky earlier concepts so they don't fade. The graph defines the legal orders; pedagogy picks the best one.
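Because pedagogy may reshuffle a path after the sort, it helps to re-check the result against the one unbreakable rule. The checker below is a minimal sketch; the name first_violation and its return shape are assumptions chosen for this example.

```python
PREREQS = {
    "Variables": set(),
    "Data types": {"Variables"},
    "Lists & loops": {"Data types"},
    "Functions": {"Lists & loops"},
    "Dictionaries": {"Lists & loops"},
    "Decorators": {"Functions"},
}


def first_violation(path, prereqs, mastered=frozenset()):
    """Return (concept, missing_prerequisite) for the first ordering error,
    or None if every concept appears after everything it depends on."""
    known = set(mastered)
    for concept in path:
        missing = sorted(prereqs[concept] - known)
        if missing:
            return concept, missing[0]
        known.add(concept)
    return None


# Two different orders, both legal: the graph allows either.
order_a = ["Variables", "Data types", "Lists & loops",
           "Functions", "Dictionaries", "Decorators"]
order_b = ["Variables", "Data types", "Lists & loops",
           "Dictionaries", "Functions", "Decorators"]
print(first_violation(order_a, PREREQS))  # None
print(first_violation(order_b, PREREQS))  # None

# An illegal order: Functions is taught before Lists & loops.
bad = ["Variables", "Data types", "Functions", "Lists & loops"]
print(first_violation(bad, PREREQS))
# ('Functions', 'Lists & loops')
```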

How LLMs help — and where they must be checked

Building a knowledge graph by hand is slow, expert work. This is exactly where a large language model (LLM), an AI trained to read and write human language, earns its keep. You can ask it to draft the first version of the map (listing the concepts in a subject and proposing the prerequisite arrows) and even to sketch a week-by-week curriculum for a given goal. It's a tireless junior author that produces a complete first draft in seconds.

But a first draft is not a finished map. LLMs hallucinate — they state wrong things with total confidence — and they are weak at the precise, structural reasoning a dependency graph demands. Left alone, an LLM will happily invent a prerequisite that's backwards, miss a crucial dependency, or place an advanced topic too early. In a learning product, a confident wrong answer is worse than no answer, because the learner trusts it and builds on the mistake.

Common mistake: Treating the LLM-drafted graph as the source of truth and shipping it straight to learners. The cautionary tale of the heavily funded "robot tutor in the sky" Knewton is the warning: confident algorithms and big claims are not the same as real learning. A government study found its adaptive course produced no significant gains. The systems that lasted, like Carnegie Mellon's Cognitive Tutor, were slow, theory-grounded, and independently checked.

So the workflow is a partnership, not a handoff:

Step          | LLM does the heavy lifting                  | Human expert verifies
List concepts | Drafts the full set of knowledge components | Removes duplicates, fills real gaps
Draw arrows   | Proposes prerequisite links                 | Fixes backwards or missing dependencies
Generate path | Sequences a curriculum for the goal         | Confirms order is sound and well-paced
Write lessons | Drafts explanations, examples, quizzes      | Checks facts, fixes quiz answers

A practical safeguard: don't let the model improvise from memory. Use retrieval-augmented generation (RAG) — feed it the actual source material (a textbook, a course syllabus) and instruct it to build the graph and lessons only from that grounded text, with citations back to the page. This sharply reduces invented prerequisites and gives a human a fast way to spot-check each claim against the source.

Tip: Make the expert's job easy, not heroic. Have the LLM output the graph as a simple list of "A must come before B" statements a teacher can skim and tick off. Reviewing a clean list of arrows is far faster than auditing a tangled diagram, and it's where the highest-value human attention belongs.
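A list of arrows is also easy for a program to pre-screen before the expert sees it. The sketch below assumes a hypothetical line format, "A must come before B", one statement per line; the function names are invented for this example. It uses the standard graphlib module to catch circular dependencies, which no valid curriculum can satisfy.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def parse_arrows(text):
    """Turn 'A must come before B' lines into {concept: set_of_prerequisites}."""
    prereqs = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        before, sep, after = line.partition(" must come before ")
        if not sep:
            raise ValueError(f"Unrecognized line: {line!r}")
        before, after = before.strip(), after.strip()
        prereqs.setdefault(before, set())
        prereqs.setdefault(after, set()).add(before)
    return prereqs


def find_cycle(prereqs):
    """Return a list of concepts forming a cycle, or None if there is none."""
    try:
        TopologicalSorter(prereqs).prepare()
    except CycleError as err:
        return err.args[1]  # graphlib reports the cycle as a list of nodes
    return None


draft = """
Variables must come before Data types
Data types must come before Lists & loops
Lists & loops must come before Functions
Functions must come before Decorators
"""
graph = parse_arrows(draft)
print(find_cycle(graph))  # None

broken = parse_arrows(draft + "Decorators must come before Functions\n")
print(find_cycle(broken) is not None)  # True
```

A check like this only catches structural impossibilities. An arrow that is backwards but does not create a loop passes silently, which is exactly why the human review stays in the workflow.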

Putting it together

The graph is the stable, reusable map of a whole subject — built once with the LLM drafting and an expert verifying. The curriculum is the personal, throwaway route across that map for one learner with one goal and one starting point. The tutor keeps a running estimate of what each learner knows, recomputes the "ready to learn" set as they progress, and serves the next reachable, needed, right-sized step — reviewing shaky earlier concepts along the way. That loop is the difference between a chatbot that answers questions and a tutor that actually drives someone from "I know nothing" to "I can build this."

Key takeaways
  • A knowledge graph maps a subject into small concepts plus prerequisite arrows; it turns "what to teach next" from guesswork into a principled choice.
  • Trace failures backward along the arrows to the root-cause prerequisite — drilling the failing topic directly rarely fixes a missing foundation.
  • Curriculum generation = find the goal's concepts, subtract what the learner already knows, order the rest so nothing precedes its prerequisites, then pace it to fit the time and working-memory limits.
  • Always serve the next step inside the learner's Zone of Proximal Development: reachable in the graph, still needed, and a small stretch — not a leap.
  • Let the LLM draft the graph, the path, and the lessons (ideally grounded in real sources), but a human expert must verify the prerequisites and facts before any of it reaches a learner.
