Bottlenecks and Constraints: The Theory of Constraints

By Pritesh Yadav · 10 min read

Imagine you spend a whole weekend making one part of your work faster — and on Monday nothing improves. The team still misses deadlines. Things still pile up. You worked hard, you measured a real improvement, and yet the system as a whole behaved exactly as before. How is that possible?

The answer is one of the most useful ideas in all of systems thinking: every system has a constraint, and almost all improvement effort is wasted unless it lands on that one spot. This chapter is about finding that spot and knowing what to do once you have found it.

What a constraint is

Let's define two words in plain English first.

Constraint
The single factor that limits how much a whole system can produce. It is the "weakest link." No matter how good everything else is, the total cannot exceed what this one factor allows.
Bottleneck
A constraint that is actively choking the flow right now — like the narrow neck of a bottle that limits how fast liquid pours out. People often use "bottleneck" and "constraint" to mean the same thing.

This idea was turned into a full method by Eliyahu M. Goldratt, a physicist, in his 1984 business novel The Goal (co-written with Jeff Cox). He called it the Theory of Constraints (TOC). TIME magazine later named the book one of the 25 most influential management books. When someone once asked Goldratt to sum up his whole method in one word, he answered: "FOCUS." That is the heart of it — pour your energy into the one place that matters, not everywhere at once.

Analogy: A chain made of ten links can hold only as much weight as its weakest link. If nine links are forged from titanium and one is soft iron, the chain still snaps at the iron link. Every hour spent strengthening the strong links is wasted. The whole game is finding and fixing the weak link.
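
The chain analogy reduces to a single `min`. The sketch below is only an illustration; the link strengths are made-up numbers, and `chain_strength` is a hypothetical helper name.

```python
def chain_strength(links):
    """A chain holds only as much weight as its weakest link."""
    return min(links)

# Hypothetical strengths in kg: nine strong links and one weak one.
links = [900] * 9 + [100]
baseline = chain_strength(links)

# Double the strength of one strong link: the chain is no stronger.
stronger_strong = [1800] + links[1:]
# Double the strength of the weak link instead: the whole chain improves.
stronger_weak = links[:9] + [200]

print(baseline, chain_strength(stronger_strong), chain_strength(stronger_weak))
# prints: 100 100 200
```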

The counterintuitive heart of TOC

Here is the part that surprises almost everyone the first time: improving a non-constraint can make the system worse.

Suppose a process has five steps in a row. Step 3 is the slow one. If you speed up Step 2 (which comes before the slow step), more work now arrives at Step 3 — but Step 3 still can only handle what it always could. So the extra work just piles up in front of it. You have not increased output. You have increased the pile, the chaos, and the cost of storing half-finished work.

That pile has a name: WIP (Work-In-Progress) — partly finished work waiting in the system. In a factory it is physical inventory on the floor. In software it is code written but not yet deployed. In a hospital it is patients waiting in chairs. As we saw with feedback loops in earlier chapters, pushing harder on the wrong part of a system often amplifies a problem instead of solving it.
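
The five-step example can be sketched as a tiny hour-by-hour simulation. This is a simplified model, not a description of any real plant: the capacities are invented, work moves through every step within the same hour, and it assumes that once Step 2 gets faster, new work is released faster to keep it busy.

```python
def simulate(capacities, release_per_hour, hours):
    """Push work through sequential steps.

    Returns (finished units, queue waiting in front of each step).
    """
    queues = [0] * len(capacities)
    finished = 0
    for _ in range(hours):
        queues[0] += release_per_hour          # new work enters the line
        for i, cap in enumerate(capacities):
            done = min(queues[i], cap)         # a step cannot exceed its capacity
            queues[i] -= done
            if i + 1 < len(capacities):
                queues[i + 1] += done          # hand off to the next step
            else:
                finished += done               # last step: work leaves the system
    return finished, queues

# Hypothetical units per hour for Steps 1..5; Step 3 is the slow one.
base = simulate([10, 8, 5, 9, 10], release_per_hour=8, hours=40)
# Speed up Step 2 and release more work to keep it busy (assumption).
fast = simulate([10, 10, 5, 9, 10], release_per_hour=10, hours=40)

print(base)  # (200, [0, 0, 120, 0, 0])
print(fast)  # (200, [0, 0, 200, 0, 0])
```

Output is 200 units in both runs. The only thing that changed is the pile in front of Step 3, which grew from 120 to 200.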

Common mistake: Speeding up a non-constraint and calling it progress. Goldratt put it bluntly: "Any improvement made anywhere besides the bottleneck is an illusion." A team that codes faster while the deploy step is the bottleneck just grows a bigger queue of undeployed code.

Analogy: A garden hose with a kink. Water pressure at the tap is high, but the kink decides how much reaches the nozzle. Buying a bigger pump (improving a non-constraint) just builds more pressure behind the kink. The only useful move is to unkink the hose.

The Herbie hike — the most famous lesson

In The Goal, the hero Alex Rogo takes his son's Boy Scout troop on a hike. The goal is for the whole troop to reach camp together. But fast scouts sprint ahead and slow scouts fall behind, so the line stretches out over a quarter mile.

Alex realizes the troop's real speed — the rate the group covers ground — is set entirely by Herbie, the slowest scout. Herbie is the constraint. So Alex does two things:

  1. Lighten Herbie's pack. Herbie is carrying the heaviest load — canned food and iron cookware. They redistribute that weight to stronger scouts so Herbie can walk as fast as he possibly can. This is getting the most out of the constraint.
  2. Move Herbie to the front. Now nobody can outrun him, so the line stops stretching and bunching. Everyone moves at one steady pace. This is making the whole system serve the constraint.

The troop arrives together, on time. Note the trap Alex avoided: putting the fastest scout at the front would only create bigger gaps, with fast scouts racing ahead and slow ones falling behind. Those gaps are WIP made visible.

Common mistake: Reading the Herbie story as "slow the fast people down." The real lesson is the opposite: let the constraint go as fast as it possibly can (lightened and supported), then align everyone else to that pace so the wasteful gaps disappear.

The Five Focusing Steps

TOC gives a simple, repeatable recipe — sometimes called POOGI, the Process of On-Going Improvement:

  1. Identify the constraint. Walk the process and look for where work piles up, where downstream people sit idle waiting, where upstream is overloaded.
  2. Exploit it. Squeeze the most out of the constraint with what you already own: remove its downtime, give it good-quality inputs, and make sure it is never starved or idle. Most constraints run far below their true capacity.
  3. Subordinate everything else to it. Re-pace the whole system to the constraint's speed. Non-constraints should run only fast enough to keep the constraint fed — no faster.
  4. Elevate it. Only now, if you still need more, spend money: add a machine, hire, add a shift, re-architect.
  5. Repeat. Once the constraint is broken, go back to Step 1 — because the constraint has moved. Watch out for inertia (old habits built around the old constraint).

Key takeaway: Exploit and Subordinate come before Elevate. Most organizations skip straight to "buy more capacity," spending money on a constraint that was never running at full speed. By TOC's estimate, around 30% of extra capacity is hidden in Step 2 and costs nothing to unlock.
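
Steps 1 and 2 can be sketched in a few lines. This is a minimal illustration, assuming each step has a rated speed and an availability fraction (the share of time it is actually working). The `Step` class, the step names, and every number are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    rated_per_hour: float   # output if the step never stopped
    availability: float     # fraction of time it is actually running

    @property
    def effective_per_hour(self):
        return self.rated_per_hour * self.availability

def find_constraint(steps):
    """Identify: the step with the lowest effective rate caps the whole line."""
    return min(steps, key=lambda s: s.effective_per_hour)

line = [
    Step("cut", 40, 0.9),
    Step("weld", 25, 0.7),    # rated 25 parts/hr, but idle 30% of the time
    Step("paint", 30, 0.9),
]

before = find_constraint(line)
rate_before = before.effective_per_hour

# Exploit: remove downtime at the constraint using what is already owned.
before.availability = 1.0

after = find_constraint(line)
rate_after = after.effective_per_hour

print(before.name, round(rate_before, 1))  # weld 17.5
print(after.name, round(rate_after, 1))    # weld 25.0
```

In this made-up line, exploiting the welder lifts throughput from 17.5 to 25 parts per hour without buying anything. Only if more than 25 were needed would Elevate (a second welder) be worth its cost.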

Drum-Buffer-Rope: pacing a system

For factories, TOC packages Steps 2 and 3 into a scheduling method called Drum-Buffer-Rope (DBR):

  • Drum — the constraint sets the beat for the whole line, like a drummer setting marching pace.
  • Buffer — a small cushion of work placed right before the constraint, so it is never left idle waiting for inputs.
  • Rope — a signal that releases new material into the line only as fast as the constraint can consume it, so WIP never floods upstream.

 raw --ROPE (release rate)--> [S1] -> [S2] -> [BUFFER] -> [DRUM] -> [S4] -> [S5] -> out
  ^                                                         |
  +----------- release only as fast as DRUM eats -----------+
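
To see what the rope buys, compare a line that releases work as fast as the first step can start it ("push") with one that releases only what the drum has consumed. This is a simplified sketch, not a full DBR scheduler: the rope is modeled as a cap on the work sitting between the release point and the drum, and the capacities, `buffer_target`, and function name are all assumptions.

```python
def run_line(capacities, drum_index, hours, policy, buffer_target=10):
    """Simulate a serial line hour by hour.

    Returns (finished units, total WIP left in the line).
    """
    queues = [0] * len(capacities)       # work waiting in front of each step
    finished = 0
    drum_rate = capacities[drum_index]
    for _ in range(hours):
        if policy == "push":
            # Release as much as the first step can start, ignoring the drum.
            release = capacities[0]
        else:
            # Rope: top up the work between the gate and the drum to a fixed
            # cap, so in steady state release equals what the drum consumed.
            upstream_wip = sum(queues[: drum_index + 1])
            release = max(0, buffer_target + drum_rate - upstream_wip)
        queues[0] += release
        for i, cap in enumerate(capacities):
            done = min(queues[i], cap)
            queues[i] -= done
            if i + 1 < len(capacities):
                queues[i + 1] += done
            else:
                finished += done
    return finished, sum(queues)

capacities = [10, 8, 5, 9, 10]           # S1, S2, DRUM, S4, S5 (units/hour)
push = run_line(capacities, drum_index=2, hours=40, policy="push")
rope = run_line(capacities, drum_index=2, hours=40, policy="rope")

print("push:", push)  # push: (200, 200)
print("rope:", rope)  # rope: (200, 10)
```

Both policies finish 200 units, because the drum sets the pace either way. The rope version ends with 10 units of WIP, all of it in the buffer in front of the drum, instead of 200 scattered upstream.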

Constraints come in four flavors

Type                 | What it is                                        | Example
---------------------|---------------------------------------------------|----------------------------------------------
Physical / Equipment | A machine, server, or road lane                   | A machine that does 25 parts/hr
People / Human       | A scarce expert everyone needs                    | The one engineer who knows everything
Policy               | A rule that caps output even when capacity exists | "All orders over $500 need manager sign-off"
Market               | Demand is the limit; you have spare capacity      | Not enough orders to fill the plant

Common mistake: Missing policy constraints. Goldratt argued these are the most common and most damaging in mature organizations, precisely because the limiting rule looks like sensible practice from the inside. A slow machine is obvious; a batch-size rule or an approval chain is invisible.

The same pattern in software and DevOps

The Phoenix Project (Gene Kim and co-authors, 2013) is openly "the modern version of The Goal" for IT. Its constraint is a person: Brent, the one engineer whose knowledge every incident, deployment, and project somehow needs. Work piles up in front of Brent exactly like inventory before a machine.

The fix follows the five steps: identify Brent; exploit (route requests through a ticket queue so he is not constantly interrupted); subordinate (make others document every solution Brent gives); elevate (cross-train others to hold his knowledge). The real company Slido hit this exact pattern — and when they fixed it, the constraint promptly shifted to code-review approvals.

Example: By 2011, Amazon was deploying code roughly once every 11.6 seconds. The old constraint was the slow, manual, batched deployment step. Treating deployment as the constraint and applying exploit/subordinate/elevate (automation, mandatory pipeline, continuous-delivery investment) is essentially how DevOps was born. Kim's rule: until code is in production it is not throughput — it is WIP "stuck in the system."

The bottleneck always moves

Step 5 exists because of the shifting bottleneck: fix one constraint and the next-weakest link instantly becomes the new constraint. A bottling line where the palletizer was the limit gets a second palletizer — and now the labeling station (quietly slower all along) is the constraint. Improvement is not a project with an end date; it is a cycle.
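
The bottling-line example can be checked with a short sketch. The station rates below are invented for illustration; only the pattern (palletizer first, then labeler) comes from the example above.

```python
def constraint(capacities):
    """Return (name, rate) of the slowest station, which caps line throughput."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]

# Hypothetical rates in cases per hour.
line = {"filler": 120, "labeler": 100, "palletizer": 80}
before = constraint(line)

# Elevate: a second palletizer doubles that station's capacity.
line["palletizer"] *= 2
after = constraint(line)

print(before)  # ('palletizer', 80)
print(after)   # ('labeler', 100)
```

Throughput rises from 80 to 100, not to 160: the gain stops at whatever the next-slowest station allows.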

Donella Meadows described the same thing in Thinking in Systems, calling it a shift in the limiting factor: "growth itself depletes or enhances limits, and therefore changes what's limiting." A city fixes its schools, grows, and then housing affordability becomes the new ceiling. Peter Senge captured it as the limits to growth archetype: a growth loop tied to a constraint loop — and the leverage is always on the constraint side, never on pushing the growth side harder.

Tip: Knowing the constraint will move is good news, not bad. It tells you exactly where to look next. The system does get better — it just meets a new ceiling each time it grows past the old one.

Why the wrong metrics fool you

Traditional efficiency numbers actively mislead here. A non-constraint running at 95% utilization looks productive — but it may just be manufacturing harmful WIP. The constraint running at 60% looks wasteful — but it is the actual problem to fix. Utilization measures local performance; what matters is global throughput.

Example: A patient's ER journey: registration 2 min, triage 5, wait for doctor 45, exam 20, labs 30, discharge 5, for 107 minutes in total. Hiring a faster registration clerk saves 1 minute of 107, a 0.9% gain. Hiring one more doctor cuts the 45-minute wait to about 22, saving 23 minutes, a gain of roughly 21%. Each is a single hire, yet the results are wildly different, because one targets the constraint and one does not.
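
The arithmetic behind the ER example is easy to verify. The stage times come from the example; the helper name `gain` is just for illustration.

```python
# Minutes per stage of the ER journey, taken from the example.
stages = {
    "registration": 2,
    "triage": 5,
    "wait for doctor": 45,
    "exam": 20,
    "labs": 30,
    "discharge": 5,
}
total = sum(stages.values())

def gain(minutes_saved):
    """Percentage of the total journey removed by a given saving."""
    return 100 * minutes_saved / total

clerk = gain(1)          # faster registration: 2 min -> 1 min
doctor = gain(45 - 22)   # extra doctor: 45 min wait -> about 22 min

print(f"{total} min total; clerk {clerk:.1f}%, doctor {doctor:.1f}%")
# prints: 107 min total; clerk 0.9%, doctor 21.5%
```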

TOC even redefines the financial measures around throughput. The three operational measures are: Throughput (the rate the system makes money through sales), Inventory (money tied up in things to be sold), and Operating Expense (money spent turning inventory into throughput). The goal: lift throughput while reducing the other two.
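
The reason for that goal is clearer once the three measures are combined. In the throughput-accounting relations commonly used with TOC (the symbols T, I, and OE are shorthand introduced here for Throughput, Inventory, and Operating Expense):

```latex
\begin{aligned}
\text{Net profit} &= T - OE \\
\text{Return on investment} &= \frac{T - OE}{I}
\end{aligned}
```

Raising T improves both figures. Cutting OE or I helps too, but neither can be cut below zero, which is one way to read the emphasis on throughput.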

Where TOC has limits

TOC is powerful but not magic. Some operations researchers find that DBR produces less-than-optimal schedules compared with full simulation-based scheduling. TOC assumes one constraint dominates, yet real systems sometimes have two near-equal ones. Its measures were built for factories and need careful translation for knowledge work: counting story points or lines of code, for instance, measures activity at a non-constraint rather than throughput. And it borrows heavily from earlier thinkers like Deming and Forrester. None of this breaks the core idea; it just means you should apply it thoughtfully.

Key Takeaways

  • Every system has at least one constraint; total throughput can never exceed what that one constraint allows.
  • Improving anything that is not the constraint does not raise output — it just grows the pile of WIP and the chaos.
  • Follow the Five Focusing Steps in order: Identify, Exploit, Subordinate, then Elevate, then Repeat. Don't skip to spending money.
  • WIP pile-ups are the free, visible signal of where the constraint is — just walk the process and look for the queue.
  • Fix one constraint and the bottleneck moves; improvement is a continuous cycle, not a one-time project (Goldratt's Step 5 = Meadows' shifting limiting factor).
  • Policy and human constraints (Brent, an approval rule) are the hardest to see and often the most damaging in mature organizations.
