System Design
A 20-module, staff-level curriculum: from back-of-the-envelope estimation to consensus, consistency models, and real-world case studies.
21 posts · Engineering
- 1
System Design — Complete Learning Curriculum
This is a 20-lesson study plan for software developers who want to get very good at designing large, reliable computer systems — the kind that handle millions…
- 2
Foundations & Back-of-the-Envelope Estimation
How engineers estimate whether a system can handle a given number of users — before they build anything.
- 3
Networking & Protocols
How data travels between a user's browser and a server — and why some websites feel instant while others feel slow.
- 4
API Design & Service Communication
How different parts of a software system talk to each other — and how to design those conversations so they are reliable, easy to change, and hard to break.
- 5
Database Internals: Storage Engines, Indexes, Transactions
How databases physically store data, why some databases are better at reads and others at writes, how indexes speed up (and sometimes slow down) queries, and how…
- 6
05 — Data Modeling: SQL vs NoSQL & Polyglot Persistence
How to decide where and how to store data in a software system, and when a traditional relational (SQL) database is the right choice versus a "NoSQL" database.
- 7
06 — Caching Deep Dive
Caching — a technique where your software saves a copy of information in a very fast place so it does not have to look it up from the slow database every…
- 8
07 · Load Balancing, Scaling & Stateless Design
How a website that runs on one computer grows into a fleet of many computers that can handle big crowds without breaking.
- 9
08 — Replication & Partitioning (Sharding)
Two big ideas that every large website relies on: keeping copies of your data on multiple computers (so the site stays up if one breaks), and splitting a huge…
- 10
09 — CAP, PACELC & Consistency Models
A fundamental problem every database faces: when computers storing the same data are spread across different locations, what happens when the connection…
- 11
10 — Consensus & Distributed Coordination
How a group of computers can agree on one correct answer — even when some of them crash, slow down, or lose messages.
- 12
Module 11 — Distributed Transactions, Sagas & Idempotency
What happens when a software system needs to save data in two different places at the same time — and why that is much harder than it sounds.
- 13
12 · Messaging, Queues & Event-Driven Architecture
How software systems pass work between their parts without making everything wait for everything else.
- 14
13 — Event Sourcing & CQRS
Two related ideas for storing data in software: "Event Sourcing" (saving a history of everything that happened instead of just the current state) and "CQRS"…
- 15
14 · Stream Processing & Real-Time Systems
How computer systems can react to information the moment it arrives — instead of waiting until the end of the day to run a report.
- 16
15 — Probabilistic Data Structures & Algorithms at Scale
Clever shortcuts that let big software systems work fast without using too much memory. Instead of tracking every exact detail, these tools keep a tiny, rough…
- 17
16 — Rate Limiting, Resiliency & Fault Tolerance
Two things every online system needs: a way to control how much traffic comes in, and a way to stay standing when something inside the system breaks.
- 18
17 · Observability, SRE & Operating Systems at Scale
How software teams know when their systems are healthy, how they find problems when something goes wrong, and how they release updates without breaking things…
- 19
Module 18 — Architecture Patterns: Monolith → Microservices → Cells
How to decide whether to split a software system into smaller, separate pieces — and how to do it safely if you do.
- 20
Module 19 — Specialized Systems: Search, Geospatial, Time-Series & Analytics
Why a standard database like PostgreSQL cannot handle every type of question efficiently, and which specialist tool to use instead.
- 21
20 — Case Studies & the System Design Interview Framework
This document is a study guide for answering "design a big system" questions in a software engineering interview.