The Big Picture: Why Systems Fundamentals Are Durable

By Pritesh Yadav

Welcome. Before we touch a single database or network protocol, let's answer the most important question: why learn this stuff at all, and why will it still be useful in twenty years? This section is the map for the whole guide. Everything that follows hangs off the ideas here.

Two terms to define first: "data engineering" and "systems fundamentals"

Data engineering is the job of moving, storing, shaping, and serving data reliably and at scale. "At scale" means it keeps working when the amount of data, or the number of users, gets very large. A data engineer builds the pipes and storage that other people use: the analysts who make charts, the apps customers click on, the machine-learning models that need to be fed. Think of it as plumbing for information.

Systems fundamentals are the layer underneath that work: how a computer actually executes a task. How bytes (a byte is 8 bits, the smallest unit of data a computer normally handles) move through the CPU, the memory, the disk, and the network. How separate machines and separate threads of work coordinate without stepping on each other.

The thesis of this entire guide is simple: you cannot build good data systems without understanding the machine underneath them. Every design choice — batch or stream? add an index or scan the whole table? copy the data or split it across machines? use a lock or a queue? — is, at bottom, a bet about a physical resource and how much that resource costs. If you don't understand the resource, you're guessing.

Key takeaway: Data engineering is the what (move and serve data). Systems fundamentals are the why and how (what the machine is physically doing). The second is the foundation that makes the first reliable.

The four pillars (and why a database is secretly all four)

Almost everything in computer systems rests on four load-bearing topics. Picture them as four faces of one single problem: "get the right data to the right place, correctly, fast enough."

  • Databases — the storage pillar. How data is saved on durable media (storage that survives a power cut), indexed (organized so you can find things fast), queried, and kept correct. This covers things like B-trees and LSM-trees (two ways of organizing data on disk — covered later), transactions, and indexes.
  • Distributed systems — the scale + failure pillar. What you do when one machine is not enough: copy data to several machines (replication), split data across machines (partitioning or sharding), get machines to agree (consensus), and survive machines crashing.
  • Networking — the movement pillar. How bytes physically travel between machines: packets, the TCP/IP protocols, the hard speed limit set by the speed of light, and the difference between bandwidth and latency (defined below).
  • Concurrency — the coordination pillar. Doing many things at once on one machine without corrupting your data: threads, locks, race conditions, and async (asynchronous) input/output.

Here's the punchline that ties them together: a distributed database is all four pillars at once. It is a database (pillar 1), copied over a network (pillar 3), accessed by many users concurrently (pillar 4), spread across many machines (pillar 2). Learn the four pillars and you can reason about the hardest systems out there.

Analogy: Think of one click on a website as a parcel's journey. It travels down a road (networking), is handled by busy workers sorting many parcels at once (concurrency), arrives at a warehouse with branches in several cities (distributed systems), and is finally placed on a physical shelf (the database / storage). One click touches all four.

Why these skills barely change in 30 years

Programming languages, cloud providers, and "framework of the month" tools churn constantly. Yet the gap between the fastest and slowest places a computer can store data has stayed roughly the same for decades. Why? Because fundamentals are bounded by physics and information theory, not by fashion.

The speed of light is a hard floor. The cost of a "cache miss" (asking for data and finding it isn't in the fast nearby memory, so you must fetch it from somewhere slower) is a hard floor. The impossibility of instantly agreeing across an unreliable network is a hard floor. Nobody can buy their way past these with money or a newer library.

A famous reference table of latency numbers (we'll see it below) was popularized around 2012. In 2026 it is still essentially correct. Only disks and networks got faster, and only by a constant factor — not by changing category. So time you spend understanding why a cache miss is expensive pays off for your whole career. Time spent memorizing one specific tool's exact function names depreciates within a couple of years.

Common mistake: Believing AI and large language models (LLMs) make fundamentals obsolete. The opposite is true. LLMs are great at the churny surface (boilerplate code, glue between APIs) and weakest at deep system reasoning — figuring out why something is slow or wrong. An AI will happily write code that is O(n²) (gets dramatically slower as input grows) or makes a network call inside a loop. Only someone with the mental model catches it. Fundamentals are how you supervise AI output instead of being misled by it. And modern machine learning is itself a brutal systems problem — its bottleneck is usually data movement and I/O (input/output: reading and writing data), not the math.

The mental model: a computer as a stack of layers

The cleanest way to picture a computer is as a stack you pass through to get work done. Each layer up adds abstraction (hiding messy detail) and, usually, more delay.

+-----------------------------------------------+
|  DATA      tables, files, messages, objects   |  what we care about
+-----------------------------------------------+
|  NETWORK   sockets, TCP/IP, other machines    |  slowest hops live here
+-----------------------------------------------+
|  PROCESS   your program: threads, heap, stack |  concurrency lives here
+-----------------------------------------------+
|  OS        scheduler, virtual memory, syscalls|  the referee
+-----------------------------------------------+
|  HARDWARE  CPU, caches, RAM, SSD, network card|  physics lives here
+-----------------------------------------------+
        (reaching DOWN or ACROSS = slower)

Three ideas to lock in:

  • The OS (operating system) is a referee. It multiplexes scarce hardware — meaning it shares one CPU and one pool of memory (RAM) among many running programs, taking turns so fast it looks simultaneous, while stopping programs from corrupting each other.
  • A syscall (system call) is a controlled trip down to the OS, for example "open this file" or "send this on the network." It typically costs from a few hundred nanoseconds to a microsecond or more (a microsecond, written µs, is one millionth of a second). That's cheap compared to disk, but expensive compared to a plain function call inside your program.
  • Crossing a layer boundary costs time. The further down or across you reach, the slower it gets. Reading a variable in your program touches just the CPU and RAM. Reading a row from a database on another continent touches every layer, including the network. Most performance bugs are simply "we accidentally reached across an expensive boundary inside a loop."
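
The "expensive boundary inside a loop" idea can be made concrete by counting trips into the OS. The sketch below is illustrative only: it writes a 4 KB scratch file, then reads it back with raw os.read calls, once a byte at a time and once with a large buffer. The helper name read_in_chunks and the file contents are assumptions made for this example; it counts calls rather than timing them, because timings vary by machine.

```python
import os
import tempfile


def read_in_chunks(path: str, chunk_size: int) -> tuple[bytes, int]:
    """Read a whole file with raw os.read calls; return (data, number_of_calls)."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    parts = []
    try:
        while True:
            chunk = os.read(fd, chunk_size)  # each call is one trip into the OS
            calls += 1
            if not chunk:  # an empty result means end of file
                break
            parts.append(chunk)
    finally:
        os.close(fd)
    return b"".join(parts), calls


# Write a small 4 KB scratch file to read back (hypothetical test data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
    scratch_path = tmp.name

try:
    data_small, calls_small = read_in_chunks(scratch_path, 1)      # one byte per call
    data_large, calls_large = read_in_chunks(scratch_path, 65536)  # large buffer per call
finally:
    os.remove(scratch_path)

print(f"1-byte reads: {calls_small} calls; 64 KB reads: {calls_large} calls")
```

Both runs return the same bytes; the only difference is how many times the loop crossed the process/OS boundary. Buffered file objects exist largely to do this batching for you.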

The latency numbers every engineer should know

Latency means "how long until I get an answer" — the delay before a single operation completes. Below are the canonical numbers (from the Jeff Dean / Peter Norvig lineage). Do not memorize the exact figures. Memorize the orders of magnitude — the powers of ten and the ratios. (One nanosecond, ns, is one billionth of a second; a millisecond, ms, is one thousandth.)

Operation                                    | Approx. time | Relative to L1
---------------------------------------------+--------------+----------------
L1 cache read (tiny on-chip memory)          | 0.5 ns       | 1× (baseline)
Branch mispredict                            | 5 ns         | ~10×
L2 cache read                                | 7 ns         | ~14×
Mutex lock/unlock                            | 25 ns        | ~50×
Main memory (RAM) read                       | 100 ns       | ~200×
Send 1 KB over 1 Gbps network                | ~10 µs       | ~20,000×
Random 4 KB read, SSD (NVMe today ~10–70 µs) | ~150 µs*     | ~300,000×
Round trip, same datacenter / cross-AZ       | ~0.5 ms      | ~1,000,000×
HDD (spinning disk) seek                     | ~10 ms       | ~20,000,000×
Round trip, Virginia ↔ Ireland (AWS)         | ~68 ms       | ~130,000,000×
Round trip, California ↔ Netherlands         | ~150 ms      | ~300,000,000×

*The 150 µs is the 2012 figure for SATA SSD; modern NVMe SSDs do random 4 KB reads in roughly 10–70 µs. The shape of the table is unchanged.

The shape that matters: each step down the storage hierarchy is roughly one to three orders of magnitude (10–1,000×) slower than the one above it. RAM is ~200× slower than L1 cache. SSD is ~1,000× slower than RAM. A spinning disk seek and an intercontinental round trip are tens to hundreds of millions of times slower than L1.
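
To check the ratios rather than take them on faith, here is a small sketch that recomputes them from the approximate figures in the latency table. The dictionary keys and function names are made up for this example; only the numbers come from the table.

```python
import math

# Approximate latencies in nanoseconds, copied from the latency table.
LATENCY_NS = {
    "l1": 0.5,
    "ram": 100,
    "ssd": 150_000,                 # 2012 SATA figure; NVMe is faster
    "datacenter_rtt": 500_000,
    "hdd_seek": 10_000_000,
    "intercontinental_rtt": 150_000_000,
}


def slowdown(slower: str, faster: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def orders_of_magnitude(slower: str, faster: str) -> int:
    """The same ratio, rounded to the nearest power of ten."""
    return round(math.log10(slowdown(slower, faster)))


for slower, faster in [("ram", "l1"), ("ssd", "ram"), ("hdd_seek", "l1")]:
    print(
        f"{slower} vs {faster}: ~{slowdown(slower, faster):,.0f}x, "
        f"about 10^{orders_of_magnitude(slower, faster)}"
    )
```

The power-of-ten column is the part worth remembering; the exact multipliers shift with hardware generations.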

Analogy: Multiply every time by a billion so that one nanosecond (a few CPU cycles) feels like one second. Then: L1 cache = 0.5 seconds. RAM = under 2 minutes. An SSD random read = ~1.7 days. A same-datacenter round trip = ~6 days. A disk seek = ~4 months. A California-to-Netherlands round trip = ~4.8 years. Suddenly "never call the network inside a loop" feels obvious: you'd never run an errand that takes years, a hundred times in a row.
Analogy: The storage hierarchy as a library. A fact in your head = registers/L1 (instant). A book open on your desk = RAM (a quick glance). A book on a shelf across the room = SSD (get up and walk). A book in another building = a spinning disk (drive across town). A book mailed from another country = a cross-region network call. Every step is dramatically slower than the last.
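
The "multiply by a billion" trick is simple enough to script. This is a minimal sketch; the function name human_scale and the 30-day month and 365-day year are simplifying assumptions for the example.

```python
def human_scale(latency_ns: float) -> str:
    """Scale a latency by one billion (1 ns becomes 1 s) and name it in everyday units."""
    seconds = latency_ns  # nanoseconds times 1e9 gives the same number of seconds
    units = [
        ("years", 365 * 86_400),   # assumption: 365-day year
        ("months", 30 * 86_400),   # assumption: 30-day month
        ("days", 86_400),
        ("hours", 3_600),
        ("minutes", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} seconds"


for label, ns in [
    ("L1 cache read", 0.5),
    ("RAM read", 100),
    ("SSD random read", 150_000),
    ("Same-datacenter round trip", 500_000),
    ("HDD seek", 10_000_000),
    ("California-Netherlands round trip", 150_000_000),
]:
    print(f"{label}: {human_scale(ns)}")
```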

Bandwidth is not latency (the two-budget rule)

This trips up nearly everyone. Bandwidth is how much data you can move per second (megabytes per second). Latency is how long one round trip takes. They are two separate budgets.

You can buy bandwidth: add more or fatter pipes. You cannot buy latency below the speed of light. Light in fiber-optic cable travels at about two-thirds of its speed in a vacuum, which works out to roughly 4.9 µs per kilometer one-way. So a 5,500 km link (Virginia to Ireland) can never beat about 27 ms one-way, ~54 ms for the round trip. The measured AWS round trip is ~68 ms, i.e. physics plus a little routing overhead. No upgrade and no amount of money removes that 54 ms from a request that has to cross the ocean and come back; a CDN helps only by serving the request from somewhere closer.
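
The arithmetic above fits in a few lines. This sketch uses the 4.9 µs/km figure and the 5,500 km distance from the text; the function names are invented for illustration, and real cable routes are longer than the straight-line distance.

```python
FIBER_DELAY_US_PER_KM = 4.9  # one-way delay of light in fiber, from the text


def min_one_way_ms(distance_km: float) -> float:
    """Lower bound on one-way latency over fiber, in milliseconds."""
    return distance_km * FIBER_DELAY_US_PER_KM / 1000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip latency: there and back again."""
    return 2 * min_one_way_ms(distance_km)


distance_km = 5_500      # Virginia to Ireland, as quoted above
measured_rtt_ms = 68     # measured AWS round trip, as quoted above

floor_ms = min_round_trip_ms(distance_km)
print(f"physics floor: {floor_ms:.1f} ms round trip")
print(f"overhead above the floor: {measured_rtt_ms - floor_ms:.1f} ms")
```

Running a number like this before a design meeting tells you quickly whether a latency target is even physically reachable from a given region.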

Analogy: A cargo ship versus a sports car. The ship has huge bandwidth (carries enormous cargo) but terrible latency (slow to arrive). The car carries little but arrives fast. Driving a truck full of SSDs across the country can beat the internet on bandwidth, but it's hopeless on latency. Pick the right tool for the budget that's actually tight.
Best practice: To cut latency, cut distance — use edge servers, CDNs, or regional copies of your data closer to users. Buying a faster pipe only helps bandwidth. And to avoid latency pain, reduce the number of round trips: batch and pipeline requests instead of making fifty chatty back-and-forth calls.
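
A rough way to see the two budgets side by side is a toy cost model: total time is about (number of round trips × round-trip time) plus (bytes ÷ bandwidth). The model below is a deliberate simplification (sequential calls, no protocol overhead), and the function name and the payload sizes are assumptions for the example.

```python
def total_time_ms(
    round_trips: int,
    rtt_ms: float,
    payload_bytes: int,
    bandwidth_bytes_per_s: float,
) -> float:
    """Toy model: latency cost of the round trips plus transfer cost of the payload."""
    latency_cost_ms = round_trips * rtt_ms
    transfer_cost_ms = payload_bytes / bandwidth_bytes_per_s * 1000.0
    return latency_cost_ms + transfer_cost_ms


RTT_MS = 68                          # Virginia <-> Ireland round trip, from the table
ONE_GBPS_IN_BYTES = 125_000_000      # 1 Gbps expressed in bytes per second
PAYLOAD = 100_000                    # assumption: 100 KB total, however it is split

chatty_ms = total_time_ms(50, RTT_MS, PAYLOAD, ONE_GBPS_IN_BYTES)   # fifty small calls
batched_ms = total_time_ms(1, RTT_MS, PAYLOAD, ONE_GBPS_IN_BYTES)   # one batched call

print(f"chatty: {chatty_ms:.1f} ms, batched: {batched_ms:.1f} ms")
```

In this model the transfer term is under a millisecond either way; nearly all the time is round trips, which is why a fatter pipe would not help here.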

The five resources, and the one skill that matters most

Every system runs against a fixed budget of physical resources. Engineering is deciding which to spend and which to conserve.

  • CPU — compute cycles. Tight when you are "CPU-bound" (hashing, compression, parsing, ML inference). Add cores — but then you need concurrency (pillar 4) to use them.
  • Memory (RAM) — fast, but small, expensive, and volatile (it's wiped when power is lost). Tight when your "working set" (the data you're actively using) doesn't fit. The penalty for overflowing is spilling to disk — a 1000×+ cliff.
  • Disk / storage — durable, large, cheap, slow. Tight on capacity and especially on IOPS (random operations per second). Sequential access (reading data laid out in order) is far cheaper than random access (jumping around). This is the root reason databases love append-only logs and big ordered scans.
  • Network — two budgets: bandwidth (scalable) and latency (physics-bound).
  • Time — the meta-constraint, the one users actually feel. A latency budget like "respond within 200 ms" forces every other trade-off.
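
The sequential-versus-random gap from the disk bullet can be put into rough numbers for a spinning disk. This is a back-of-the-envelope sketch: the 10 ms seek comes from the latency table, while the 150 MB/s sequential transfer rate is an assumed round figure, not a measurement, and the function names are invented for the example.

```python
import math

HDD_SEEK_S = 0.010                      # ~10 ms per seek, from the latency table
ASSUMED_THROUGHPUT_BPS = 150_000_000    # assumption: ~150 MB/s sequential transfer


def random_read_seconds(total_bytes: int, chunk_bytes: int = 4096) -> float:
    """Every chunk pays a seek, then the bytes still have to be transferred."""
    reads = math.ceil(total_bytes / chunk_bytes)
    return reads * HDD_SEEK_S + total_bytes / ASSUMED_THROUGHPUT_BPS


def sequential_read_seconds(total_bytes: int) -> float:
    """One seek to the start, then a straight transfer."""
    return HDD_SEEK_S + total_bytes / ASSUMED_THROUGHPUT_BPS


total = 100 * 1024 * 1024  # 100 MiB
random_s = random_read_seconds(total)
sequential_s = sequential_read_seconds(total)
print(f"random 4 KB reads: {random_s:.0f} s; sequential scan: {sequential_s:.2f} s")
```

Under these assumptions the same 100 MiB takes minutes when read in random 4 KB pieces and under a second when scanned in order. SSDs narrow the gap considerably but do not erase it.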

The core craft is this: find the bottleneck resource first. Optimizing anything else buys nothing.

Example: An API endpoint loops over 100 items and makes one database call per item. Each call is a ~50 ms network round trip: 100 × 50 ms = 5 seconds, almost all of it spent waiting on the network. Doubling the CPU speed changes nothing, because the CPU is idle, waiting. The fix is to replace 100 calls with one batched query. This is the famous "N+1 query" bug (one query to fetch the list, then N more, one per item), and it's one of the most common performance problems in real systems.
Common mistake: Optimizing a resource that isn't the bottleneck. A request stuck waiting on a 50 ms database call is not made faster by a faster CPU. Measure first, then optimize the constraint you actually have.
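
The per-item loop and its batched fix can be sketched without a real database. FakeDatabase below is a hypothetical in-memory stand-in that counts round trips instead of sleeping; its method names (get_one, get_many) are invented for the example and do not belong to any real driver.

```python
from dataclasses import dataclass

ROUND_TRIP_MS = 50  # per-call network latency, from the example above


@dataclass
class FakeDatabase:
    """Hypothetical in-memory stand-in for a remote database."""
    rows: dict[int, str]
    round_trips: int = 0

    def get_one(self, item_id: int) -> str:
        self.round_trips += 1  # one network round trip per item
        return self.rows[item_id]

    def get_many(self, item_ids: list[int]) -> list[str]:
        self.round_trips += 1  # one round trip for the whole batch
        return [self.rows[i] for i in item_ids]


def fetch_one_by_one(db: FakeDatabase, item_ids: list[int]) -> list[str]:
    """The per-item pattern: a database call inside the loop."""
    return [db.get_one(i) for i in item_ids]


def fetch_batched(db: FakeDatabase, item_ids: list[int]) -> list[str]:
    """The fix: ask for everything in a single call."""
    return db.get_many(item_ids)


def modelled_wait_ms(db: FakeDatabase) -> int:
    """Time spent waiting on the network under the 50 ms assumption."""
    return db.round_trips * ROUND_TRIP_MS


item_ids = list(range(100))
rows = {i: f"item-{i}" for i in item_ids}

slow_db, fast_db = FakeDatabase(dict(rows)), FakeDatabase(dict(rows))
slow_result = fetch_one_by_one(slow_db, item_ids)
fast_result = fetch_batched(fast_db, item_ids)

print(f"loop: {slow_db.round_trips} trips, ~{modelled_wait_ms(slow_db)} ms waiting")
print(f"batch: {fast_db.round_trips} trip, ~{modelled_wait_ms(fast_db)} ms waiting")
```

Both functions return the same rows. In SQL the batched version usually takes the form of a single query with an IN (...) list or a join.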

How the rest of this guide is built

This section was the map; the four pillars are the territory. We move in this order: physics (the latency numbers and the layer stack you just met) → constraints (the five resources) → storage (databases) → coordination on one machine (concurrency) → coordination across machines (distributed systems) → the wire that connects them (networking). Every later chapter is one variation of the same question: how do we get good behavior out of a resource that is scarce, slow, or unreliable? The latency table is the cheat sheet you'll mentally consult in every one of them.

Key takeaways:
  • Fundamentals are durable because they're bounded by physics and information theory, not by changing frameworks: the latency table from 2012 is still essentially right in 2026.
  • Everything reduces to four pillars — databases, distributed systems, networking, concurrency — and a distributed database is all four at once.
  • Picture the machine as layers (hardware → OS → process → network → data); reaching down or across costs time, and most slow code reaches across an expensive boundary inside a loop.
  • Memorize orders of magnitude, not exact numbers: RAM ~200× L1, SSD ~1,000× RAM, the network ~millions× L1.
  • Bandwidth and latency are separate budgets: you can buy bandwidth, but you cannot buy latency below the speed-of-light floor. To cut latency, cut distance rather than widening the pipe.
  • The master skill is finding the bottleneck resource (CPU, memory, disk, network, time) before optimizing anything — and using these fundamentals to supervise, not trust, AI-generated systems.
