Databases III — Scaling Up: Replication, Partitioning & NoSQL
So far you have used one database on one server. For most apps, that is fine. But as an app grows, one server eventually struggles. This section explains why one server isn't enough, and the two main tricks to grow beyond it: replication (keep copies of the same data) and partitioning (split different data across machines). We'll also meet the NoSQL database family and learn about connection pooling, a small idea that quietly saves many apps from falling over.
The four ceilings of one server
A "ceiling" here means a limit you eventually hit. A single database server has four separate limits, and the most important lesson is that each one has a different fix. Naming them separately helps you choose the right tool.
- Read load — too many "read" queries per second. A read is a
SELECT: fetching data without changing it (showing a product page). Too many reads overwhelm the server's CPU and disk. (Easiest to fix.) - Write load — too many "write" queries per second. A write is an
INSERT,UPDATE, orDELETE: it changes the data. (Hardest to fix.) - Storage — the data grows bigger than one machine's disk can hold (or bigger than its memory can keep fast).
- Availability — "available" means up and reachable. One server is a single point of failure: if it dies, the whole app is down, and you can't even take it offline for an upgrade without downtime.
Two directions to grow: up vs out
There are two ways to give a database more power.
- Vertical scaling (scale up) — buy a bigger machine: more CPU, more memory, a faster disk. Simple, no code changes, no new complexity. But there is a hard physical ceiling (the biggest machine that exists), the cost grows faster than the power (a "twice as powerful" box costs much more than twice as much), and it's still a single point of failure.
- Horizontal scaling (scale out) — add more machines and spread the work. Effectively unlimited, and gives you spare copies for safety. But now machines must talk over a network and stay in agreement, which adds real complexity. Replication and sharding live here.
Replication — many copies of the same data
Replication means keeping the same dataset on several servers. It gives you more machines to serve reads, a backup ready to take over if one dies, and copies that can sit physically near your users for speed.
The standard shape is leader/follower (also called primary/replica; the old term was master/slave). One server is the leader. All writes go to the leader. The leader keeps a change log — an ordered list of every change it made (Postgres calls this the WAL, "Write-Ahead Log"; MySQL calls it the binlog). It streams this log to the followers, which replay the exact same changes in the exact same order. Reads can be answered by any server.
Why one leader? Writes must happen in a definite order to keep the data consistent, and the single leader is the one place that decides that order.
write
client --------> [ LEADER ]
| streams change log (in order)
+-----------+-----------+
v v v
[ F1 ] [ F2 ] [ F3 ] (followers)
^ ^ ^
+-----------+-----------+
| reads
clients ------------+ (any follower can answer)
One arrow IN (writes), many arrows OUT (reads).
How fast must followers keep up? Three choices
| Mode | Leader replies "OK" when… | Trade-off |
|---|---|---|
| Synchronous | at least one follower confirms it saved the write | No data loss, but slower; if that follower is slow or down, writes stall. |
| Asynchronous (common default) | the leader saves it locally — followers catch up later | Fast and resilient, but if the leader dies before a write reaches a follower, that write is lost. |
| Semi-synchronous | exactly one follower confirms; the rest update async | Practical middle ground: always a second durable copy, with low latency. Common in production. |
Replication lag and the "read-your-own-writes" bug
Replication lag is how far a follower trails the leader — usually milliseconds, but it can stretch to seconds under heavy load or network trouble. This causes a classic, confusing bug.
Read replicas are followers dedicated to answering reads. Adding read replicas is the cheapest, easiest scaling win there is — each one multiplies your read capacity. The catch: your app must split its traffic — writes go to the leader, reads go to the replicas.
Failover — when the leader dies
Failover means promoting a follower to become the new leader. Three steps: (1) detect the leader is dead (usually a missed heartbeat or timeout); (2) choose the most up-to-date follower; (3) reconfigure clients and the other followers to point at the new leader.
Partitioning / sharding — splitting the data itself
Replication copies the same data everywhere. Partitioning splits different data across machines: shard A holds users 1–1,000,000, shard B holds 1,000,001–2,000,000, and so on. ("Partitioning" is the general term; "sharding" usually means partitioning across separate servers. A "shard" is one of those pieces.) This is the only way to scale total writes and total storage past one machine, because each shard has its own leader accepting writes in parallel.
You must pick a partition key (or shard key): the column that decides which shard a row lives on. Choosing it well is the single most important sharding decision.
Two ways to assign rows to shards
- Range partitioning — split by ranges of the key (names A–M on shard 1, N–Z on shard 2; or by date). Good: efficient ordered scans, like "all orders in March." Bad: prone to hot spots — if you shard by timestamp, every one of today's writes hits the single "today" shard.
- Hash partitioning — run the key through a hash function (a function that turns any input into a scrambled fixed-size number) and assign by that. Good: spreads load evenly. Bad: you lose range scans, because neighboring keys scatter to random shards.
The hot-spot / hot-key problem
Even hashing can't save you from a single super-popular key. In a flash sale, every "add to cart" for one product ID hashes to the same shard — that shard drowns while the others idle. (Engineers nickname this the "Justin Bieber problem": one celebrity account overwhelms one machine.) Mitigations: add a small random suffix to spread one hot key across several partitions (e.g. product#0 … product#9), then gather them on read; or cache that hot key. Amazon DynamoDB does this automatically — it detects a "hot partition" and splits it.
The mod-N trap and consistent hashing
A tempting but broken scheme is hash(key) % N, where N is the number of servers. It spreads data fine — until you add a server.
hash(key) % N. With 4 servers you use % 4. Add a 5th and switch to % 5, and almost every key now maps to a different server — forcing a near-total data reshuffle and a cache-miss storm. Use consistent hashing instead.Consistent hashing imagines a ring of numbers (0 up to a huge value, then wrapping back to 0). You hash each server onto a point on the ring, and each key onto a point too. A key belongs to the first server you meet going clockwise. Now adding or removing a server only reassigns the keys in one arc — about 1/N of the data — instead of all of it.
0 / 2^32
.
S3 . . S1
. .
K --> (key snaps clockwise to next server)
. .
. .
S2
Add S4 between S2 and S1: it only "steals"
the arc from S2 to S4 — not the whole ring.
Real systems improve this with virtual nodes (vnodes): each physical server is placed at many points on the ring (often ~100–200). This evens out the load and, when a server leaves, spreads its share across many neighbors instead of dumping it all on one. Cassandra, DynamoDB, and many caches use this.
Why reads are easy and writes are hard
This is the deep idea tying the section together. Reads are stateless and order-independent: ask 1,000 replicas the same question and they all give the same answer, so you just add replicas. Writes must be serialized — put in a definite order — to keep data consistent, and that ordering needs a single decision point (the leader), which is a chokepoint. Sharding scales writes only by giving up easy cross-shard joins and transactions.
NoSQL — and why it exists
NoSQL means "Not Only SQL." These databases appeared in the 2000s when web-scale companies needed easy horizontal scaling, flexible data shapes, and very high write throughput that single-server relational databases struggled to deliver. The core trade: most NoSQL stores drop rich relational features (joins, fixed schema, multi-row transactions) in exchange for easy scale-out and speed. There are four families.
| Family | What it stores | Examples | Good for | Weakness |
|---|---|---|---|---|
| Key-Value | a giant dictionary: get/put by key | Redis, DynamoDB, Memcached | caching, sessions, rate limiters, leaderboards | can't query by value; no relationships |
| Document | self-contained JSON-like documents | MongoDB, Couchbase | catalogs, user profiles, evolving data | joins are awkward; you denormalize |
| Wide-Column | rows by partition key; flexible columns | Cassandra, HBase, Bigtable | time-series, IoT, event logs, write-heavy feeds | must design tables per query up front |
| Graph | nodes + relationships as first-class | Neo4j | social graphs, recommendations, fraud | doesn't shard easily; weak for bulk analytics |
("Schema" = the fixed structure of your data; a "flexible schema" lets different records have different fields. "Denormalize" = store duplicate copies of data together so you don't need a join to read it.)
SQL vs NoSQL — a balanced view
SQL databases (PostgreSQL, MySQL) give you a strict schema, strong ACID transactions (reliable all-or-nothing changes), powerful joins, and decades of maturity. They're the right default whenever relationships and correctness matter — money, orders, inventory. NoSQL gives flexible schemas, built-in horizontal scale, and high throughput, but weaker (often eventual) consistency and limited joins, so you model the data around your exact queries.
Connection pooling — a small fix that saves apps
Opening a database connection is surprisingly expensive. PostgreSQL forks a separate operating-system process for each connection — several megabytes of memory each — plus a security handshake. It was never built for thousands of simultaneous connections, and max_connections (often defaulting to ~100) is a hard ceiling. A busy web app where every request opens its own connection will exhaust the database and crawl.
A connection pool keeps a small set of already-open connections and lends them out to requests, taking them back when each request finishes. App-side pools include HikariCP (Java) and pgx (Go); a popular dedicated proxy is PgBouncer, which fronts thousands of clients onto a few real backend connections (a PgBouncer client costs ~2 KB versus megabytes for a real connection).
PgBouncer offers pool modes: session (a real connection is tied to a client for its whole session), transaction (the real connection is held only for one transaction — most efficient, but breaks session features like server-side prepared statements and LISTEN/NOTIFY), and statement (per query).
- One server hits four separate ceilings — read load, write load, storage, availability — and each needs a different fix.
- Replication (same data, many copies) scales reads and gives availability; only sharding (different data, split across machines) scales writes and storage.
- Reads are easy to scale (fan out to replicas); writes are hard because they must be ordered through a single leader — a CAP/eventual-consistency trade-off.
- Use consistent hashing with virtual nodes, never
hash(key) % N; and watch for hot keys that overwhelm one shard. - NoSQL trades joins, fixed schema, and multi-row ACID for scale and flexibility — pick by access pattern, and a relational DB is the right default for most apps.
- Always put a connection pool in front of Postgres, and size it to the database's capacity, not your traffic.