Networking I — How Data Travels: IP, TCP/UDP, DNS

By Pritesh Yadav 15 min read

Every distributed app is just programs on different machines talking to each other over a network. A distributed app means software whose parts run on separate computers and cooperate. When your code calls a database, hits an API, reads from a cache, or asks one microservice to talk to another, that is a network conversation — bytes leaving one machine and (hopefully) arriving at another.

Here is the uncomfortable truth that shapes this whole field: the network is the one part you can never make perfectly reliable. Messages get lost, duplicated, reordered, or delayed in unpredictable ways. Almost every hard problem in distributed systems — timeouts, retries, consistency, latency budgets — traces back to how the network actually behaves. So if you understand IP, TCP, UDP, and DNS, you can reason about why a service is slow, why a request hangs forever, and what a cryptic error like "connection refused" really means.

Key takeaway: The network is a shared, unreliable medium. Understanding it is not optional trivia — it is the foundation that explains most outages, slowdowns, and "weird intermittent bugs" in distributed systems.

The layered model: solving one problem at a time

Networking is built in layers. A layer is a level of responsibility: each one solves a single problem and trusts the layer below it to handle the rest. This keeps the system manageable — the part that encrypts data does not need to know how Wi-Fi works.

The classic teaching diagram is the OSI 7-layer model (a textbook reference model with seven levels). But real internet software uses the simpler, practical TCP/IP 4-layer model. We will use the 4-layer version because it matches reality.

TCP/IP layer | Its one job                             | Examples
Application  | Speak the app's language                | HTTP, DNS, SMTP (email)
Transport    | Deliver to the right program            | TCP, UDP, QUIC
Internet     | Move data between hosts across networks | IP
Link         | Move bits across one physical hop       | Ethernet, Wi-Fi

Analogy: Think of mailing a letter as a parcel inside a parcel inside a parcel. Your HTTP request is the letter. TCP wraps it in an envelope that adds port numbers and reliability. IP wraps that in an envelope with the source and destination addresses. The link layer wraps that in one more envelope with the local hardware address for the next hop. Each handler reads only its own envelope and ignores the rest.

This wrapping is called encapsulation — each layer adds its own header (a small block of control information) around the data it receives from above. The data being wrapped is the payload. The receiving machine unwraps in reverse order, and each layer reads only its own header. This is the single most important mental model in networking.
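
To make encapsulation concrete, here is a minimal Python sketch. It uses readable text headers in square brackets instead of real binary headers, and the addresses, port, and function names are made up for illustration.

```python
def wrap(header: str, payload: bytes) -> bytes:
    """Prepend a toy text header to the payload (real headers are binary)."""
    return f"[{header}]".encode() + payload


def unwrap(data: bytes) -> tuple[str, bytes]:
    """Strip the outermost toy header and return (header, inner payload)."""
    end = data.index(b"]")
    return data[1:end].decode(), data[end + 1:]


def encapsulate(http_bytes: bytes) -> bytes:
    """Wrap application data the way a sender's stack does, top layer first."""
    segment = wrap("TCP dst_port=443", http_bytes)       # transport layer
    packet = wrap("IP dst=203.0.113.7", segment)         # internet layer
    return wrap("ETH dst=aa:bb:cc:dd:ee:ff", packet)     # link layer


def decapsulate(frame: bytes) -> tuple[list[str], bytes]:
    """Unwrap in reverse order, as the receiver does: link, internet, transport."""
    headers = []
    data = frame
    for _ in range(3):
        header, data = unwrap(data)
        headers.append(header)
    return headers, data
```

Calling decapsulate(encapsulate(request)) hands back the original request bytes, with the headers read outermost first.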

Packets: why data is chopped up

Data is not sent as one giant blob. It is chopped into small pieces called packets. Each packet carries a header (addresses, length, and other control fields) plus its payload (a slice of your actual data).

Why chop it up? Three reasons. First, many conversations can fairly share one wire instead of one huge transfer hogging it. Second, if one small piece is lost, only that piece is resent — not the whole file. Third, different packets can travel different paths to the destination.

Each network link has a Maximum Transmission Unit (MTU) — the largest payload it will carry in one frame. On typical Ethernet this is about 1500 bytes. Send something bigger and it must be fragmented (split) or it gets dropped.

Common mistake: Ignoring MTU. If you send a packet larger than the path MTU (about 1500 bytes on typical Ethernet paths), it gets fragmented or silently dropped. Size limits like this are one reason large DNS responses sent over UDP fall back to TCP: the answer no longer fits in a single datagram.
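
A rough sketch of the arithmetic, assuming a 1500-byte MTU and 20-byte IPv4 and TCP headers with no options (real stacks negotiate these values, so treat the numbers as illustrative):

```python
MTU = 1500        # typical Ethernet MTU in bytes
IP_HEADER = 20    # IPv4 header without options (assumption)
TCP_HEADER = 20   # TCP header without options (assumption)


def max_segment_payload(mtu: int = MTU) -> int:
    """Bytes of application data that fit in one packet on this link."""
    return mtu - IP_HEADER - TCP_HEADER


def split_into_segments(data: bytes, mtu: int = MTU) -> list[bytes]:
    """Chop application data into slices that each fit in one packet."""
    size = max_segment_payload(mtu)
    return [data[i:i + size] for i in range(0, len(data), size)]
```

Under these assumptions each packet carries at most 1460 bytes of your data, so a 4000-byte message travels as three packets.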

IP: addressing and routing (best-effort delivery)

IP (Internet Protocol) has one job: get a packet from the source machine to the destination machine across many connected networks. Crucially, IP is best-effort and connectionless. "Best-effort" means it tries but makes no promise. "Connectionless" means there is no setup conversation first — each packet is independent.

IP does not guarantee delivery, does not guarantee order, and does not prevent duplicates. This is deliberate: keep the core of the internet dumb and fast, and push reliability out to the edges (to TCP, which we will meet shortly).

IP addresses: IPv4 vs IPv6

  • IPv4 = 32 bits, written as four numbers from 0–255, like 142.250.72.110. That gives about 4.3 billion addresses, far too few for today's internet. The central pool (IANA) ran out in 2011, and most of the regional registries hit exhaustion between 2011 and 2019.
  • IPv6 = 128 bits, written as eight groups of hex digits, like 2001:db8::1 (the :: collapses one run of zero groups). The address space is astronomically large. As of 2025, global IPv6 adoption sits around 43–48% (per Google's public measurements) — a slow, decades-long transition, with mobile networks well ahead of corporate ones.
Common mistake: Assuming "everyone is on IPv6 now." Adoption is under half in 2025, so dual-stack setups and NAT'd IPv4 are still everyday reality.

Subnets and CIDR

An IP address splits into a network part (which network you are on) and a host part (which machine within it). CIDR notation writes this as an address plus a slash-number.

Example: 192.168.1.0/24 means "the first 24 bits are the network, the rest is the host." With 32 total bits, 8 bits are left for hosts, giving 2⁸ = 256 addresses: 192.168.1.0 through 192.168.1.255. A /24 is a common home-network size.
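
You can check this arithmetic with Python's standard ipaddress module. The describe helper below is just an illustrative wrapper. Note that by convention the first address (the network address) and the last (the broadcast address) are not assigned to machines, so a /24 has 254 usable host addresses.

```python
import ipaddress


def describe(cidr: str) -> dict:
    """Summarise a CIDR block: size, first and last address, usable hosts."""
    net = ipaddress.ip_network(cidr)
    return {
        "total": net.num_addresses,
        "first": str(net[0]),
        "last": str(net[-1]),
        "usable_hosts": len(list(net.hosts())),
    }


def contains(cidr: str, address: str) -> bool:
    """True if the address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)
```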

Grouping machines into subnets (sub-networks) lets routers — the devices that forward packets between networks — make decisions by network prefix instead of memorizing every individual machine. Routing happens hop-by-hop: each router consults its routing table ("for this prefix, send the packet out this interface to this next router"). No single router knows the entire path; each only knows the next hop. The traceroute tool reveals these hops.
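
Here is a toy routing-table lookup. When several prefixes match, routers pick the most specific one, a rule known as longest-prefix match. The table entries and next-hop names are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop).
ROUTES = [
    ("0.0.0.0/0", "isp-gateway"),     # default route, matches everything
    ("10.0.0.0/8", "core-router"),
    ("10.1.2.0/24", "rack-switch"),
]


def next_hop(dest: str, routes=ROUTES) -> str:
    """Return the next hop for dest using longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = []
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        raise LookupError(f"no route to {dest}")
    # The most specific (longest) matching prefix wins.
    return max(matches)[1]
```

Notice the function only answers "where next?", never "what is the whole path?", which mirrors how each real router behaves.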

Private addresses and NAT

Some IPv4 ranges are reserved as private and never appear on the public internet: 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. NAT (Network Address Translation) lets a whole home or office share one public IP. The router rewrites private source addresses to its single public address and remembers the mapping using port numbers, so replies return to the right device.

Example: A home with 5 devices on 192.168.1.x all browse through ONE public IP. The router rewrites each outgoing packet's source address and uses port numbers to remember which reply belongs to which device. NAT was a stopgap for IPv4 scarcity and is now everywhere; a side effect is that incoming connections need port-forwarding to reach a device behind it. IPv6's huge space removes the need for NAT.
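
The mapping a NAT router keeps can be sketched as two small dictionaries. This is a simplified model: the ToyNat class, the starting port 40000, and the public address are assumptions for illustration, and real NAT also tracks protocol, remote endpoint, and timeouts.

```python
from itertools import count


class ToyNat:
    """Simplified port-based NAT table (illustrative, not a real implementation)."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = count(first_port)
        self._out = {}    # (private_ip, private_port) -> public_port
        self._back = {}   # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int):
        """Rewrite an outgoing source endpoint to the shared public one."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port: int):
        """Find which private endpoint a reply belongs to, or None if unknown."""
        return self._back.get(public_port)
```

The None case is the side effect described above: an unsolicited incoming packet has no mapping, so the router does not know which device it is for.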

Ports: finding the right program

An IP address gets you to the right machine. A port (a number from 0 to 65535) gets you to the right program on that machine. The combination IP:port is called an endpoint.

Analogy: The IP address is the street address of an apartment building. The port number is the specific apartment. You need both to deliver to the right person (the right process).
  • Well-known ports (0–1023): reserved for standard services — HTTP 80, HTTPS 443, DNS 53, SSH 22.
  • Registered ports (1024–49151): assigned to specific applications.
  • Ephemeral / dynamic ports (~49152–65535): the random temporary port your client picks for each outgoing connection.
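
The three ranges above fit in a few lines of Python. The function name is made up, and the ephemeral boundary follows the range listed here; operating systems may use a different default range.

```python
def port_range(port: int) -> str:
    """Classify a port number into the ranges listed above."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "ephemeral"
```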

Transport layer: TCP vs UDP

Both TCP and UDP add port numbers so the operating system knows which program to hand the data to. The difference is what they guarantee.

TCP: reliable, ordered, connection-oriented

TCP (Transmission Control Protocol) gives you a reliable, in-order stream of bytes built on top of unreliable IP. Before any data flows, TCP sets up a connection with the 3-way handshake:

   Client                          Server
     | ----------- SYN ----------> |   "here's my start number"
     | <-------- SYN-ACK --------- |   "got it; here's mine"
     | ----------- ACK ----------> |   "got yours"
     |        (data flows)         |

SYN means "synchronize" (proposing a starting sequence number, a counter for the bytes). ACK means "acknowledge." After these three messages both sides are ESTABLISHED. Notice this costs one full round-trip (a message there and back) of delay before any real data moves. TLS (the encryption layer behind HTTPS) adds more on top: one extra round-trip with TLS 1.3, two with TLS 1.2.
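
To get a feel for the cost, here is a back-of-the-envelope calculation. It assumes a fresh connection with no session resumption and ignores server processing time; the function name and the 80 ms round-trip time are illustrative.

```python
def setup_delay_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Delay before the first request byte can be sent on a new HTTPS connection."""
    tcp_round_trips = 1  # the 3-way handshake costs one round-trip
    return (tcp_round_trips + tls_round_trips) * rtt_ms
```

With an 80 ms round-trip time, that is 160 ms of pure setup with one TLS round-trip and 240 ms with two, before a single byte of the page is requested.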

Once connected, TCP guarantees correctness through three mechanisms working together:

  • Reliability and ordering: every byte has a sequence number; the receiver sends back ACKs for what it got; anything unacknowledged is retransmitted after a timeout. The receiver reassembles bytes in order, so your app always reads a clean, in-order stream.
  • Flow control: the receiver advertises a window — how many more bytes it can accept right now. This stops a fast sender from overwhelming a slow receiver. This is about the two endpoints.
  • Congestion control: TCP also limits its own send rate to avoid overwhelming the network itself. The classic scheme is AIMD (Additive Increase / Multiplicative Decrease): grow the send rate slowly while things are fine, then cut it sharply (for example, halve it) the moment a packet is lost, since loss signals congestion. Slow start, despite its name, ramps quickly at the very beginning: it starts from a small window and roughly doubles it every round-trip until it hits a threshold or sees loss, after which the gentler additive increase takes over.
Common mistake: Confusing flow control with congestion control. Flow control protects a slow receiver (the advertised window). Congestion control protects the shared network (AIMD/slow start). They are separate mechanisms solving different problems.
Analogy: Flow control is a slow cashier telling a fast bagger "slow down, my counter is full" — about the two people. Congestion control is everyone on a highway easing off the gas when traffic jams — about the shared road. AIMD = ease onto the gas, slam the brakes on a crash.
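
A tiny simulation shows the sawtooth shape AIMD produces. This is a sketch of the additive-increase and multiplicative-decrease rule only; it leaves out slow start, timeouts, and everything else a real TCP stack does, and the parameter values are arbitrary.

```python
def aimd(loss_rounds, rounds=10, start=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window at the start of each round-trip.

    loss_rounds is the set of round numbers in which a packet loss is detected.
    """
    window = start
    history = []
    for r in range(rounds):
        history.append(window)
        if r in loss_rounds:
            window = max(1.0, window * decrease)   # multiplicative decrease
        else:
            window += increase                     # additive increase
    return history
```

With a single loss in round 4, the window climbs 1, 2, 3, 4, 5, drops to 2.5, then starts climbing again.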

TCP's key weakness is head-of-line (HOL) blocking. Because TCP is one single ordered byte stream, if one packet is lost, every byte after it must wait for that one packet's retransmission — even bytes that already arrived and belong to unrelated requests sharing the connection. One missing packet stalls everything behind it.

Analogy: A single-lane checkout where one stuck customer blocks everyone behind them, even though those shoppers' goods are ready to scan. That stuck customer is the lost packet.
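
The effect is easy to reproduce in a few lines. In this toy model, segments are numbered from 0 and the function reports which ones the application is allowed to read, given the set that has physically arrived. The function name is made up.

```python
def deliverable(received: set[int], next_expected: int = 0) -> list[int]:
    """Return the segments an app can read in order; stop at the first gap."""
    ready = []
    while next_expected in received:
        ready.append(next_expected)
        next_expected += 1
    return ready
```

If segments 0, 1, 3, and 4 have arrived, the app can read only 0 and 1. Segments 3 and 4 sit in the receive buffer until segment 2 is retransmitted.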

UDP: connectionless, fast, fire-and-forget

UDP (User Datagram Protocol) is the opposite trade. No handshake, no ACKs, no ordering, no retransmission, no flow or congestion control. You send a datagram (a self-contained packet) and hope. It may arrive, arrive out of order, arrive twice, or vanish. Its header is tiny (8 bytes). The upside: very low latency, very low overhead, no head-of-line blocking, and support for one-to-many sending (broadcast/multicast).
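
That 8-byte header is just four 16-bit fields: source port, destination port, length, and checksum. Here is a sketch that packs one with Python's struct module. The checksum is left at zero for simplicity, which a real sender would normally compute.

```python
import struct

UDP_HEADER_FORMAT = "!HHHH"  # network byte order, four unsigned 16-bit fields


def udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build a UDP header for the given payload (checksum not computed here)."""
    length = 8 + len(payload)  # the length field covers header plus payload
    return struct.pack(UDP_HEADER_FORMAT, src_port, dst_port, length, checksum)
```

Compare that with TCP's header, which is at least 20 bytes and carries sequence numbers, acknowledgements, and window sizes.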

Aspect             | TCP                         | UDP
Connection         | Yes (3-way handshake)       | No
Reliable delivery  | Yes (ACK + retransmit)      | No
In-order           | Yes                         | No
Overhead / latency | Higher                      | Very low
Best for           | Web, APIs, databases, email | Voice, video, games, DNS

Analogy: A phone call is UDP — a dropped half-second is better than replaying it 3 seconds late. Downloading a contract PDF is TCP — every byte must arrive perfectly and in order.
Common mistake: Thinking UDP is "broken" or always worse. It is a deliberate trade. No reliability overhead means lower latency, which is exactly right for live voice, video, and games where a late packet is useless anyway.

QUIC and HTTP/3: the modern twist

QUIC runs over UDP but rebuilds reliability, ordering, and congestion control in software, and crucially gives each logical stream its own ordering. So a lost packet stalls only the streams whose data it carried, not the others, eliminating TCP's transport-level head-of-line blocking. QUIC also folds the connection and TLS handshakes together to cut setup round-trips. HTTP/3 is simply HTTP running over QUIC. Avoiding head-of-line blocking and cutting setup latency are the headline reasons HTTP/3 moved off plain TCP.

Common mistake: Believing you can dodge head-of-line blocking by "just opening more TCP connections." Extra connections only spread the damage: each one is still a single ordered stream that stalls on its own lost packets, and each pays its own handshake. The real fix is QUIC/HTTP/3's independent per-stream ordering.
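
Per-stream ordering can be modelled by tracking arrivals separately for each stream. This is a toy model, not QUIC itself; the stream names and function name are invented.

```python
def deliverable_per_stream(received: dict[str, set[int]]) -> dict[str, list[int]]:
    """For each stream, return the chunks readable in order; gaps block only that stream."""
    result = {}
    for stream, chunks in received.items():
        ready = []
        expected = 0
        while expected in chunks:
            ready.append(expected)
            expected += 1
        result[stream] = ready
    return result
```

If the "css" stream is missing chunk 1 while the "js" stream has everything, only "css" waits. Over a single TCP connection, the same loss could hold up both.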

Sockets: the programmer's handle

A socket is the operating system's API endpoint your program uses to send and receive over the network. A TCP connection is uniquely identified by a 4-tuple: (local IP, local port, remote IP, remote port). Counting the protocol as well gives what is often called the 5-tuple.

  • Server side: create socket → bind to a port → listen → accept (returns a fresh socket per client).
  • Client side: create socket → connect → send/recv.
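
The two call sequences above can be sketched with Python's standard socket module. This example runs both sides in one process over the loopback address and asks the OS for a free port, purely for demonstration. The function names are made up, and a single recv is only safe here because the message is tiny.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one client, read its message, and reply with it upper-cased."""
    conn, _peer = server.accept()        # accept returns a fresh socket per client
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_roundtrip(message: bytes) -> bytes:
    """Run a one-shot server and a client against it; return the server's reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.settimeout(5.0)           # do not wait forever if nothing connects
        server.bind(("127.0.0.1", 0))    # port 0 asks the OS for any free port
        server.listen()
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
        worker.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", port))   # 3-way handshake happens here
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply
```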

"Listening on port 8080" means a socket is bound and waiting. Because each connection is a distinct 4-tuple, a single server port can serve thousands of clients at once — each client has a different remote IP:port.

DNS: turning names into addresses

DNS (Domain Name System) is the internet's phone book. Humans use names like www.example.com; machines need IP addresses. DNS is a distributed, hierarchical, heavily cached database, read right-to-left: in www.example.com. the trailing dot is the implicit root, com is the TLD (top-level domain), example is the domain, and www is the host.

A recursive resolver (often your ISP's, or a public one like 8.8.8.8 or 1.1.1.1) does the legwork for you: it queries the hierarchy on your behalf and returns the final answer. Authoritative servers hold the real records. The resolution walks the hierarchy, each level delegating to the next:

 "www.example.com?"
       |
       v
 Recursive Resolver
   |-> Root server      : ".com lives over there"
   |-> .com TLD server  : "example.com's NS is over there"
   |-> Authoritative    : "A record = 93.184.x.x, TTL 300"
       |
       v
 answer cached on the way back
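
The walk in the diagram can be mimicked with a toy hierarchy held in a dictionary. Nothing here talks to real DNS: the server names, the zone layout, and the address 192.0.2.10 (from a range reserved for documentation) are all invented for illustration.

```python
# Hypothetical servers: each either delegates a suffix or holds final records.
SERVERS = {
    "root": {"delegations": {"com.": "com-tld"}},
    "com-tld": {"delegations": {"example.com.": "ns-example"}},
    "ns-example": {"records": {"www.example.com.": ("192.0.2.10", 300)}},
}


def resolve(name: str, servers=SERVERS):
    """Walk root -> TLD -> authoritative; return (address, ttl, servers asked)."""
    if not name.endswith("."):
        name += "."                       # make the implicit root explicit
    server = "root"
    asked = []
    while True:
        asked.append(server)
        zone = servers[server]
        if name in zone.get("records", {}):
            address, ttl = zone["records"][name]
            return address, ttl, asked
        for suffix, child in zone.get("delegations", {}).items():
            if name == suffix or name.endswith("." + suffix):
                server = child            # follow the delegation one level down
                break
        else:
            raise LookupError(f"cannot resolve {name}")
```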

Common record types to know:

  • A — hostname → IPv4 address.
  • AAAA — hostname → IPv6 address.
  • CNAME — an alias: "this name is really that other name," which must then itself be resolved. It cannot coexist with other records at the same name, and must not sit at the root/apex of a domain.
  • MX — mail exchange: which server handles email for the domain (with priorities).
  • NS — which servers are authoritative for a zone.
  • TXT — arbitrary text (used for domain verification, email policies).
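
To see why a CNAME "must then itself be resolved," here is a toy lookup that follows aliases until it reaches an A record. The zone contents and names are made up, and real resolvers handle this for you.

```python
# Hypothetical records keyed by (name, record type).
ZONE = {
    ("shop.example.com", "CNAME"): "store.example.net",
    ("store.example.net", "A"): "192.0.2.20",
}


def resolve_a(name: str, zone=ZONE, max_hops: int = 8) -> str:
    """Follow CNAME aliases until an A record is found."""
    for _ in range(max_hops):
        if (name, "A") in zone:
            return zone[(name, "A")]
        if (name, "CNAME") in zone:
            name = zone[(name, "CNAME")]   # the alias points at another name
            continue
        raise LookupError(f"no A or CNAME record for {name}")
    raise RuntimeError("CNAME chain too long")
```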

Every record carries a TTL (Time To Live) in seconds — how long it may be cached. A high TTL (e.g. 86400 = 1 day) means fewer lookups and faster responses, but slow to change. A low TTL (e.g. 60–300) updates quickly but creates more query load. DNS queries usually travel over UDP port 53 (fast, one packet), falling back to TCP for large responses.

Best practice: Before a planned migration, lower the TTL days in advance so caches expire by cutover. If you change a record while the TTL is still high, caches keep serving the old value until it expires — this is why people complain a change "won't propagate."
Common mistake: Hardcoding IP addresses instead of names. IPs change; DNS exists precisely so you depend on stable names and let resolution plus TTL handle the rest. Also: assuming CNAME stores an IP (it stores another name), and confusing A (IPv4) with AAAA (IPv6).
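
A minimal TTL cache makes the "won't propagate" effect visible. This is a sketch, not how any particular resolver is implemented. The class name is invented, and the clock is injectable so the behaviour can be shown without actually waiting.

```python
import time


class TtlCache:
    """Cache answers until their TTL runs out (illustrative sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # name -> (address, expiry time)

    def put(self, name: str, address: str, ttl: float) -> None:
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name: str):
        """Return the cached address, or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            del self._entries[name]
            return None
        return address
```

Until the expiry time passes, get keeps returning the old address no matter what the authoritative server now says. That is the whole reason to lower the TTL ahead of a migration.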

Putting it together: what happens when you type a URL

Press Enter on https://www.example.com/page and a remarkable amount happens:

  1. Parse the URL: scheme https → port 443, host www.example.com, path /page.
  2. DNS lookup: browser checks its cache, then the OS cache, then the recursive resolver, which walks Root → .com TLD → authoritative server and returns the A/AAAA record. The answer is cached at each level per its TTL.
  3. TCP handshake: open a connection to that IP on port 443 (SYN / SYN-ACK / ACK) — one round-trip.
  4. TLS handshake: negotiate encryption so the page is private.
  5. HTTP request: send GET /page. It is encapsulated: HTTP bytes → TCP segment (ports + sequence numbers) → IP packet (source/dest IP) → link-layer frame (hardware address for the next hop).
  6. Routing and NAT: the frame goes to your gateway; routers forward the IP packet hop-by-hop across the internet; NAT may rewrite your source address on the way out.
  7. Server responds: it unwraps each layer, TCP reassembles the ordered byte stream, the web server reads the request, builds a response, and sends it back the same way.
  8. Browser renders: bytes reassembled by TCP, decrypted by TLS, parsed as HTML — often triggering more DNS + TCP + HTTP rounds for CSS, JavaScript, and images.
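
Step 1 of the list above can be sketched with Python's standard urllib.parse module. The DEFAULT_PORTS table and the function name are illustrative; browsers know more schemes than these two.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_url(url: str):
    """Return (scheme, host, port, path), filling in the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"
```

For https://www.example.com/page this yields the scheme https, the host www.example.com, port 443, and the path /page. The host goes to DNS and the port goes to the TCP handshake.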
Key takeaway: A single "page load" is dozens of network round-trips, each subject to loss, latency, and setup cost. This is exactly why caching, connection reuse, and HTTP/2 and HTTP/3 multiplexing exist — to shrink that cost.
Key takeaways:
  • IP delivers packets to the right machine on a best-effort basis — it may lose, reorder, or duplicate freely; reliability is added above it.
  • Ports find the right program; a socket is your program's handle, and a connection is the full 4-tuple (src IP:port + dst IP:port).
  • TCP builds a reliable, ordered byte stream on unreliable IP, at the cost of handshake latency and head-of-line blocking; flow control protects the receiver, congestion control protects the network.
  • UDP is fast and lossy by design — perfect for voice, video, games, and DNS; QUIC adds modern, per-stream reliability on top of UDP.
  • DNS turns names into addresses through a cached hierarchy (Root → TLD → Authoritative); TTL trades freshness against load, so lower it before migrations.
