Reading the World in Numbers: Data Types, Averages, Spread, and Charts

By Pritesh Yadav, June 22, 2026

Every day, the world hands you numbers. A weather app says "70% chance of rain." A news site reports the "average salary" in your city. A product page boasts "4.6 stars from 2,000 reviews." A chart on the news shows a line shooting up. These numbers shape what you believe and what you decide. But raw numbers, by themselves, do not speak. You have to learn to read them — and to spot when someone is using them to mislead you.

This chapter assumes you know nothing about statistics. We start from the very bottom: what data even is. Then we learn how to summarize a pile of numbers with a single value (the average), how to describe how "spread out" those numbers are, and finally how to read the most common kind of chart. By the end, you will look at a number in the news and instinctively ask the right questions.

Key takeaway: Statistics is the art of using a small, measurable piece of the world to say something trustworthy about the whole. The skills in this chapter are not just for scientists — they are everyday self-defense against bad numbers.

10.1 What is data, anyway?

Let's start with the most basic word.

Data: Recorded observations about the world. Anything you write down, measure, or count is data — the ages of people in a room, the price of milk each week, the colors of cars in a parking lot.
Variable: A thing that varies from one observation to the next. Age is a variable because different people have different ages. Eye color is a variable. Yesterday's high temperature is a variable.

The easiest way to picture data is as a spreadsheet. Each row is one thing you observed (one person, one sale, one day). Each column is one variable (their age, their height, how they voted).

        | age | height_cm | favorite_color
--------+-----+-----------+----------------
Asha    |  31 |    165    |   blue
Ben     |  47 |    180    |   green
Carmen  |  22 |    158    |   blue
Dev     |  39 |    172    |   red
        ^      ^           ^
      each column = one variable
   each row = one observation (one person)

The two big kinds of variables

Before you can do anything with a variable, you must know what type it is. This decides which tools you are allowed to use.

Categorical variable: A variable that puts each observation into a category, not a number. Examples: eye color (blue/brown/green), whether someone is a customer (yes/no), country, blood type. You cannot meaningfully add or average these — there is no "average eye color."
Numeric variable: A variable measured as a number, where the size of the number means something. Examples: age, weight, salary, temperature. You can add and average these.

Numeric variables split further into two kinds:

Discrete: Whole counts that can't be split. The number of children in a family (you can have 2 or 3, never 2.4). The number of cars sold.
Continuous: Any value on a scale, including fractions. Height (165.3 cm), weight, time, temperature. Between any two values there's always another possible value.

Common mistake: Treating a category coded as a number like it's a real number. A survey might code "Country" as 1 = USA, 2 = India, 3 = Brazil. If you "average" those codes and get 1.7, that number is meaningless — it is not "1.7 countries." Numbers used as labels are still categories.
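To see the trap in action, here is a minimal Python sketch. The survey answers and the code-to-country mapping are made up for illustration; only the 1 = USA, 2 = India, 3 = Brazil coding comes from the text above.

```python
from collections import Counter
from statistics import mean, mode

# Hypothetical survey answers, coded 1 = USA, 2 = India, 3 = Brazil
country_codes = [1, 1, 2, 3, 1, 2]
labels = {1: "USA", 2: "India", 3: "Brazil"}

# Python will happily average the codes, but the result names no country.
meaningless = mean(country_codes)

# For a category, count the labels or report the most common one instead.
most_common = labels[mode(country_codes)]
tally = Counter(labels[code] for code in country_codes)

print(round(meaningless, 1))
print(most_common)
print(tally)
```

The computer cannot tell a label from a measurement, so it is up to you to know which kind of variable you are holding.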

10.2 The whole pot vs. the spoonful: population and sample

Here is the deepest idea in all of statistics, and it's surprisingly simple. You almost never get to measure everything. So you measure a piece, and use the piece to talk about the whole.

Population: Everyone (or everything) you actually care about. "All adults in Canada." "Every product our factory makes this year."
Sample: The subset you actually measure. "The 1,000 Canadians we phoned." "The 50 products we pulled off the line to inspect."

Analogy: Imagine a giant pot of soup. You want to know if it's salty enough. You don't drink the whole pot — you stir it well and taste one spoonful. The spoonful is your sample; the pot is the population. The entire game of statistics is using a spoonful to make a confident claim about the pot.

This soup image will come back later. For now, hold onto one lesson hidden inside it: the spoonful only works if the soup is well stirred. If all the salt sank to the bottom and you taste from the top, your spoonful lies. A sample is only trustworthy if it fairly represents the population — a theme we'll return to.
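If you like to see ideas run, the stirred-versus-unstirred spoonful can be sketched in a few lines of Python. Everything here is invented for illustration: the "pot" is just 10,000 made-up saltiness readings, ordered from the top of the pot to the bottom as if the salt had sunk.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical pot: readings run from least salty (top) to most salty (bottom).
pot = [depth / 100 for depth in range(10_000)]

unstirred_spoonful = pot[:100]               # taste only from the top
stirred_spoonful = random.sample(pot, 100)   # every reading equally likely

print(f"whole pot:          {mean(pot):.2f}")
print(f"unstirred spoonful: {mean(unstirred_spoonful):.2f}")
print(f"stirred spoonful:   {mean(stirred_spoonful):.2f}")
```

Both spoonfuls hold the same number of readings. The only difference is how they were chosen, and that difference decides whether the spoonful tells the truth about the pot.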

We won't go deep into sampling in this foundations chapter. Just remember the two words and the soup. They are the frame for everything else.

10.3 Summarizing the middle: mean, median, and mode

Suppose you have a column of 1,000 salaries. Nobody can hold 1,000 numbers in their head. We want one number that captures "the typical value." This is called a measure of central tendency — a fancy phrase for "where's the middle?" There are three, and choosing the right one matters enormously.

The mean (the everyday "average")

Mean: Add up all the values, then divide by how many there are. This is what most people mean when they say "average."

If five friends earn $30k, $35k, $40k, $45k, and $50k, the mean is (30+35+40+45+50) ÷ 5 = 200 ÷ 5 = $40k.

Analogy: The mean is the "fair share." Imagine all five friends pooled every dollar into one pile, then split the pile evenly. Each would walk away with the mean — $40k.

The mean has one serious weakness: it gets dragged by extremes. One very large or very small value can yank it far from where most of the data actually sits.

The median (the true middle)

Median: Sort all the values from smallest to largest, then take the one in the exact middle. Half the values are below it, half above. With an even number of values there is no single middle one, so you average the two middle values.

For our five salaries sorted (30, 35, 40, 45, 50), the middle one is $40k. Here mean and median agree. But watch what happens next.

Analogy: Line everyone up by height, shortest on the left. The median is the height of the person standing dead center. It doesn't care how tall the tallest person is — only who's in the middle.

Why mean ≠ median is the most important lesson here

Keep our five friends earning $30k–$50k. Now Bill Gates walks into the room. Say his income that year is $5 billion.

	The 5 friends	+ Bill Gates (6 people)
Mean	$40k	~$833 million
Median	$40k	~$42.5k

The mean now says the "average" person in the room earns about $833 million a year, which is absurd. Five of the six people are nowhere near it. The median moved only slightly, from $40k to about $42.5k, and still honestly describes a typical person in the room.

Key takeaway: When a few extreme values exist (incomes, house prices, wait times), the median tells the honest story and the mean can be wildly misleading. This is why serious reports cite median household income, not mean.

Common mistake: Assuming "average" always means the meaningful middle. If someone reports the mean of a lopsided thing like wealth, salaries, or home prices, be suspicious. Ask: "Is this the mean or the median?" A handful of billionaires (or one CEO) can make the mean look great while most people are doing far worse.
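You can check the Bill Gates table yourself. The sketch below uses Python's built-in statistics module with the incomes given above; the variable names are just illustrative.

```python
from statistics import mean, median

friends = [30_000, 35_000, 40_000, 45_000, 50_000]
with_gates = friends + [5_000_000_000]   # the $5 billion income from the text

print("5 friends:  mean =", mean(friends), " median =", median(friends))
print("with Gates: mean =", round(mean(with_gates)), " median =", median(with_gates))
```

With six people there is no single middle value, so the median is the average of the two middle incomes, $40k and $45k. One extreme value moved the mean by hundreds of millions of dollars and the median by $2,500.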

The mode (the most common)

Mode: The value that appears most often.

Analogy: The mode is the best-selling shoe size in a store — the size the shop stocks most, because it shows up most.

The mode is the only average that works for categorical data. There's no mean or median eye color, but there can be a most common one. If a shop's customers are mostly paying with cards, "card" is the mode of the payment-method variable.

Measure	What it is	Best for	Weakness
Mean	Add up, divide	Roughly symmetric numeric data	Dragged by outliers
Median	Middle when sorted	Skewed data (income, prices)	Ignores exact size of extremes
Mode	Most frequent	Categories; spotting the common case	Useless for "the middle" of numbers
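As a small illustration of the mode working where the other two cannot, here is a sketch with a made-up list of payment methods. The data is hypothetical; the functions are from Python's standard library.

```python
from collections import Counter
from statistics import mode

# Hypothetical payment methods recorded at a shop's till
payments = ["card", "cash", "card", "card", "phone", "cash", "card"]

print(mode(payments))                    # the most frequent label
print(Counter(payments).most_common())   # full tally, most frequent first
```

There is nothing to add up or sort into a middle here, yet the most common value is still perfectly well defined.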

10.4 How spread out is the data? Range, variance, and standard deviation

Two groups can have the exact same average yet be totally different. Consider two classrooms that both average 70% on a test:

Class A: everyone scored between 68% and 72%. Calm and uniform.
Class B: half scored 95%, half scored 45%. Wildly split.

Same average, completely different reality. The average alone hides this. To see it, we need a measure of spread — how much the values differ from each other.

Range — the crude first attempt

Range: The largest value minus the smallest. The simplest possible spread.

If test scores run from 45% to 95%, the range is 50 points. Easy. But the range is fragile: it depends entirely on two values. One freak outlier and the range explodes, even if every other point is tightly clustered.

Variance — the average squared distance

We want a measure that uses all the data, not just the two ends. The idea: for each value, measure how far it sits from the mean. Then average those distances. There's one twist — we square each distance first (multiply it by itself).

Variance: The average of the squared distances of each value from the mean.

Why square? Two reasons. First, squaring makes every distance positive (a value 5 below the mean and a value 5 above both contribute 25), so they don't cancel out. Second, squaring punishes far misses extra hard — being 10 away counts as 100, while being 2 away counts as only 4.

Analogy: Picture darts thrown at a bullseye. Variance measures how scattered the darts are, with the darts that landed far away counting much more heavily than the near-misses.
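The recipe above (distance from the mean, square it, average the squares) can be followed step by step. This is a sketch on four made-up scores standing in for Class B. Note one assumption: it divides by the number of values, matching the definition above, which is what Python's pvariance does. The related function variance divides by one less than the count, a refinement for samples that this chapter does not cover.

```python
from statistics import mean, pvariance

scores = [45, 95, 45, 95]   # tiny stand-in for Class B: half 45, half 95

center = mean(scores)                                      # the mean
squared_distances = [(x - center) ** 2 for x in scores]    # square each distance
variance_by_hand = sum(squared_distances) / len(scores)    # average the squares

print("mean:", center)
print("squared distances:", squared_distances)
print("variance by hand:", variance_by_hand)
print("pvariance:", pvariance(scores))
```

Every score sits 25 points from the mean, so every squared distance is 625, and so is their average.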

Variance has one annoying flaw: its units are squared. If your data is in dollars, the variance is in "dollars squared," which means nothing to a human. That awkwardness leads us to the fix.

Standard deviation — spread in normal units

Standard deviation (SD): The square root of the variance. Taking the square root undoes the squaring, so the result is back in the original units. It answers: "on a typical day, how far from average do things land?"

The SD is the workhorse measure of spread. A small SD means values huddle close to the mean (Class A above). A large SD means they're scattered far and wide (Class B). Same mean, very different SD.

Class A (small SD)      Class B (large SD)
mean = 70               mean = 70

      ###                  #            #
     #####                 #            #
    #######                #            #
  66 68 70 72 74        45 ...        95
  tight cluster         two far-apart clumps

Key takeaway: The average tells you where the data centers; the standard deviation tells you how reliable that center is as a description. Always ask for both. A number with no sense of spread is half a story.

Example: Two investment funds both "averaged 8% per year." Fund 1 returned 7%, 8%, 9% across three years (tiny SD — steady). Fund 2 returned +40%, −30%, +14% (huge SD — a roller coaster). Same average, but anyone who needs the money next year should care enormously about the difference. The SD is the risk.
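The two funds can be compared in a few lines. This sketch uses the yearly returns from the example and Python's pstdev, which takes the square root of the "average squared distance" variance defined earlier; the fund names are just labels.

```python
from statistics import mean, pstdev

fund_1 = [7, 8, 9]        # yearly returns in percent, from the example
fund_2 = [40, -30, 14]

for name, returns in [("Fund 1", fund_1), ("Fund 2", fund_2)]:
    print(f"{name}: mean = {mean(returns):.1f}%, SD = {pstdev(returns):.1f} points")
```

Both funds report the same 8% mean, but the second fund's SD comes out dozens of times larger than the first's.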

10.5 Slicing the data into hundredths: percentiles, quartiles, and IQR

Another way to describe data without being fooled by extremes is to talk about position.

Percentile: The k-th percentile is the value below which k% of the data falls.

Analogy: A pediatrician says a baby is in the "90th percentile for height." That means the baby is taller than 90% of babies its age. It says nothing about exact centimeters — only the baby's rank in the crowd.

The median you already met is simply the 50th percentile — half below, half above. Three percentiles are used so often they get their own names, the quartiles (because they cut the sorted data into four equal quarters):

Q1 (25th percentile): a quarter of the data is below this.
Q2 (50th percentile): the median.
Q3 (75th percentile): three-quarters of the data is below this.

Interquartile range (IQR): Q3 minus Q1 — the spread of the middle 50% of the data.

The IQR is a robust spread measure: because it throws away the top 25% and bottom 25%, a single billionaire or typo can't blow it up the way it blows up the range. The IQR is the backbone of a chart called the box plot, which draws a box from Q1 to Q3 with a line at the median.
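Here is a minimal sketch of that robustness, using made-up numbers. One caveat: there are several slightly different conventions for computing quartiles on small datasets. The helper below (iqr is a name chosen for this example) picks Python's "inclusive" method, so other tools may give slightly different values.

```python
from statistics import quantiles

def iqr(values):
    """Return Q3 - Q1 using the 'inclusive' quartile convention."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 900]   # last value replaced by an extreme

for values in (data, with_extreme):
    print("range =", max(values) - min(values), " IQR =", iqr(values))
```

The extreme value multiplies the range by more than a hundred, while the IQR does not move at all.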

10.6 Outliers: the value that doesn't belong

Outlier: A data point that sits far away from all the others.

Analogy: One billionaire in a room full of schoolteachers. Their wealth is an outlier — it doesn't represent the group, and it distorts the mean.

Beginners often want to just delete outliers to "clean up" the data. Resist. An outlier is a question, not a nuisance.

Best practice: Investigate every outlier before deciding what to do with it. It might be a typo — someone entered an age of "350" instead of "35," and that should be fixed. Or it might be the single most important finding in your whole dataset — the one fraudulent transaction, the one machine about to fail, the one breakthrough patient. Deleting it blindly can erase exactly what you needed to discover.
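How do you find the values worth investigating? One widely used convention, which is not from this chapter, flags anything more than 1.5 IQRs below Q1 or above Q3. The sketch below applies it to made-up ages; the function name flag_outliers and the cutoff k = 1.5 are assumptions of this example, not a universal rule.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; nothing is deleted."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

ages = [22, 31, 35, 39, 41, 47, 350]   # hypothetical; 350 is probably a typo for 35
print(flag_outliers(ages))
```

The function only points at suspicious values. Deciding whether each one is a typo or a discovery is still your job.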

10.7 The shape of data: distributions and the bell curve

So far we've crushed data into single summary numbers. But the richest picture comes from seeing the whole shape. That's what a distribution is.

Distribution: The pattern of how often each value (or range of values) occurs in your data.
Histogram: The basic picture of a distribution: a chart of touching bars where each bar's height shows how many observations fall into that range of values.

Analogy: Pour out a bag of M&Ms and sort them into colored piles. Some piles are tall (common colors), some short (rare ones). That collection of piles is the distribution; a photo of the pile-heights is the histogram.
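You can draw a rough histogram with nothing but text. In this sketch the test scores are made up and the bin width of 10 is an arbitrary choice; each score is dropped into its bin, and one # is printed per observation.

```python
from collections import Counter

# Hypothetical test scores
scores = [52, 58, 61, 64, 66, 67, 69, 70, 71, 71, 73, 74, 76, 79, 83, 88]

bin_width = 10
# Map each score to the lower edge of its bin, e.g. 67 -> 60
bins = Counter((score // bin_width) * bin_width for score in scores)

for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * bins[lower]}")
```

Changing bin_width changes how coarse or fine the picture looks, so it is worth trying a couple of widths before drawing conclusions from the shape.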

The normal distribution (the bell curve)

One shape appears so often in nature that it has a special name.

Normal distribution: A symmetric, single-peaked distribution where most values cluster near the middle and fewer and fewer appear as you move toward the extremes on either side. Its shape looks like a bell, so it's nicknamed the "bell curve."

Analogy: The heights of adult men. Most are near average; a few are quite tall or quite short; almost nobody is a 7-foot giant or under 4 feet. Plot them all and you get a bell — fat in the middle, thin at the edges.

            .-=#####=-.
         .-############-.       The bell curve:
       .################.       most values near the
     .####################.     middle, few at the edges.
   .########################.
 -----|--------|--------|-----
   -2 SD     mean     +2 SD

The bell curve has a beautiful, reliable property called the 68–95–99.7 rule. For data that's roughly normal:

About 68% of values fall within 1 standard deviation of the mean.
About 95% fall within 2 standard deviations.
About 99.7% fall within 3 standard deviations.

Example: Suppose adult male height is normal with a mean of 175 cm and an SD of 7 cm. Then about 68% of men are between 168 and 182 cm (one SD each way), and about 95% are between 161 and 189 cm (two SDs). About 99.7% are between 154 and 196 cm (three SDs), which leaves roughly 0.3% outside that range, split between the two tails. So a man over 196 cm is about a 1-in-700 rarity. The rule turns "mean + SD" into concrete predictions.
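The percentages in the rule can be checked with Python's built-in NormalDist. This is a sketch under the example's own assumptions (heights exactly normal, mean 175 cm, SD 7 cm); real height data is only approximately normal.

```python
from statistics import NormalDist

heights = NormalDist(mu=175, sigma=7)   # the mean and SD assumed in the example

for k in (1, 2, 3):
    low, high = 175 - k * 7, 175 + k * 7
    share = heights.cdf(high) - heights.cdf(low)   # fraction between low and high
    print(f"within {k} SD ({low}-{high} cm): {share:.1%}")

taller_than_196 = 1 - heights.cdf(196)   # the upper tail beyond three SDs
print(f"taller than 196 cm: {taller_than_196:.2%}")
```

The three shares come out close to 68%, 95%, and 99.7%, and the upper tail beyond three SDs is a little over a tenth of one percent.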

Skew: when the bell tips over

Not all data is symmetric. Often one side has a long tail.

Skew: The lopsidedness of a distribution. A long tail to the right is right-skewed (a.k.a. positive skew); a long tail to the left is left-skewed.

Right-skewed data is everywhere: income, wealth, house prices, wait times, page views. Most values are modest, but a few enormous ones stretch a long tail to the right.

Analogy: Income. Most people earn a normal amount, but a small number of people earn enormous salaries, pulling the right tail far out. That tail is exactly the Bill Gates effect from earlier.

Here's the rule that ties skew back to averages: in right-skewed data, the long tail drags the mean up past the median. So mean > median is a tell-tale sign of right skew. (Left-skewed flips it: mean < median. A left-skew example is an easy exam where almost everyone scores high and a few low scores trail off to the left.)

Right-skewed (income):

   ####
  ######
 ########_
 ##########_____
 ###################_________
 |     |                    
median mean   <- long tail pulls mean rightward
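The mean-versus-median tell can be tried on two small made-up datasets, one with a long right tail and one with a long left tail. Both lists are hypothetical and chosen only to make the tails obvious.

```python
from statistics import mean, median

# Hypothetical incomes in $k: most are modest, a few are very large
right_skewed = [28, 30, 32, 35, 38, 40, 45, 60, 120, 400]
# Hypothetical scores on an easy exam: most are high, a few trail low
left_skewed = [35, 60, 85, 88, 90, 92, 94, 95, 97, 99]

for name, values in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(values):.1f}, median = {median(values):.1f}")
```

In the first list the mean lands well above the median; in the second it lands below. Comparing the two numbers is a quick first check, though it does not replace looking at the histogram.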

Key takeaway: Always plot a histogram before trusting a summary number. The shape — symmetric, skewed, or even two-humped — tells you which average to trust and warns you when the mean is lying.

One more shape worth a name: a bimodal distribution has two peaks. That's usually a clue that two different groups got mixed together. For example, plotting "time to finish a task" might show two humps because beginners and experts are blended in one dataset. Two peaks mean: split the data and look at each group separately.

10.8 Reading charts without getting fooled

A chart turns numbers into a picture, and our brains read pictures fast — often before we read the labels. That speed is exactly what makes charts powerful and also makes them easy to abuse. Let's learn the honest charts first, then the dishonest tricks.

The charts you'll meet most

Chart	What it shows	Good for
Bar chart	A value for each category	Comparing categories (sales by region)
Histogram	How often numeric values occur	Seeing the distribution / shape
Line chart	A value changing over time	Trends (revenue per month)
Box plot	Median, quartiles, outliers (IQR)	Comparing spread between groups

A note on the difference between a bar chart and a histogram, since beginners mix them up: a bar chart compares separate categories (apples vs. oranges) and the bars have gaps. A histogram shows the distribution of one numeric variable, with bars touching to show a continuous range.

How charts lie

The same true data can be drawn to tell opposite stories. Here are the classic deceptions to watch for.

1. The truncated y-axis (not starting at zero). This is the most common trick of all. If the vertical axis starts at, say, 95 instead of 0, a tiny difference can look like a cliff.

Y-axis starts at 0:        Y-axis starts at 98:

100|        _              100|        ___
   |  _    | |              99|  ___  |   |
   | | |   | |              98| |   | |   |
 50| | |   | |                 +---+--+---+
   | | |   | |                 A: 99   B: 100
   | | |   | |
  0+-+-+---+-+              "B is HUGE!"  (it's a 1% gap)
    A:99  B:100
   "basically equal"

Common mistake: Trusting the height of bars without checking where the axis starts. Whenever a bar chart makes one bar look dramatically taller, glance at the bottom of the y-axis. If it doesn't start at zero, a "huge" difference may be trivial.

2. The cherry-picked time window. Show only the slice of time that supports the story. A stock that fell for ten years but rose last month can be drawn to look like a rocket — if you only plot last month. Always ask: "What does the longer trend look like?"

3. Dual axes that manufacture a fake link. Putting two unrelated lines on the same chart with two different vertical scales lets someone slide the scales until the lines appear to move together. Two lines tracking each other is meaningless if the axes were tuned to make it happen.

4. Misleading area and 3-D effects. If a designer doubles a circle's radius to show "twice as much," the circle's area actually grows fourfold — so the eye sees a quadrupling. 3-D pie charts tilt slices so the front ones look bigger. Our eyes judge area, and area is easy to distort.

5. Color and emphasis. Coloring one bar bright red while the rest are grey makes it pop regardless of its actual value. Emphasis steers your eye before you've read a single number.
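Two of these tricks, the truncated axis and the inflated circle, come down to simple arithmetic. The sketch below works through both; the function names are made up for this example, and the bar values 99 and 100 are the ones from the diagram above.

```python
import math

def drawn_height_ratio(a, b, axis_start=0.0):
    """How many times taller bar b is drawn than bar a, given where the y-axis starts."""
    return (b - axis_start) / (a - axis_start)

def area_ratio(radius_multiplier):
    """How many times larger a circle's area becomes when its radius is scaled."""
    return (math.pi * radius_multiplier ** 2) / math.pi

print(drawn_height_ratio(99, 100))                  # axis starts at 0
print(drawn_height_ratio(99, 100, axis_start=98))   # axis starts at 98
print(area_ratio(2))                                # radius doubled
print(math.sqrt(2))                                 # radius multiplier that doubles the area
```

With the axis at zero, bar B is drawn about 1% taller than bar A. With the axis at 98, the same two values are drawn with B twice as tall. Likewise, an honest "twice as much" circle needs its radius multiplied by about 1.41, not 2.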

Best practice: Before believing any chart, run a four-point check: (1) Where does the y-axis start — at zero? (2) What's the full time window — is a longer trend hidden? (3) What are the actual numbers and units on the axes? (4) Who made this chart, and what do they want you to conclude? If the picture and the numbers disagree, trust the numbers.

This deception-spotting skill has a long pedigree. Back in 1954, Darrell Huff wrote a famous little book called How to Lie with Statistics, full of exactly these tricks. The fact that the same tricks still work more than 70 years later tells you how naturally our eyes are fooled, and how valuable it is to train them.

10.9 Putting it together: a worked reading

Let's apply the whole chapter to one realistic headline.

Example: A news site reports: "Average home price in our city jumped to $850,000 — homeownership is booming!" A trained reader immediately asks:

Mean or median? Home prices are heavily right-skewed (a few mansions stretch the tail). If $850k is the mean, it could have been pulled up by a handful of luxury sales, while the typical (median) home is far cheaper. Ask for the median.
What's the spread? A single "average" hides whether prices are tightly clustered or all over the map. What's the IQR — the middle 50%?
Any outliers? Did one $40-million estate sale inflate the figure? Investigate before believing.
Population or sample? Is this every sale this year, or a small sample? Was it representative — or only downtown?
Show me the chart. If there's a line going up, does the y-axis start at zero, and is the time window long enough to be honest?

One sentence of statistics; five sharp questions. That instinct is the entire point of this chapter.
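To make the first three questions concrete, here is a sketch on ten invented sale prices. The numbers are hypothetical and were chosen so that their mean matches the headline's $850k; they are not real market data.

```python
from statistics import mean, median, quantiles

# Hypothetical sale prices in $k, invented so the mean matches the headline
sales = [420, 450, 480, 510, 530, 560, 600, 650, 700, 3600]

q1, _q2, q3 = quantiles(sales, n=4, method="inclusive")

print("mean:        ", mean(sales))
print("median:      ", median(sales))
print("middle 50%:  ", q1, "to", q3)
print("largest sale:", max(sales))
```

In this made-up city the headline figure is accurate as a mean, yet nine of the ten homes sold for $700k or less, and the middle half sold for between roughly $490k and $640k. One luxury sale did most of the work.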

Key takeaway: Reading the world in numbers is a set of reflexes, not a set of formulas. Ask what kind of data it is, which average fits, how spread out it is, whether outliers or skew are distorting the picture, and whether the chart is drawn honestly. Master these foundations and the harder ideas — probability, sampling, significance, correlation — will feel like natural next steps rather than new mysteries.

10.10 Chapter recap

Data is recorded observations; a variable is what varies. Variables are categorical or numeric; numeric splits into discrete (counts) and continuous (any value).
You usually measure a sample to learn about a population — the spoonful and the pot. The sample is trustworthy only if it fairly represents the whole.
Three averages: mean (add and divide, but dragged by extremes), median (the honest middle for skewed data), mode (the most common, the only average for categories).
Spread matters as much as the middle: range (crude), variance (average squared distance), and standard deviation (typical distance, in real units). Percentiles, quartiles, and IQR describe position robustly.
Investigate outliers; never delete them blindly.
See the distribution with a histogram. The normal (bell) curve follows the 68–95–99.7 rule. Skew warps things — in right-skewed data, mean > median.
Charts are fast but easy to abuse. Watch for truncated axes, cherry-picked windows, dual axes, area tricks, and emphasis. Check the axes, the window, the numbers, and the source.
