Data Literacy: The Skill Everyone Needs and Nobody Teaches
Every day, you’re presented with data. Graphs in news articles. Statistics in social media posts. KPIs at work. Health metrics from your fitness tracker.
Most people accept this data at face value. They shouldn’t. Basic data literacy — understanding how data can mislead, even when technically accurate — is an essential skill that almost nobody is taught.
The “Technically True” Problem
Statistics can be technically accurate while being deeply misleading. This happens constantly, and it is usually not malicious: any single number compresses a lot of detail, and the detail that gets dropped is often the part that matters.
“The average salary in this suburb is $150,000” could mean most people earn around $150,000. Or it could mean most people earn $70,000 and a few people earn millions, pulling the average up. The median would tell you something very different.
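To make the gap concrete, here is a minimal Python sketch. The 100 salaries are invented for illustration (98 people on $70,000 and 2 on $4,070,000), chosen so that the mean lands on $150,000.

```python
from statistics import mean, median

# Hypothetical suburb: 98 residents earn $70,000 and 2 earn $4,070,000.
salaries = [70_000] * 98 + [4_070_000] * 2

average = mean(salaries)   # pulled up by the two very high earners
middle = median(salaries)  # the salary of the person in the middle

print(f"mean:   ${average:,.0f}")
print(f"median: ${middle:,.0f}")
```

In this made-up suburb the mean is $150,000 and the median is $70,000, yet nobody earns anything close to the mean.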
“Sales increased 200%!” sounds impressive. If sales went from 1 unit to 3 units, that is technically a 200% increase, but it amounts to two extra units.
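A short sketch of the same arithmetic, using two invented sales histories that share the same headline percentage. The function name `percent_change` is just an illustrative helper.

```python
def percent_change(before, after):
    """Percentage change from before to after."""
    return (after - before) * 100 / before

# Two hypothetical sales histories with the same headline figure.
for before, after in [(1, 3), (1_000, 3_000)]:
    change = percent_change(before, after)
    extra = after - before
    print(f"{before} -> {after}: {change:.0f}% increase, {extra} extra units")
```

Both cases report a 200% increase. Only the absolute change shows whether that means 2 extra units or 2,000.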
“This treatment reduced risk by 50%!” Without the base rate, this is meaningless. If your risk was 2 in 1,000 and it dropped to 1 in 1,000, the relative reduction is 50%. The absolute reduction is 0.1 percentage points. Very different implications for your decision.
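The two figures can be computed side by side. This is a minimal sketch using the numbers from the example above; `risk_reduction` is a hypothetical helper name.

```python
def risk_reduction(baseline_risk, treated_risk):
    """Return (relative, absolute) risk reduction, both as fractions."""
    absolute = baseline_risk - treated_risk
    relative = absolute / baseline_risk
    return relative, absolute

# Risk drops from 2 in 1,000 to 1 in 1,000.
relative, absolute = risk_reduction(2 / 1000, 1 / 1000)

print(f"relative reduction: {relative:.0%}")
print(f"absolute reduction: {absolute * 100:.1f} percentage points")
```

A headline that quotes only the first number hides how small the second one is.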
How Graphs Lie
Charts and graphs are particularly effective at misleading because humans process visual information quickly and often uncritically.
Truncated axes. A bar chart showing values from 98 to 102 with a truncated Y-axis makes tiny differences look massive. Always check whether the axis starts at zero.
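One rough way to quantify the distortion is to compare the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below assumes, purely for illustration, that the truncated axis starts at 97.

```python
def visual_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights for values a and b
    when the axis begins at axis_start."""
    return (b - axis_start) / (a - axis_start)

small, large = 98, 102

honest = visual_ratio(small, large)         # axis starts at zero
truncated = visual_ratio(small, large, 97)  # assumed axis start of 97

print(f"axis from 0:  taller bar is {honest:.2f}x the shorter one")
print(f"axis from 97: taller bar is {truncated:.2f}x the shorter one")
```

With the axis at zero the taller bar is about 4% taller. With the axis at 97 it is drawn five times as tall, although the data has not changed.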
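Misleading scales. An axis whose tick marks are evenly spaced but labelled 1, 10, 100, 1000 is a logarithmic scale, and it makes exponential growth look like a straight line. That is legitimate when the scale is clearly labelled, but a reader who assumes equal spacing means equal steps will badly misjudge the trend. Axes with genuinely irregular intervals distort the picture in less predictable ways.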
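A minimal sketch of why an axis that steps by factors of ten can make exponential growth look linear. The values are hypothetical: a quantity that multiplies by 10 each period.

```python
import math

# Hypothetical quantity that multiplies by 10 each period.
values = [1, 10, 100, 1000]

# Gaps between consecutive points on an ordinary (linear) axis.
linear_steps = [b - a for a, b in zip(values, values[1:])]

# Gaps between the same points once plotted at log10(value).
log_positions = [math.log10(v) for v in values]
log_steps = [b - a for a, b in zip(log_positions, log_positions[1:])]

print("linear axis steps:", linear_steps)
print("log axis steps:   ", log_steps)
```

On the linear axis each gap is ten times the previous one. On the log axis the gaps are equal, so the same data plots as a straight line.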
Cherry-picked time frames. A stock chart that starts at a low point and ends at a high point tells one story. The same stock over a different time period might tell the opposite story.
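The effect is easy to reproduce. The sketch below uses an invented series of eight monthly prices and compares the return over the full period with the return over a window chosen to start at the low and end at the high.

```python
# Hypothetical monthly closing prices.
prices = [100, 80, 60, 75, 90, 110, 95, 85]

def percent_return(series, start, end):
    """Percentage change between two positions in the series."""
    return (series[end] - series[start]) * 100 / series[start]

full_period = percent_return(prices, 0, len(prices) - 1)

# Cherry-picked window: from the lowest price to the highest.
low = prices.index(min(prices))
high = prices.index(max(prices))
picked_window = percent_return(prices, low, high)

print(f"full period:   {full_period:+.1f}%")
print(f"picked window: {picked_window:+.1f}%")
```

The full series loses 15%, while the chosen window shows a gain of more than 80%. Both charts would be accurate.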
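3D effects and area distortion. Pie charts with 3D effects make the slices nearest the viewer appear larger than they are. Pictograms that scale an icon's width and height together exaggerate too: doubling both quadruples the area. Always look at the actual numbers, not just the visual representation.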
Correlation and Causation
You’ve heard this one, but it still trips people up constantly.
“Countries that eat more chocolate produce more Nobel Prize winners.” This is a real correlation found in published data. It doesn’t mean chocolate causes Nobel Prizes.
Correlation simply means two things tend to move together. Causation means a change in one produces a change in the other. Confusing them leads to terrible decisions.
A common example: “People who eat breakfast are healthier” could mean breakfast makes you healthy. It could also mean healthier people tend to be the type who eat breakfast. Or it could mean that wealthier people can afford both breakfast and better healthcare.
Before accepting any causal claim based on data, ask: is there an alternative explanation? Almost always, there is.
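The breakfast example can be sketched with a small made-up population in which wealth drives both breakfast habits and health, and breakfast itself does nothing. All counts below are hypothetical, and `healthy_rate` is an illustrative helper.

```python
# (wealthy, eats_breakfast, healthy) -> number of people. All counts are invented.
counts = {
    (True, True, True): 640,
    (True, True, False): 160,
    (True, False, True): 160,
    (True, False, False): 40,
    (False, True, True): 160,
    (False, True, False): 240,
    (False, False, True): 240,
    (False, False, False): 360,
}

def healthy_rate(eats_breakfast, wealthy=None):
    """Share of healthy people among those with the given breakfast habit,
    optionally restricted to one wealth group."""
    total = healthy = 0
    for (w, b, h), n in counts.items():
        if b != eats_breakfast or (wealthy is not None and w != wealthy):
            continue
        total += n
        if h:
            healthy += n
    return healthy / total

print(f"everyone:    {healthy_rate(True):.0%} vs {healthy_rate(False):.0%}")
for group, label in [(True, "wealthy"), (False, "less wealthy")]:
    eaters = healthy_rate(True, wealthy=group)
    skippers = healthy_rate(False, wealthy=group)
    print(f"{label}: {eaters:.0%} vs {skippers:.0%}")
```

Across the whole population, breakfast eaters look healthier (about 67% against 50%). Within each wealth group the rates are identical, so in this constructed data the whole correlation comes from wealth.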
Sample Size and Selection Bias
“8 out of 10 dentists recommend this toothpaste.” How many dentists were asked? Were they paid by the company? Were the questions designed to elicit a positive response?
Small sample sizes are unreliable. A study with 20 participants might find a dramatic effect that disappears when replicated with 2,000 participants. Always check how many people or data points a claim is based on.
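A rough sense of how much sample size matters comes from the standard margin-of-error formula for a proportion. The sketch below assumes a simple random sample and uses the normal approximation, which is only approximate for very small samples. The 80% figure echoes the dentist claim, and the sample sizes are illustrative.

```python
import math

def margin_of_error(proportion, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion
    (normal approximation, simple random sample assumed)."""
    return z * math.sqrt(proportion * (1 - proportion) / n)

for n in (20, 200, 2000):
    moe = margin_of_error(0.8, n)
    print(f"n = {n:>4}: 80% plus or minus {moe * 100:.1f} percentage points")
```

With 20 respondents the uncertainty is roughly 17.5 percentage points either way. With 2,000 it is under 2. A hundredfold increase in sample size shrinks the margin by a factor of ten.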
Selection bias is equally important. If you survey only people who choose to respond, you get a different result than you would from a random sample. Online reviews suffer from this: people with strong opinions (very positive or very negative) are more likely to write reviews.
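The review effect can be sketched with expected values. Both the customer ratings and the chance of writing a review at each rating are assumptions made up for this illustration.

```python
# Hypothetical customer base: star rating -> number of customers.
customers = {1: 50, 2: 150, 3: 400, 4: 300, 5: 100}

# Assumed chance that a customer with each rating writes a review.
review_probability = {1: 0.5, 2: 0.1, 3: 0.02, 4: 0.1, 5: 0.5}

def mean_rating(counts):
    """Average star rating, weighted by the number of people at each rating."""
    return sum(stars * n for stars, n in counts.items()) / sum(counts.values())

def extreme_share(counts):
    """Share of ratings that are 1 star or 5 stars."""
    return (counts[1] + counts[5]) / sum(counts.values())

# Expected number of reviews at each rating.
reviews = {stars: n * review_probability[stars] for stars, n in customers.items()}

print(f"all customers: mean {mean_rating(customers):.2f}, "
      f"extreme ratings {extreme_share(customers):.0%}")
print(f"reviews only:  mean {mean_rating(reviews):.2f}, "
      f"extreme ratings {extreme_share(reviews):.0%}")
```

Under these assumptions the average moves from 3.25 to about 3.5, and the share of 1-star and 5-star ratings rises from 15% to nearly 60%. The reviews describe the people who wrote them, not the customer base.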
Asking the Right Questions
When you see a statistic or data claim, five questions will protect you from most misinterpretation:
- Compared to what? A number alone means nothing. It needs context.
- How was this measured? The measurement method determines the quality of the data.
- How many people/cases? Sample size matters.
- Who produced this data? Consider the source’s incentives and potential biases.
- What’s the base rate? Percentage changes without base rates are often misleading.
Data at Work
In professional settings, data literacy matters more each year. Businesses generate more data than ever, and more of their decisions rest on it.
The risk is “data-driven” decision making where the data is poorly understood. Companies using data without understanding its limitations can make worse decisions than companies using informed judgment.
If your organisation is building its data capabilities, having people who can critically evaluate data — not just produce reports — is essential. Some firms like Team400.ai help businesses build this kind of analytical capacity alongside their AI and automation work.
Building Data Literacy
You don’t need a statistics degree. A few resources will significantly improve your ability to read data critically:
“How to Lie with Statistics” by Darrell Huff is a short, accessible book written in 1954 that’s still relevant today. Read it in an afternoon.
“Factfulness” by Hans Rosling, written with Ola Rosling and Anna Rosling Rönnlund, demonstrates how our intuitions about data are systematically wrong and provides frameworks for thinking more clearly.
“Calling Bullshit” by Carl Bergstrom and Jevin West is a modern guide to identifying data misuse in media, business, and politics.
Start paying attention to the statistics you encounter daily. Ask the five questions above. Notice when data is presented without context, with misleading visualisations, or with causal claims based on correlations.
Data literacy isn’t about being sceptical of everything. It’s about being appropriately sceptical and asking the right questions before accepting claims at face value.