Sometimes, starting the Y-axis at zero is the BEST way to lie with statistics


#1

[Read the post]


#2

One of my all-time favorite examples is relevant here.


#3

One of my favorites from the comment thread under the NR graph:


#4

I’ve actually seen some value in comparing temperature increases against zero – not their 0°F, which is just stupid, but absolute 0 K. Then you see the whole thing is really on the order of a 1% temperature change, and that helps explain how a gas that only absorbs a small percentage of the energy Earth emits to space can still be responsible. Something people are often misled about.
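A minimal sketch of that arithmetic follows. The two Fahrenheit readings are hypothetical values picked for illustration, not figures from any chart in the thread, and the helper names `fahrenheit_to_kelvin` and `relative_change` are made up for this example.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a Fahrenheit reading to kelvin (absolute scale)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


def relative_change(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# Hypothetical readings, chosen only to illustrate the point.
before_f, after_f = 58.0, 60.0

# A ratio taken on the Fahrenheit scale is meaningless, because 0°F is arbitrary.
naive = relative_change(before_f, after_f)

# On the Kelvin scale, zero is physically meaningful, so the ratio is too.
absolute = relative_change(
    fahrenheit_to_kelvin(before_f), fahrenheit_to_kelvin(after_f)
)

print(f"Fahrenheit 'ratio': {naive:.2%}")
print(f"Kelvin ratio:       {absolute:.2%}")
```

With these assumed readings the Kelvin ratio comes out well under 1%, which is the sense in which the whole change is small relative to absolute zero even though its consequences are not.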

The consequences for living things, sadly, need at least a second graph. As if anything important doesn’t! National Review is not only nasty and disingenuous, but really doesn’t care to try and hide that, huh?


#5

I have to say that the example they used to demonstrate their point argues against it. The labor participation graph makes it look as though participation declined in 2010 to around fifteen percent of what it was in 2000! When you have measures that are ratio-level, like percentages, the chart should reflect accurate ratios.

The narration says “you can’t see the change at all” if you include the zero on the y-axis, but that’s not true, and it’s sort of the point: it puts the change in context. You CAN see the drop and you can tell that it’s 7 percentage points, not 85. Leaving the zero off this chart exaggerates the change dramatically and is a good example of a chart that deceives.
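The distortion can be put in numbers. The sketch below compares the ratio of plotted heights with and without a zero baseline. The rates and the truncated baseline are hypothetical, chosen to roughly match the "7 points versus 85 percent" description above rather than read off the actual chart, and `apparent_ratio` is a name invented for this example.

```python
def apparent_ratio(start, end, baseline=0.0):
    """Ratio of plotted heights (end / start) when the y-axis begins at `baseline`."""
    if start <= baseline:
        raise ValueError("start value must lie above the axis baseline")
    return (end - baseline) / (start - baseline)


# Hypothetical participation rates in percent, for illustration only.
start_rate, end_rate = 67.0, 60.0

# Assumed truncated baseline; the real chart's lower bound may differ.
truncated_baseline = 58.75

true_ratio = apparent_ratio(start_rate, end_rate)
truncated_ratio = apparent_ratio(start_rate, end_rate, truncated_baseline)

print(f"Zero baseline:      end is {true_ratio:.0%} of start")
print(f"Truncated baseline: end looks like {truncated_ratio:.0%} of start")
```

With these assumed numbers the same 7-point drop reads as roughly a tenth lost on the zero-based chart and as roughly 85 percent lost on the truncated one.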

If you want to call out the changes, magnify a section of the chart, but leave the zero on the big chart so that we know we’re talking about a recession, not a planetary catastrophe.

There are many times when zero is best not used as the origin: when negative numbers are meaningful; when your universe of observations has an average and you want to show deviations from it and zero would be meaningless (like for body temperature); and when charting SAT scores and similar measures that don’t permit a zero. However, with a ratio-level statistic that has a meaningful zero, put the zero in and avoid future panda attacks ;).

I don’t even want to address the straw man argument about the English language–it’s enough to point out, I think, the crappy chart that undermines their point.


#6

I really, sincerely hope that the person assigned to make this graph did not have a conscience nor any knowledge of statistics because if I’d been in his/her place and committed this thought felony I probably would have killed myself, possibly while still on the clock.


#7

Whenever there’s a minimum threshold of some sort then it’s perfectly acceptable to use that as a boundary (or even better, indicate it with a compressed section above or below…that’s generally what I do, but I deal with a more professional audience).

In the case of the labor participation, there’s definitely a catastrophic boundary that is far above zero, and a shift of even a few percent is pretty dramatic.

The guys at VOX.com know their stuff and are as professional as any media organization out there (and I do data analytic work for a living, so I can definitely address the subject as a non-consumer)


#8

Here’s a nice chart from the FBI showing the five-year drop in violent crime

When I saw that graph, I thought, “WOW, that’s a huge drop! …wait a minute, it starts at 1.15 million.” But it is still a pretty big drop.


#9

That one’s deceptive AND put together by somebody who has poor data skills.

With any statistics associated with a changing population you’re supposed to use incidents per volume (basically percentages that make sense in that context) rather than total numbers. It actually would’ve slightly improved the curve in their favor since population has been increasing.
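A small sketch of that normalization follows. The incident counts and populations are hypothetical and are not the FBI figures; `rate_per_100k` and `percent_change` are illustrative names.

```python
def rate_per_100k(incidents, population):
    """Incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents / population * 100_000


def percent_change(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# Hypothetical (incidents, population) pairs, for illustration only.
first_year = (1_400_000, 300_000_000)
last_year = (1_200_000, 315_000_000)

count_change = percent_change(first_year[0], last_year[0])
rate_change = percent_change(
    rate_per_100k(*first_year), rate_per_100k(*last_year)
)

print(f"Change in raw count: {count_change:.1%}")
print(f"Change in rate:      {rate_change:.1%}")
```

Because the assumed population grows while the count falls, the per-capita drop is larger than the raw-count drop, which is the effect described above.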


#10

Oh God this hits home… I went to a talk on global warming once, and noticed that the chart didn’t go anywhere near down to zero. Like an idiot I pointed this out, and only later thought about the fact that the chart was in Fahrenheit, and that really the only zero that has any relevance is absolute zero on the Kelvin scale, and that showing global temperature changes on that scale would have been ridiculous. It was nice to see these people mentioning my exact error, since I’ve been thinking about it since X-D


#11

That can’t be so!

According to Conrad Black, “The crime rate, after decades of decline, is rising again.”


#12

Well, I guess we have to disagree. I have a background in statistics and data presentation to non-statisticians and I still think the chart stinks.

The chart is not for professionals–they would recognize that it exaggerates the drop in participation fairly dramatically and focus on the data points instead. Non-professionals need the zero because they will look at the ratios and perhaps ignore the y-axis labels.

As to whether 8 points is a dramatic difference, it would be instructive to extend the chart back to the '50s. The shape of the graph would be a steady increase from a much lower level in the early '60s to a peak around the year 2000, after which there would be a decline to today.

I’d really like to see data from the '30s on it as well, but they were not collected back then. Participation would likely have approached catastrophic levels then.

My point is that the differences in labor force participation are important, but not as dramatic as the graph suggests, and that’s deceptive.


#13

Whoever Vox is makes a good point, but there’s the bit about, “that’s not lying, that’s just telling the story.” Wouldn’t the nice people at Fox News say the same thing?

Ah, but you see their narrative is false. How do I know this? They’re constantly lying with statistics! How can I tell? Because their charts seem to support their false narrative…


#14

We can certainly disagree that there’s an analog scale for improvement, but ‘stinks’ is a bit hyperbolic and that’s not terribly professional. How would you improve it, exactly?

How many years of experience do you have presenting such things to people who then make decisions based upon your information where improperly representing the data can come back to bite you?

There we disagree strongly. Because no…no they most definitely don’t. That’s the point of the video. Do you disagree that the zero Y-axis is not always appropriate?

[quote=“OneOff, post:12, topic:70744”]
As to whether 8 points is a dramatic difference, it would be instructive to extend the chart back to the '50s. The shape of the graph would be a steady increase from a much lower level in the early '60s to a peak around the year 2000, after which there would be a decline to today.
[/quote]
What would be instructive is to show additional information demonstrating the context of that loss and how it impacts the economy in other ways. Often historical timelines aren’t as useful as informational context.

Those are called ‘unemployment rate’ charts. They exist all over the place, but they don’t usually show up to 100% on the axis (which is the zero axis for employment after all) so they might be confusing to purists :wink:

(Sorry, I really couldn’t help that one)


#15

The point is that an axis and a chart don’t tell the whole story and context is required. Are you implying that FOX and Vox are somehow both similar players in the data visualization market and adhere to similar ethical standards?


#16

No, not at all. Just pointing out things like scale and axis placement are choices, and sometimes conceal value judgements. It’s tempting to think that I’m the objective one, and they’re the fools.


#17

I couldn’t agree more on that one.

Honestly while I like Vox’s explainers, my preference is to hit multiple sites, especially those who are being reasonable and rational as part of the conversation. I like finding smart, well written discussions on topics that disagree with my original thoughts on what I’m reading…but honestly it takes a bit of google-fu and while I’ve got a few favorites I think we all should have our own and be as organic about them as possible.

Current events are honestly bad subjects for discussion because a lot of the info out there is going to be hyperbolic, so a friend gave me an idea that I’ve been doing for a while, which is to save articles on a subject and write down my thoughts on the subject and then look back in several months to see how reality has caught up with things. Keywords like ‘analysis’, ‘debunk’, ‘follow-up’, ‘revisited’ and so on seem to help, as does ‘.pdf’ or ‘journal’ to find journal articles (pdf sometimes gives me them behind paywalls even!)

Data is hard and any data with tons of people is bound to be messy. We’re all wrong a lot, so it’s more a matter of gradually improving our skills than being right the first time.


#18

I have about twenty years’ experience presenting information to decision makers who have varying levels of numeracy and I tell you I would NEVER use that kind of graph alone.

It’s not an uncommon problem among researchers and especially academic researchers to present information to decision makers in a way that’s flawed, misleading and poorly constructed for that audience. Mostly I think it’s arrogance and laziness, e.g., “I’m a psych prof, I know what I’m doing–I’m a scientist, for Christ sakes! I can make charts!”

You have to present data in a way that is clear and is difficult to misconstrue. Having ratios out of whack with the data points would be a red flag for at least one university president and many faculty members I’ve worked with.

I told you how I would fix it–zero on the y axis, perhaps more history. If you want to get into the data, you can pull out that part that you want to focus on, but not without the context in the first chart with the y-axis zero.

I know I’m in the minority on this, but in the real world you need to be crystal clear and not confuse your users any more than necessary. Just because it’s common practice to make graphs where the ratios shown are out of whack doesn’t make it right.

If you read my earlier comments, you’d see that I agree that there are plenty of times when non-zero y-axes are preferred. This just ain’t one of 'em!


#19

My favorite Fox graphs are the ones where the graphs themselves simply fail to accurately plot the changes in numbers that don’t fit the narrative they’re trying to sell - they do it quite frequently, e.g.:


#20

Nor would I, as I mentioned…

And neither did they.

So, as discussed. The graph needed context. It had context. In the video it was mentioned as an example about data axes, but it was never actually delivered without the useful information that we both agree is necessary.

What I wouldn’t do is go off on a specific graph without double-checking that, because that’d be really embarrassing.

:wink:

(and yes, I know, this is a forum and not work, we shouldn’t have to be constantly on our toes, but we should be responsible in our slams)

You are definitely in the minority on this one…yes.

I’d have added a compressed lower bound (so, to zero, but not wasting a ton of whitespace or overflattening the data) and perhaps also included information on the impacts of percentage reductions so that there’s more useful context.

It is very difficult to read more than a slight trend on the zero axis version of the graph, which makes it largely useless to anybody other than an industry expert. I would not have considered a full zero axis variant a good use of space, though I could see doing a small thumbnail one that somebody could zoom in on.

Also, and this is important and you know this as well as I do…catastrophic is not zero in this case, and that graph was demonstrating the degree of catastrophe. I could see using the unemployment rates in other recessions (though the Great Depression is an outlier and I wouldn’t consider that a great choice) as a lower bound with appropriate indicators.

I see almost no value in the zero axis graph in this case, not as a full page at least. Maybe in an appendix or something. I believe I’m in the majority (or at least plurality) on this one, no?