# Sometimes, starting the Y-axis at zero is the BEST way to lie with statistics

One of my all-time favorite examples is relevant here.

One of my favorites from the comment thread under the NR graph:

Iāve actually seen some value in comparing temperature increases against zero ā not their 0Ā°F, which is just stupid, but absolute 0 K. Then you see the whole thing is really on the order of a 1% temperature change, and that helps explain how a gas that only absorbs a small percentage of the energy Earth emits to space can still be responsible. Something people are often misled about.

The consequences for living things, sadly, need at least a second graph. As if anything important doesn't! National Review is not only nasty and disingenuous, but really doesn't care to try and hide that, huh?

I have to say that example they used to demonstrate their point argues against it. The labor participation graph makes it look as though participation declined in 2010 to around fifteen percent of what it was in 2000! When you have measures that are ratio-level, like percentage, the chart should reflect accurate ratios.

The narration says "you can't see the change at all" if you include the zero on the y-axis, but that's not true, and it's sort of the point: it puts the change in context. You CAN see the drop and you can tell that it's 7 percentage points, not 85. Leaving the zero off this chart exaggerates the change dramatically and is a good example of a chart that deceives.
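The gap between the real drop and the drop the eye sees can be sketched in a few lines. The participation rates and the truncated axis floor below are hypothetical, picked only so the drop is 7 points; they are not read off the actual chart.

```python
def apparent_ratio(start: float, end: float, axis_min: float) -> float:
    """Ratio of the plotted heights of `end` and `start` above the axis floor."""
    if start <= axis_min:
        raise ValueError("start must sit above the axis minimum")
    return (end - axis_min) / (start - axis_min)


# Hypothetical participation rates, in percent, 7 points apart.
start, end = 67.0, 60.0

print(f"drop in percentage points: {start - end:.1f}")
for axis_min in (0.0, 58.0):  # zero-based axis vs. a hypothetical truncated one
    ratio = apparent_ratio(start, end, axis_min)
    print(f"axis from {axis_min:.0f}: end looks like {ratio:.0%} of start")
```

With a zero-based axis the plotted ratio matches the data ratio. Raising the axis floor shrinks the plotted ratio while the underlying numbers stay the same, which is the distortion being described.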

If you want to call out the changes, magnify a section of the chart, but leave the zero on the big chart so that we know we're talking about a recession, not a planetary catastrophe.

There are many times when zero is best not used as the origin: when negative numbers are meaningful; when your universe of observations has an average and you want to show deviations from it and zero would be meaningless (like for body temperature); and when charting SAT scores and similar measures that don't permit a zero. However, with a ratio-level statistic that has a meaningful zero, put the zero in and avoid future panda attacks ;).

I don't even want to address the straw man argument about the English language; it's enough to point out, I think, the crappy chart that undermines their point.

I really, sincerely hope that the person assigned to make this graph did not have a conscience nor any knowledge of statistics because if I'd been in his/her place and committed this thought felony I probably would have killed myself, possibly while still on the clock.

Whenever there's a minimum threshold of some sort then it's perfectly acceptable to use that as a boundary (or even better, indicate it with a compressed section above or below... that's generally what I do, but I deal with a more professional audience).

In the case of the labor participation, there's definitely a catastrophic boundary that is far above zero, and a shift of even a few percent is pretty dramatic.

The guys at VOX.com know their stuff and are as professional as any media organization out there (and I do data analytic work for a living, so I can definitely address the subject as a non-consumer)

Here's a nice chart from the FBI showing the five-year drop in violent crime

When I saw that graph, I thought, "WOW, that's a huge drop! ...wait a minute, it starts at 1.15 million." But it is still a pretty big drop.

That one's deceptive AND put together by somebody who has poor data skills.

With any statistics associated with a changing population you're supposed to use incidents per unit of population (basically rates that make sense in that context) rather than total numbers. It actually would've slightly improved the curve in their favor since population has been increasing.
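As a quick illustration of why the normalized figure matters, here is a minimal sketch comparing the change in raw counts with the change in a rate per 100,000 people. The incident counts and populations are hypothetical round numbers, not the FBI's figures.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Incidents per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents / population * 100_000


def pct_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, as a fraction."""
    return (new - old) / old


# Hypothetical (incidents, population) pairs for the first and last year.
first = (1_400_000, 300_000_000)
last = (1_200_000, 312_000_000)

count_change = pct_change(first[0], last[0])
rate_change = pct_change(rate_per_100k(*first), rate_per_100k(*last))

print(f"change in raw counts:      {count_change:.1%}")
print(f"change in rate per 100k:   {rate_change:.1%}")
```

When the population grows while incidents fall, the rate drops by more than the raw count does, which matches the point that a per-capita chart would have looked even better for them.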

Oh God this hits home... I went to a talk on global warming once, and noticed that the chart didn't go anywhere near down to zero. Like an idiot I pointed this out, and only later thought about the fact that the chart was in Fahrenheit, and that really the only zero that has any relevance is absolute zero on the Kelvin scale, and that showing global temperature changes on that scale would have been ridiculous. It was nice to see these people mentioning my exact error, since I've been thinking about it since X-D

That can't be so!

According to Conrad Black, "The crime rate, after decades of decline, is rising again."

Well, I guess we have to disagree. I have a background in statistics and data presentation to non-statisticians and I still think the chart stinks.

The chart is not for professionals; they would recognize that it exaggerates the drop in participation fairly dramatically and focus on the data points instead. Non-professionals need the zero because they will look at the ratios and perhaps ignore the y-axis labels.

As to whether 8 points is a dramatic difference, it would be instructive to extend the chart back to the '50s. The shape of the graph would be a steady increase from a much lower level in the early '60s to a peak around the year 2000, after which there would be a decline to today.

Iād really like to see data from the '30s on it as well, but they were not collected back then. Participation would likely have approached catastrophic levels then.

My point is that the differences in labor force participation are important, but not as dramatic as the graph suggests, and that's deceptive.

Whoever Vox is makes a good point, but there's the bit about, "that's not lying, that's just telling the story." Wouldn't the nice people at Fox News say the same thing?

Ah, but you see their narrative is false. How do I know this? They're constantly lying with statistics! How can I tell? Because their charts seem to support their false narrative...

We can certainly disagree that there's an analog scale for improvement, but "stinks" is a bit hyperbolic and that's not terribly professional. How would you improve it, exactly?

How many years of experience do you have presenting such things to people who then make decisions based upon your information where improperly representing the data can come back to bite you?

What would be instructive is to show additional information demonstrating the context of that loss and how it impacts the economy in other ways. Often historical timelines aren't as useful as informational context.

Those are called "unemployment rate" charts. They exist all over the place, but they don't usually show up to 100% on the axis (which is the zero axis for employment after all) so they might be confusing to purists

(Sorry, I really couldn't help that one)

The point is that an axis and a chart don't tell the whole story and context is required. Are you implying that FOX and Vox are somehow both similar players in the data visualization market and adhere to similar ethical standards?

No, not at all. Just pointing out things like scale and axis placement are choices, and sometimes conceal value judgements. It's tempting to think that I'm the objective one, and they're the fools.

I couldn't agree more on that one.

Honestly while I like Vox's explainers, my preference is to hit multiple sites, especially those who are being reasonable and rational as part of the conversation. I like finding smart, well written discussions on topics that disagree with my original thoughts on what I'm reading... but honestly it takes a bit of google-fu and while I've got a few favorites I think we all should have our own and be as organic about them as possible.

Current events are honestly bad subjects for discussion because a lot of the info out there is going to be hyperbolic, so a friend gave me an idea that I've been doing for a while, which is to save articles on a subject and write down my thoughts on the subject and then look back in several months to see how reality has caught up with things. Keywords like "analysis", "debunk", "follow-up", "revisited" and so on seem to help, as does ".pdf" or "journal" to find journal articles (pdf sometimes gives me them behind paywalls even!)

Data is hard and any data with tons of people is bound to be messy. We're all wrong a lot, so it's more a matter of gradually improving our skills than being right the first time.

I have about twenty years' experience presenting information to decision makers who have varying levels of numeracy and I tell you I would NEVER use that kind of graph alone.

Itās not an uncommon problem among researchers and especially academic researchers to present information to decision makers in a way thatās flawed, misleading and poorly constructed for that audience. Mostly I think itās arrogance and laziness, e.g., āIām a psych prof, I know what Iām doingāIām a scientist, for Christ sakes! I can make charts!ā

You have to present data in a way that is clear and is difficult to misconstrue. Having ratios out of whack with the data points would be a red flag for at least one university president and many faculty members I've worked with.

I told you how I would fix it: zero on the y-axis, perhaps more history. If you want to get into the data, you can pull out that part that you want to focus on, but not without the context in the first chart with the y-axis zero.

I know I'm in the minority on this, but in the real world you need to be crystal clear and not confuse your users any more than necessary. Just because it's common practice to make graphs where the ratios shown are out of whack doesn't make it right.

If you read my earlier comments, you'd see that I agree that there are plenty of times when non-zero y-axes are preferred. This just ain't one of 'em!

My favorite Fox graphs are the ones where the graphs themselves simply fail to accurately plot the changes in numbers that don't fit the narrative they're trying to sell - they do it quite frequently, e.g.:

Nor would I, as I mentioned...

And neither did they.

So, as discussed. The graph needed context. It had context. In the video it was mentioned as an example about data axes, but it was never actually delivered without the useful information that we both agree is necessary.

What I wouldn't do is go off on a specific graph without double-checking that, because that'd be really embarrassing.

(and yes, I know, this is a forum and not work, we shouldn't have to be constantly on our toes, but we should be responsible in our slams)

You are definitely in the minority on this one... yes.

Iād have added a compressed lower bound (so, to zero,but not wasting a ton of whitespace or overflattening the data) and perhaps also included information on the impacts of percentage reductions so that thereās more useful context.

It is very difficult to read more than a slight trend on the zero-axis version of the graph, which makes it largely useless to anybody other than an industry expert. I would not have considered a full zero-axis variant a good use of space, though I could see doing a small thumbnail one that somebody could zoom in on.

Also, and this is important and you know this as well as I do... catastrophic is not zero in this case, and that graph was demonstrating the degree of catastrophe. I could see using the unemployment rates in other recessions (though the Great Depression is an outlier and I wouldn't consider that a great choice) as a lower bound with appropriate indicators.

I see almost no value in the zero axis graph in this case, not as a full page at least. Maybe in an appendix or something. I believe I'm in the majority (or at least plurality) on this one, no?

