Statistics Done Wrong: a guide to spotting and avoiding stats errors


#1



#2

There were a couple of nutbag doctors a while back publishing papers claiming strong links between radiation from Fukushima and increased deaths here in the US. Even from what they put in the paper, their use of statistics was obviously wrong, and obviously deliberately so. I was going to do it right and post a correction on the web, but I discovered that the CDC won't give their data out to just anyone. You have to jump through all kinds of hoops to prove that you're a legitimate researcher, which are really difficult if you don't have an appropriate institutional affiliation or aren't in the right field. Eventually I gave up.


#3

You may well be right, but rebuttal via insinuation is hardly a convincing method. Which papers do you mean, at the very least?


#4

http://www.radiation.org/reading/pubs/HS42_1F.pdf

The errors in their use of stats are completely obvious. Maggie actually covered that here on Boingboing (http://boingboing.net/2011/06/23/fukushima-babies-and.html), and the report has since been thoroughly debunked, but I was looking into it before that. I'm not saying I was initially sure that their conclusion was wrong, simply that I was very suspicious: if doing the stats right would have gotten them the answers they wanted, why would they have bothered to fudge it?


#5

Maybe update the xkcd link to point to the source comic? http://xkcd.com/882/


#6

Ok, irritatingly the Moyer article linked in that old article has been moved to here.

To be honest, as a professional statistician, looking at both the Moyer rebuttal and the Mangano original, both are problematic. Moyer's rebuttal is no rebuttal - one cannot simply draw a straight linear trend line and call that an analysis. While the points he raises are often reasonable, he should seek to justify his assertions.

Mangano's work involves more numbers at least, but looking only at the US Total Deaths result on page 6, I cannot replicate his p-value calculation with the data provided. His numbers are simply wrong, as far as I can see. I get p = 0.253, not p < 0.00001.

Overall, I'm loath to claim deliberate fraud instead of an honest error.


#7

I'm not a statistician, so I wouldn't argue with you. (I do regularly use statistics as an engineer, however.) I actually wasn't referring to the Moyer article; there have been other treatments of it since. The real issue is that they only dealt with the time immediately before and after the accident (and, though it's not a stats mistake, counted the time after the accident but before anything could have reached the States as part of the "after" period). A more nuanced treatment would have considered the variance over the course of a normal year, and the variance in that same period from year to year, both of which they omitted, and both of which strike me as obviously relevant.
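
To make that concrete, here is a rough Python sketch of the kind of check I mean. It is not the method from either paper, and everything in it is an assumption for illustration: the function names `window_change` and `change_z_score`, the idea of using weekly death counts as plain lists, and the choice of a simple z-score. The point is only that the before/after change in the accident year should be compared against the same before/after change in ordinary years, not judged on its own.

```python
from statistics import mean, stdev


def window_change(weekly_deaths, split_week, window):
    """Percent change in mean weekly deaths across a split point.

    Compares the `window` weeks starting at `split_week` with the
    `window` weeks immediately before it. All names are hypothetical.
    """
    before = weekly_deaths[split_week - window:split_week]
    after = weekly_deaths[split_week:split_week + window]
    return 100.0 * (mean(after) - mean(before)) / mean(before)


def change_z_score(event_year, baseline_years, split_week, window):
    """How unusual is the event year's change, given ordinary years?

    `baseline_years` is a list of weekly series from years with no event,
    aligned to the same calendar weeks. The spread of their changes
    captures both seasonal drift and year-to-year variation.
    Needs at least two baseline years for a sample standard deviation.
    """
    baseline = [window_change(y, split_week, window) for y in baseline_years]
    observed = window_change(event_year, split_week, window)
    return (observed - mean(baseline)) / stdev(baseline)
```

With only a handful of baseline years a z-score like this is crude, so treat it as a sanity check rather than a real test. Even so, if the same calendar window routinely swings by several percent in normal years, a similar swing in the accident year tells you very little.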

