Statistics Done Wrong: The Woefully Complete Guide


#1

[Read the post]


#2

“fewer than 3% of articles” in prestigious journals Science and Nature “calculate statistical power before starting their study.”

But what is the statistical power of that statistic?


#3

Gord Doctorow: Why link to Amazon instead of the publisher (http://shop.oreilly.com/product/9781593276201.do)?


#4

Did you know over 73% of statistics are just made up? It’s true!

(Also, I think it’s arrant nonsense)


#5

This would pair so nicely with Darrell Huff’s “How to Lie with Statistics.” Required reading for a young political consultant in the late 1980s, it was the definitive instruction manual for how to present data when you need to influence…

Of course Huff published his in 1954, and we don’t like remembering things that far back.


#7

“There are three kinds of lies: lies, damned lies, and statistics.”


#8

I read a 1990s paper with a failed 2x2 clinical trial where n=800. It’s true! I seen it!

Basically, with a scenario like that the treatment would have to be nearly a miracle cure to be statistically significant.

I don’t think this is the result of a general lack of statistical literacy; I think it has more to do with general apathy and constant cutting of corners. If someone is inclined to ignore important strategic decisions, one more math course won’t make any difference in their management style. Mostly it’s a question of doing the project within a certain budget by skiving on the power calculations.


#9

“There are three kinds of lies: lies, damned lies, and Lamaze”


#10

Oh I do! I read that book in middle school.


#11

Looks like O’Reilly is the US distributor for No Starch Press (“the finest in geek entertainment” – I love it!).
I just ordered it straight from them, and I got the paperback plus a DRM-free ebook. And I got a nice discount with a coupon code that I found with a quick Google search.

I think every book should come like this. Read the paper edition, then pass it on to a friend while keeping the ebook for reference. Brilliant!


#12

Why oh why do high schools and colleges teach so much calculus and so little statistics?!


#13

Probably because advanced college math courses treat statistics as a form of calculus, rendering statistics a very esoteric discipline. Which is a shame, because we could be teaching statistics in third grade. After all, much of statistics was figured out literally by bean counters, not theorists. Even today, statisticians are always discovering the theoretical basis of what’s been known empirically for decades.

I’ve known people who had to sweat through the full-blown graduate-level statistics classes that were supposedly essential, and they never, ever look at it again.


#14

Pretty good, because it’s a discrete value rather than a continuous range, and discrete values are the best endpoint because they can have definitions like “alive” or “dead,” but not “mostly dead” except in “The Princess Bride.”


#15

The t- and F-tests are a form of calculus. It’s just that William Sealy Gosset (AKA “Student”) calculated tables when he designed the *t*-tests so that Guinness brewers only needed to look up the best fit numbers and didn’t have to do the underlying complicated mathematics. Just imagine, Student’s *t*-test could be even more hideous to calculate manually than it is … :frowning:

It’s a descriptive statistic, not an inferential statistic, so it doesn’t have statistical power.

In terms of power calculations, the inferential tests for discrete data have less statistical power than those for interval and ratio (continuous) data and therefore need larger sample sizes to achieve statistical significance. Hence the kludge of turning ordinal data (dimensionless values with a rank order like “Really Bad”;“Bad”;“OK”;“Good”;“Really Good”) into equal ranks (1,2,3,4,5) in order to use the “more powerful” tests.
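One way to see that gap for yourself is a quick simulation. This is only a sketch under my own assumptions: two equal groups of normally distributed scores with a known standard deviation of 1, a z-test on the means versus a pooled two-proportion z-test after chopping the same scores into “above/below a cutoff.” The function names, sample size and effect size are made up for illustration.

```python
import random
from statistics import NormalDist, mean

# Two-sided critical value for alpha = 0.05 (about 1.96).
Z_CRIT = NormalDist().inv_cdf(0.975)


def z_means(a, b, sd=1.0):
    """z statistic for a difference in means, equal group sizes, known sd."""
    n = len(a)
    return (mean(b) - mean(a)) / (sd * (2 / n) ** 0.5)


def z_props(a, b, cut):
    """Pooled two-proportion z statistic after dichotomizing at `cut`."""
    n = len(a)
    pa = sum(x > cut for x in a) / n
    pb = sum(x > cut for x in b) / n
    pooled = (pa + pb) / 2  # valid because both groups have size n
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    return 0.0 if se == 0 else (pb - pa) / se


def estimate_power(trials=2000, n=50, effect=0.5, seed=1):
    """Fraction of simulated studies that reach significance, for each test."""
    rng = random.Random(seed)
    cont = dich = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]       # control group
        b = [rng.gauss(effect, 1) for _ in range(n)]  # treated group
        cont += abs(z_means(a, b)) > Z_CRIT
        dich += abs(z_props(a, b, cut=effect / 2)) > Z_CRIT
    return cont / trials, dich / trials


if __name__ == "__main__":
    continuous, dichotomized = estimate_power()
    print(f"power on continuous scores: {continuous:.2f}")
    print(f"power after dichotomizing:  {dichotomized:.2f}")
```

With the same simulated patients, the test on the raw scores should reach significance noticeably more often than the test on the yes/no version, which is the information you throw away when you dichotomize.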

Statistical significance itself is overrated. Testing at p=0.05 also means that when there is no real effect, random sampling will still throw up a “significant” result about 1 time in 20, and we end up barking up the wrong tree. Replication studies should fix that problem, but negative replication studies rarely get published …
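A rough way to see where the 1 in 20 comes from: simulate lots of studies where there is truly no effect and count how often a test at the 0.05 level comes out “significant” anyway. This is a minimal sketch, assuming a two-sample z-test with a known standard deviation of 1 (my simplification, not a *t*-test); the function name and defaults are hypothetical.

```python
import random
from statistics import NormalDist, mean


def false_positive_rate(trials=4000, n=30, alpha=0.05, seed=1):
    """Share of null experiments (no real difference) declared significant."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    hits = 0
    for _ in range(trials):
        # Both groups are drawn from the same distribution: the null is true.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        z = (mean(a) - mean(b)) / (2 / n) ** 0.5  # known sd = 1 in each group
        if abs(z) > z_crit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"false positive rate: {false_positive_rate():.3f}")
```

The rate should land close to 0.05, which is a statement about how often the test cries wolf when nothing is there, not about how likely any one published result is to be wrong.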


#16

[quote=“aeon, post:15, topic:58070”]
The t- and F-tests are a form of calculus. It’s just that William Sealy Gosset (AKA “Student”) calculated tables when he designed the t-tests so that Guinness brewers only needed to look up the best fit numbers and didn’t have to do the underlying complicated mathematics. Just imagine, Student’s t-test could be even more hideous to calculate manually than it is … [/quote]
But statistics courses walk the students through the actual formulas used to create the tables, which is a tall cold glass of who the hell cares. There’s no reason not to be teaching statistics to third graders.

Oh, if only I had a dollar for every software salesperson doing the statistical Rumpelstiltskin sales pitch promising to turn straw into gold…


#17

If you’re doing a University level Stats course for a discipline that uses a lot of Statistics that’s surely a good thing? Otherwise, yeah, not so much.

My kids seem to be doing a lot more on descriptive statistics than I recall doing at school and not just in mathematics lessons either. I never had to do a box-and-whisker plot until I hit University, but my son covered that age 13~ish. They can elect to do either calculus or inferential stats in higher maths in the last couple of years of High School. But maybe that’s just a New Zealand thing?


#18

A Galton board? I have to question your ideological correctness, comrade. Galton, a member of the oppressor class, was well-known for his counter-revolutionary belief in genetic differences in intellect. Trotsky (PBUH) would have had him purged, even if he hadn’t been a known cousin of the notorious anti-Lysenkoist Darwin. One might almost question your belief in the inevitable rightness of social equality, or even suspect you are some species of hereditarian deviationist.


#19

You simply can’t advocate for good statistics as a socialist. It leads to thought, which leads to questions, which leads to realizing that the whole enterprise is founded on lies.

The truth is that people are not equal, and they differ in mentality due primarily to genetics rather than environment, and this will not be changed by any sort of indoctrination or infiltration of institutions. (Gord is a notorious Trotskyite entryist, which is to say a fifth columnist and communist infiltrator - see this WP page on the Canadian Socialist League)


#20

Oh, it’s worse than that: Galton coined the word “eugenics.” Although he was one of history’s great creative minds, he was painfully aware that his intelligence left him on the edge of madness. Eugenics became more of his hobby horse as he got elderly.

He invented a good bit of modern statistics, which is generally not the creation of mathematicians. Mathematicians are not people that go around having deep insights into physical reality or they’d be physicists.


#21

[quote]
You simply can’t advocate for good statistics as a socialist. [/quote]

Selection bias if I ever saw one.

[quote]
people are not equal, and they differ in mentality due primarily to genetics rather than environment [/quote]

Doesn’t it strike you as contradictory that as an advocate of a radical meritocracy you assume a diametrically opposed concept to govern the intelligence of individuals?