# Statistics Done Wrong: The Woefully Complete Guide

“fewer than 3% of articles” in prestigious journals Science and Nature “calculate statistical power before starting their study.”

But what is the statistical power of that statistic?

Gord Doctorow : Why link to Amazon instead of publisher (http://shop.oreilly.com/product/9781593276201.do)?

Did you know over 73% of statistics are just made up? It’s true!

(Also, I think it’s *arrant* nonsense)

This would pair so nicely with Darrell Huff’s “How to Lie with Statistics.” Required reading for a young political consultant in the late 1980s, it was the definitive instruction manual for how to present data when you need to influence…

Of course Huff published his in 1954, and we don’t like remembering things that far back.

“There are three kinds of lies: lies, damned lies, and statistics.”

I read a 1990s paper with a failed 2x2 clinical trial where n=800. It’s true! I seen it!

Basically, with a scenario like that the treatment would have to be nearly a miracle cure to be statistically significant.

I don’t think this is the result of a general lack of statistical literacy; I think it has more to do with general apathy and constant cutting of corners. If someone is inclined to ignore important strategic decisions, one more math course won’t make any difference in their management style. Mostly it’s a question of doing the project within a certain budget by skiving on the power calculations.

“There are three kinds of lies: lies, damned lies, and Lamaze”

Oh I do! I read that book in middle school.

Looks like O’Reilly is the US distributor for No Starch Press (“the finest in geek entertainment” – I love it!).

I just ordered it straight from them, and I got the paperback plus a DRM-free ebook. And I got a nice discount with a coupon code that I found with a quick Google search.

I think every book should come like this. Read the paper edition, then pass it on to a friend while keeping the ebook for reference. Brilliant!

Why oh why do high schools and colleges teach so much calculus and so little statistics?!

Probably because advanced college math courses treat statistics as a form of calculus, rendering statistics a very esoteric discipline. Which is a shame, because we could be teaching statistics in third grade. After all, much of statistics was figured out literally by bean counters, not theorists. Even today, statisticians are always discovering the theoretical basis of what’s been known empirically for decades.

I’ve known people who had to sweat through the full-blown graduate-level statistics classes that were supposedly essential, and they never, ever looked at any of it again.

Pretty good because it’s a discrete value rather than a continuous range, and discrete values are the best endpoint because they can have definitions like “alive” or “dead,” but not “mostly dead” except in “The Princess Bride.”

The t- and F- tests *are* a form of calculus. It’s just that William Sealy Gosset (AKA “Student”) calculated tables when he designed the *t*-tests so that Guinness brewers only needed to look up the best fit numbers and didn’t have to do the underlying complicated mathematics. Just imagine, Student’s *t*-test could be even more hideous to calculate manually than it is …

It’s a descriptive statistic, not an inferential statistic, so it doesn’t have statistical power.

In terms of power calculations, the inferential tests for discrete data have *less* statistical power than those for interval and ratio (continuous) data and therefore need *larger* sample sizes to achieve statistical significance. Hence the kludge of turning ordinal data (dimensionless values with a rank order like “Really Bad”; “Bad”; “OK”; “Good”; “Really Good”) into equal ranks (1,2,3,4,5) in order to use the “more powerful” tests.
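For anyone who wants to see the power gap rather than take it on faith, here is a rough simulation sketch. It covers only one narrow case: the same simulated measurements are analysed once as continuous values and once after being cut down to a yes/no outcome. Everything in it is an assumption made for illustration and not taken from any study mentioned in this thread: the effect size of 0.5 standard deviations, 50 subjects per group, the cutoff halfway between the group means, and the use of simple z-tests with a known standard deviation.

```python
import random
from statistics import NormalDist, mean

NORMAL = NormalDist()


def p_continuous(a, b, sigma=1.0):
    """Two-sided z-test on the raw measurements (sigma assumed known)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NORMAL.cdf(abs(z)))


def p_dichotomised(a, b, cutoff):
    """Two-sided two-proportion z-test after reducing values to above/below cutoff."""
    n_a, n_b = len(a), len(b)
    hits_a = sum(x > cutoff for x in a)
    hits_b = sum(x > cutoff for x in b)
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:  # everyone landed on the same side of the cutoff
        return 1.0
    z = (hits_a / n_a - hits_b / n_b) / se
    return 2 * (1 - NORMAL.cdf(abs(z)))


def estimate_power(effect=0.5, n=50, trials=2000, alpha=0.05, seed=1):
    """Return (power on continuous data, power on dichotomised data)."""
    rng = random.Random(seed)
    sig_cont = sig_bin = 0
    for _ in range(trials):
        control = [rng.gauss(0, 1) for _ in range(n)]
        treated = [rng.gauss(effect, 1) for _ in range(n)]
        sig_cont += p_continuous(treated, control) < alpha
        # Hypothetical cutoff: halfway between the two true group means.
        sig_bin += p_dichotomised(treated, control, cutoff=effect / 2) < alpha
    return sig_cont / trials, sig_bin / trials


if __name__ == "__main__":
    power_cont, power_bin = estimate_power()
    print(f"continuous: {power_cont:.2f}  dichotomised: {power_bin:.2f}")
```

Under these made-up settings the dichotomised analysis should detect the effect noticeably less often, which is the same thing as needing a larger sample to reach the same power. It is a sketch of one scenario, not a general proof.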

Statistical significance itself *is* overrated. Significant at p=0.05 *also* means that, when there is no real effect at all, random sampling will still throw up a weird result like this about 1 time in 20, and we’re barking up the wrong tree. Replication studies should fix that problem, but negative replication studies rarely get published …
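The 1-in-20 figure is easy to check with a quick simulation. The sketch below draws two groups from the *same* distribution, so there is no real effect to find, runs a two-sided test, and counts how often the result comes out "significant" at 0.05. The group size, the number of trials, and the choice of a z-test with known standard deviation are all assumptions picked to keep the example short.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided p-value from a z-test for a difference in means (known sigma)."""
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = sigma * (1 / n_a + 1 / n_b) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(trials=4000, n=30, alpha=0.05, seed=1):
    """Fraction of no-effect experiments that still come out 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both groups come from the same distribution: the null is true.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"share of null experiments with p < 0.05: {false_positive_rate():.3f}")
```

The printed share should land close to 0.05. Note what this does and does not show: it is the rate of false alarms when nothing is going on, which is not the same as the chance that any particular significant finding is wrong.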

[quote="aeon, post:15, topic:58070"]

The t- and F- tests are a form of calculus. It’s just that William Sealy Gosset (AKA “Student”) calculated tables when he designed the t-tests so that Guinness brewers only needed to look up the best fit numbers and didn’t have to do the underlying complicated mathematics. Just imagine, Student’s t-test could be even more hideous to calculate manually than it is … [/quote]

But statistics courses walk the students through the actual formulas used to create the tables, which is a tall cold glass of who the hell cares. There’s no reason not to be teaching statistics to third graders.

Oh, if only I had a dollar for every software salesperson doing the statistical Rumpelstiltskin sales pitch promising to turn straw into gold…

If you’re doing a University level Stats course for a discipline that uses a lot of Statistics, that’s surely a good thing? Otherwise, yeah, not so much.

My kids seem to be doing a lot more on descriptive statistics than I recall doing at school and not just in mathematics lessons either. I never had to do a box-and-whisker plot until I hit University, but my son covered that age 13~ish. They can elect to do either calculus or inferential stats in higher maths in the last couple of years of High School. But maybe that’s just a New Zealand thing?

A **Galton** board? I have to question your ideological correctness, comrade. Galton, a member of the oppressor class, was well-known for his counter-revolutionary belief in genetic differences in intellect. Trotsky (PBUH) would have had him purged, even if he hadn’t been a known cousin of the notorious anti-Lysenkoist Darwin. One might almost question your belief in the inevitable rightness of social equality, or even suspect you are some species of hereditarian deviationist.

You simply can’t advocate for good statistics as a socialist. It leads to thought, which leads to questions, which leads to realizing that the whole enterprise is founded on lies.

The truth is that people are not equal, and they differ in mentality due primarily to genetics rather than environment, and this will not be changed by any sort of indoctrination or infiltration of institutions. (Gord is a notorious Trotskyite entryist, which is to say a fifth columnist and communist infiltrator - see this WP page on the Canadian Socialist League)

Oh it’s worse than that, Galton coined the word “eugenics.” Although he was one of history’s great creative minds, he was painfully aware that his intelligence left him on the edge of madness. Eugenics was more of his hobby horse as he got elderly.

He invented a good bit of modern statistics, which is generally not the creation of mathematicians. Mathematicians are not people that go around having deep insights into physical reality or they’d be physicists.

You simply can’t advocate for good statistics as a socialist.

Selection bias if I ever saw one.

people are not equal, and they differ in mentality due primarily to genetics rather than environment

Doesnâ€™t it strike you as contradictory that as an advocate of a radical meritocracy you assume a diametrically opposed concept to govern the intelligence of individuals?