If correlation doesn't imply causation, what does?

maggiek · July 23, 2014, 3:27am

oldtaku · July 23, 2014, 6:00am

Wait, wait, wait. Correlation does not prove causation, but it sure as heck does imply it. MH17 is a good example of that.

Jeopardy music while I go read TFA… ding Okay, ‘imply’ in the scientific sense has a much stronger implication (har) than in the colloquial sense.

My attempt to translate: Correlation suggests causation.

Nice article, thanks Maggie. Today I Learned, and how can you top that?

Kimmo · July 23, 2014, 10:07am

Reading the article now, just popped back to drop this bomb:

I find it more than a little mind-bending that my heuristics about how to behave on the basis of statistical evidence are obviously not just a little wrong, but utterly, horribly wrong.

O_O

You have my attention. Go on.

peregrinus_bis · July 23, 2014, 10:35am

Causation is demonstrated by evidence.

Kimmo · July 23, 2014, 10:43am

Phew…

Eyes… glazing…

I’ll have to come back to this. In the meantime, I think I’ll get back to Gödel, Escher, Bach, which seems like taking a break from the heavy going.

William_Holz · July 23, 2014, 10:59am

In a nutshell, correlation DOES imply causation when imply is used colloquially.

…which is how the general public uses it most often.

Honestly it’s a phrase we should get rid of, it’s worse than the theory fiasco.

Thanks again, English language.

anon85905360 · July 23, 2014, 11:11am

What are you implying here?

anon85905360 · July 23, 2014, 11:15am

“Correlation suggests causation.”

What about “correlation implies you should check for causation”?

William_Holz · July 23, 2014, 11:18am

That when the English language (d)evolves in ways that cause perfectly sensible and useful statements to be completely misused by the general populace it literally makes my head explode.

anon62122146 · July 23, 2014, 11:45am

Correlation does certainly suggest a connection, with a degree of strength proportional to the correlation. It doesn’t tell you which way the causality goes, or whether both phenomena are actually caused by (the same) something else, or whether your experiment was designed in such a way as to pre-select connected phenomena to look at. However, in many, if not most, cases involving a sufficiently strong correlation, we know enough to connect it to other things we know that will tell us which of those it is.
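The pre-selection case above can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions: two traits are generated independently, and then only the cases whose sum exceeds a hypothetical cutoff are kept, as a badly designed study might do. The function names, the cutoff, and the sample size are all illustrative and not taken from the article.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def selection_demo(n=20000, cutoff=1.0, seed=0):
    """Return (correlation in everyone, correlation in the pre-selected subset)."""
    rng = random.Random(seed)
    # Two traits drawn independently: neither causes the other.
    pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    r_all = pearson([a for a, _ in pairs], [b for _, b in pairs])
    # Hypothetical study design: only cases with a high combined score are observed.
    kept = [(a, b) for a, b in pairs if a + b > cutoff]
    r_kept = pearson([a for a, _ in kept], [b for _, b in kept])
    return r_all, r_kept


r_all, r_kept = selection_demo()
print(f"everyone: r = {r_all:.2f}; selected subset: r = {r_kept:.2f}")
```

In the full sample the correlation sits near zero, while the selected subset shows a clearly negative one, even though nothing connects the two traits except the way the cases were picked.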

Kimmo · July 23, 2014, 12:06pm

I could care less; that’d make me sad.

catgrin · July 23, 2014, 5:24pm

If you’re looking to show causation, you can use the criteria set down by Bradford Hill. They’re for epidemiology, but do translate to other types of problem.

Using them may suggest causation, and the more that do apply, the stronger the case that causation exists. The wikilink has examples of this process as used in medicine.

Medievalist · July 23, 2014, 6:04pm

I didn’t find anything in the article that showed any difference in my understanding of the word ‘imply’. Can you point it out to me?

In my mind correlation does imply causation, which is why experiments and theories are devised to either make causation explicit or to disprove unsupported implications.

disarticulate · July 23, 2014, 6:09pm

Coincidence does not imply causation.

Correlation does imply causation.

Causation is determined via experiment where one can observe both the causation and resultant correlation.

Correlation -> Causation is often about having a good description of the mechanism.

I always refer to this chart, knowing that pirates enjoy and benefit from global warming:

catgrin · July 23, 2014, 6:26pm

The phrase “correlation doesn’t imply causation” is being used in place of a statistics term where correlation doesn’t equal causation.

A better way to say it is that “correlation alone doesn’t imply causation”. That’s because there are correlations where no causation exists, and (like you say) testing or simply further examination may bear out the lack of causation.

Correlation alone doesn’t necessarily imply causation. Here are some examples why not. As humans we love to see patterns, and that can result in a false leap of logic. There are different types of correlations, with different strengths, and stronger correlations are more likely to imply a causation, but more information is still needed to prove it. It’s best to hold off on assuming causation until you have a fuller picture.

andy_hilmer · July 23, 2014, 6:53pm

That’s the best form of the statement I’ve seen so far.

Richard_Kirk · July 23, 2014, 6:53pm

It’s not about chance. Your data could have fallen out that way first time, but if you carry on repeating the experiment then it becomes increasingly unlikely to fall out the same way. We can even formulate laws about quantum things which predict average behaviours of things that are innately unpredictable. No, here we are talking about repeatable experiments which can lead us to the wrong conclusions.

Look at the graph on the right hand side of this article, and the bit of text with it…

That explains the voting problem.

A lot of the apparent correlations of A & B, particularly the funny ones, are generally because A and B are both varying with time. That is the mobile phones and Greek currency argument.
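The "both varying with time" point can be sketched in a few lines. This is an illustrative simulation, not data from the article: series `a` and `b` are hypothetical quantities that each drift upward over time with independent noise, and have no link to each other. Comparing their levels shows a strong correlation; comparing their step-to-step changes, which removes the shared trend, does not.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def trend_demo(n=1000, noise=5.0, seed=0):
    """Return (correlation of levels, correlation of step-to-step changes)."""
    rng = random.Random(seed)
    # Two unrelated series; the only thing they share is that both grow with t.
    a = [1.0 * t + rng.gauss(0, noise) for t in range(n)]
    b = [2.0 * t + rng.gauss(0, noise) for t in range(n)]
    r_levels = pearson(a, b)
    # First differences strip out the common time trend.
    da = [a[i + 1] - a[i] for i in range(n - 1)]
    db = [b[i + 1] - b[i] for i in range(n - 1)]
    r_changes = pearson(da, db)
    return r_levels, r_changes


r_levels, r_changes = trend_demo()
print(f"levels: r = {r_levels:.2f}; changes: r = {r_changes:.2f}")
```

Differencing is only one simple way to take time out of the picture; it is used here because it needs no extra libraries.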

Next, there are a huge lot of possibilities where A & B are linked by a whole raft of causes and intermediate variables. And the final catch-all is that Descartes’ demon arranged for your data to come out like that, for no particular reason that you shall ever know, other than he’s a bit of a dick.

That’s pretty much it. You can understand it all without going up a hat size.

ejeffrey · July 23, 2014, 7:59pm

The math is a bit complicated, but if you scroll down to the “tobacco causes lung cancer” example, it makes some intuitive sense. In that example, if your only three random variables are “smokes tobacco”, “gets cancer”, and a hidden variable “unknown genetic factor” which could potentially cause both smoking and cancer, you can’t prove that smoking causes cancer.

So in a simplified example you add an extra variable that represents an intermediate observable variable, “tar in lungs”, which is potentially caused by smoking, potentially causes cancer, but which you believe is not (directly) influenced by the unknown genetic factor. In this particular case, that may or may not be a valid assumption, but this is also a simplified model of a real causal network. If you do that, and you measure all the correlations between the observable variables (smoking, tar, cancer), you can now calculate, or at least bound, exactly how much smoking causes cancer independently of any genetic factor.
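The calculation described above corresponds to what the causal inference literature calls the front-door adjustment. Here is a minimal sketch of it on a toy model. Every probability below is invented purely for illustration, and the function names are hypothetical. The code builds the distribution an observer would see over (smoking, tar, cancer) with the gene hidden, then recovers the effect of smoking from those observables alone and compares it with the answer computed from the full model.

```python
from itertools import product

# Made-up numbers for a toy model (assumptions, not real epidemiology).
P_U = {0: 0.5, 1: 0.5}                 # hidden genetic factor
P_S_GIVEN_U = {0: 0.2, 1: 0.8}         # P(smokes = 1 | gene)
P_T_GIVEN_S = {0: 0.05, 1: 0.9}        # P(tar = 1 | smokes); gene has no direct effect
P_C_GIVEN_TU = {(0, 0): 0.05, (0, 1): 0.3,
                (1, 0): 0.2, (1, 1): 0.6}  # P(cancer = 1 | tar, gene)


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes value x."""
    return p if x == 1 else 1.0 - p


def observed_joint():
    """Joint P(s, t, c) with the hidden gene summed out."""
    joint = {}
    for u, s, t, c in product((0, 1), repeat=4):
        p = (P_U[u] * bern(P_S_GIVEN_U[u], s)
             * bern(P_T_GIVEN_S[s], t) * bern(P_C_GIVEN_TU[(t, u)], c))
        joint[(s, t, c)] = joint.get((s, t, c), 0.0) + p
    return joint


def prob(joint, **fixed):
    """Marginal probability of the given values, e.g. prob(joint, s=1, c=1)."""
    idx = {"s": 0, "t": 1, "c": 2}
    return sum(p for key, p in joint.items()
               if all(key[idx[name]] == val for name, val in fixed.items()))


def naive(joint, s):
    """Plain conditional P(cancer | smokes = s), which the gene confounds."""
    return prob(joint, s=s, c=1) / prob(joint, s=s)


def front_door(joint, s):
    """P(cancer | do(smokes = s)) using only the observable variables."""
    total = 0.0
    for t in (0, 1):
        p_t_given_s = prob(joint, s=s, t=t) / prob(joint, s=s)
        inner = sum(prob(joint, s=s2, t=t, c=1) / prob(joint, s=s2, t=t)
                    * prob(joint, s=s2) for s2 in (0, 1))
        total += p_t_given_s * inner
    return total


def truth(s):
    """P(cancer | do(smokes = s)) computed with access to the hidden gene."""
    return sum(P_U[u] * bern(P_T_GIVEN_S[s], t) * P_C_GIVEN_TU[(t, u)]
               for u in (0, 1) for t in (0, 1))


joint = observed_joint()
print(naive(joint, 1), front_door(joint, 1), truth(1))
```

With these invented numbers the naive conditional for smokers comes out near 0.49, while the front-door estimate and the true interventional value agree at about 0.38. The agreement depends entirely on the assumption that the gene does not influence tar directly.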

Of course, you never exclude all possible models, which isn’t really the point. The point is that this is a technique to formally reason about how much influence confounding variables can have when you have a partial model for how they would work.

Medievalist · July 23, 2014, 8:26pm

I don’t think that it’s valid for anyone to equate “correlation doesn’t imply causation” with “correlation doesn’t equal causation”. In my native language, American English, those are two different phrases, meaning two different things.

Implications aren’t necessarily true - they are possibilities or probabilities perhaps, but mostly they are just ideas that exist subjectively in the minds of observers. Different people might see entirely different implications given the same data set, due to factors like education or culture. To imply something is to suggest something indirectly, but even if a thing is suggested directly that does not make it true.

I’m going to continue to say that correlation doesn’t equal causation, and that good scientists are not necessarily good wordsmiths.

catgrin · July 23, 2014, 8:58pm

I wholly agree. American English is also my native tongue, and it has its foibles.

I didn’t say “Correlation doesn’t imply causation.” (That’s the old one.)
I said “Correlation alone doesn’t imply causation.”

The two statements are very different. Correlation may be a component to discovering causation, so the two can be related. It can, and often does, point out causality. The word “alone” provides a warning that just correlation isn’t enough to assume causation, and we need to be wary of false patterns. If correlation exists, we need to look for additional evidence of causality.

On the flip side: Correlation is almost guaranteed to exist if causality exists. So if you can prove causality, you may be able to find a correlation you were missing. Discoveries have been made this way - kind of in reverse.

The two concepts “correlation” and “causation” aren’t equal, but I think my statement gives a clearer picture of their relationship.
