
14.4: Inferring from Correlation to Causation

    Unfortunately, additional problems occur with the process of justifying causal claims. If we know that A causes B, we can confidently predict that A is correlated with B because causation logically implies correlation. But we cannot be so confident about the reverse—a correlation doesn't usually provide strong evidence for causation. Consider the correlation between having a runny nose and having watery eyes. We know neither causes the other. Instead, having a cold causes both. So, in this case we say that the association between the runny nose and the watery eyes is spurious, because a third factor is at work causing the correlation.

    In general, when events of kind A are associated (correlated) with events of kind B, three types of explanation might account for the association:

    1. The association is accidental, a coincidence.
    2. A is causing B, or B is causing A.
    3. Something else C is lurking in the background and is causing A and B to be significantly associated. That is, the association is spurious because of lurking factor C.
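
    The third type of explanation can be illustrated with a small simulation. What follows is a minimal Python sketch and not part of the original text; the probabilities (a 20 percent chance of having a cold, and an 80 percent versus 5 percent chance of each symptom) and the names `pearson` and `simulate` are invented for illustration. In the simulated population neither symptom influences the other, yet the two are clearly correlated overall. Among people who all have a cold, the correlation should nearly disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Return (cold, runny_nose, watery_eyes) for n imaginary people.

    Assumed, made-up numbers: the cold is the only thing that raises the
    chance of either symptom; the symptoms never influence each other.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        cold = rng.random() < 0.2
        chance = 0.8 if cold else 0.05
        runny = int(rng.random() < chance)
        watery = int(rng.random() < chance)
        people.append((cold, runny, watery))
    return people


people = simulate()

# Correlation across everyone: the lurking factor (the cold) is ignored.
overall = pearson([r for _, r, _ in people], [w for _, _, w in people])

# Correlation among people who all have a cold: the lurking factor is held fixed.
with_cold = [(r, w) for c, r, w in people if c]
within_cold = pearson([r for r, _ in with_cold], [w for _, w in with_cold])

print(f"overall correlation:     {overall:.2f}")
print(f"correlation within cold: {within_cold:.2f}")
```

    Holding the suspected lurking factor fixed and checking whether the association survives is one common way to probe for a spurious correlation. It only helps when the lurking factor has been identified and measured.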

    Exercise 1

    Why is the frequency of lightning flashes in Iowa in the summer positively correlated with popcorn production in Iowa in the summer? Hint: Your explanation should show why the association is spurious.

    Answer

    The real causal story behind the correlation is that storms cause both the lightning and the rain that helps the corn grow.

    Given an observed correlation, how can you work out the right explanation for it? How can you tell whether A is accidentally correlated with B, or A causes B, or B causes A, or some C is causing both? Here is where scientific sleuthing comes in. You have to think of all the reasonable explanations and then rule them out one by one until only the truth remains. An explanation is ruled out when you collect data inconsistent with it. This entire process of searching out the right explanation is called the scientific method of justifying a causal claim. Let's see it in action.

    There is a strong positive correlation between being overweight and having high blood pressure. The favored explanation of this association is that being overweight puts stress on the heart and makes it pump at a higher pressure. Such an explanation is of type 2 (from the above list). One alternative explanation of the association is that a person's inability to digest salt is to blame. This inability makes the person hungry, which in turn causes overeating. Meanwhile, the inability to digest salt also makes the heart pump faster and thereby helps distribute what little salt there is in the blood. This pumping requires a high blood pressure. This explanation, which is of type 3, is saying that the association is spurious and that a lurking factor, the inability to digest salt, is producing the association.

    When someone suggests a possible explanation—that is, proposes a hypothesis—it should be tested if you want to know whether to accept the explanation as being correct. The test should look at some prediction that can be inferred from the explanation, some prediction that otherwise would not be expected. If the actual findings don’t agree with that prediction, then the explanation is refuted. On the other hand, if the prediction does come out as expected, we hold onto the hypothesis.
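
    The logic of testing a hypothesis by its prediction can be sketched in a few lines of Python. This is an illustration under loudly stated assumptions, not a method from the original text: the data are simulated, every probability is made up, and the names `make_world`, `association`, and `lurking_hypothesis_survives` are hypothetical. The sketch builds two imaginary worlds. In one, being overweight raises blood pressure (type 2). In the other, a lurking factor, here labeled `salt`, raises both (type 3). The lurking-factor hypothesis predicts that the association should vanish among people who share the same salt status, and the code checks that prediction against each world.

```python
import random


def association(pairs):
    """Rate of high blood pressure among the overweight minus the rate among the rest."""
    over = [bp for ow, bp in pairs if ow]
    rest = [bp for ow, bp in pairs if not ow]
    return sum(over) / len(over) - sum(rest) / len(rest)


def make_world(lurking_factor_is_cause, n=40000, seed=1):
    """Simulate (salt, overweight, high_bp) records; all probabilities are invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        salt = rng.random() < 0.3  # hypothetical lurking factor
        if lurking_factor_is_cause:
            # Type 3: the lurking factor raises both; weight has no effect on pressure.
            overweight = rng.random() < (0.7 if salt else 0.2)
            high_bp = rng.random() < (0.6 if salt else 0.1)
        else:
            # Type 2: being overweight raises blood pressure; salt plays no role.
            overweight = rng.random() < 0.35
            high_bp = rng.random() < (0.6 if overweight else 0.1)
        rows.append((salt, overweight, high_bp))
    return rows


def lurking_hypothesis_survives(rows, tolerance=0.1):
    """Check the prediction that the association vanishes within each salt group."""
    for group in (True, False):
        pairs = [(ow, bp) for s, ow, bp in rows if s == group]
        if abs(association(pairs)) > tolerance:
            return False  # findings disagree with the prediction: refuted
    return True  # prediction came out as expected: hold onto the hypothesis


causal_world = make_world(lurking_factor_is_cause=False)
spurious_world = make_world(lurking_factor_is_cause=True)

for name, rows in (("type 2 world", causal_world), ("type 3 world", spurious_world)):
    overall = association([(ow, bp) for _, ow, bp in rows])
    verdict = "retained" if lurking_hypothesis_survives(rows) else "refuted"
    print(f"{name}: overall association {overall:.2f}, lurking-factor hypothesis {verdict}")
```

    Both worlds show an overall association between weight and blood pressure, so the correlation alone cannot decide between the two explanations. Only the more specific prediction separates them.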

    However, it is not always an easy matter to come up with a prediction that can be used to test the hypothesis. Good tests can be hard to find. Suppose, for example, my hypothesis is that the communist government of the U.S.S.R. (Russia, Ukraine, and so on) disintegrated in the early 1990s because it was fated to lose power then. How would you test that? You can’t.

    The process of guessing a possible explanation and then trying to refute it by testing is the dynamic that makes science succeed. The path to scientific knowledge is the path of conjecturing followed by tough testing. There is no other path. (Philosophers of science say that the path to scientific knowledge is more complicated than this, and they are correct, but what we’ve said here is accurate enough for our purposes.)

    Notice the two sources of creativity in this scientific process. First, it takes creativity to think of possible explanations that are worth testing. Second, it takes creativity to figure out a good way to test a suggested explanation.

    Can you create a way to test my explanation of why magicians can pull rabbits from hats? I claim it is because magicians have a special ability to mentally harness magic power by uttering the three words "Alla Kazam Shazam."

    If you think about my explanation, you will soon realize that it is terrible, and that you can test it and refute it. Begin your test by making sure that the magicians and assistants do not have access to any rabbits, and then see how well they do with producing rabbits.

    Exercise 2

    A third explanation of the association or correlation between high blood pressure and being overweight is that stress is to blame. Stress leads to anxiety, which promotes overeating and consequent weight gain. This extra weight causes not only a higher body temperature but also extra blood flow. That flow in turn requires the heart to pump faster and thus to increase the blood pressure. This explanation is of which of the following three types?

    a. coincidence
    b. cause leads to effect
    c. spurious correlation

    Answer

    Answer (b). The explanation is of type 2. The correlation is not spurious; stress is the cause of the correlation. If this were really the correct explanation of the correlation, then tests should show that people who score high in psychological testing for being under stress are more likely both to have high blood pressure and to be overweight than is the average person who is not under stress.


    This page titled 14.4: Inferring from Correlation to Causation is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Bradley H. Dowden.
