
20.4: Testing Hypotheses


    Ok, so we have a hypothesis, and it has yielded some predictions. Now they need to be tested. But what does it mean to test a hypothesis? A theory or hypothesis is testable just in case some sort of objective, empirical test could provide evidence that it is either true or false.

    When we formulate tests, we are figuring out the best way to check whether the predictions we made are true. In an experimental test, we can bring about the test condition in a laboratory or some other controlled setting. The experimental sciences rely heavily on such tests, but in some sciences (e.g., astronomy, meteorology) they are difficult to devise. We can obviously still test hypotheses in these areas, as we have made scientific advances in these fields, but we must instead design our tests around observing naturally occurring phenomena.

    Whenever possible, the way we want to test is with a controlled experiment. This involves collecting data on two representative groups – one group, which serves as a baseline, is the control group, and the other, that is affected in some way that lets us test our hypothesis, is the experimental group. The end result is we can compare the data collected from the control group to that of the experimental group and measure the impact the change in treatment had. For example, in a clinical drug trial, researchers will give one group a new drug, give the other group no drug (or the same drug they’ve already been taking), and see if those who get the new drug have improved health outcomes.
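    The comparison at the heart of a controlled experiment can be sketched in a few lines of Python. The numbers below are invented purely for illustration; in a real trial the groups would be much larger and the analysis would include a significance test.

```python
# Hypothetical clinical-trial outcomes: health scores after treatment.
# All data values here are made up for illustration.
control = [62, 58, 65, 60, 63, 59, 61, 64]       # baseline group (no new drug)
experimental = [70, 66, 72, 68, 71, 67, 69, 73]  # group given the new drug

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

# Comparing the two groups: the difference in average outcome is a
# rough estimate of the impact the change in treatment had.
effect = mean(experimental) - mean(control)
print(f"control mean:        {mean(control):.1f}")
print(f"experimental mean:   {mean(experimental):.1f}")
print(f"estimated treatment effect: {effect:.1f}")
```

    Because the control group serves as a baseline, any systematic difference between the two averages can (with appropriate caveats about sample size and confounding factors) be attributed to the treatment itself.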

    Not all tests are equally good at evaluating hypotheses. In general, a difficult or severe test of a theory is much better than a weak or easy test. The more unlikely a prediction seems to be before we check it, the better the test it provides of a theory. For example, if your local meteorologist has a theory that predicts it will rain in Seattle sometime this coming April, we won’t be bowled over if this comes true (we all knew that it would rain at least a little sometime during April, long before we ever heard the meteorologist’s theory). But suppose their theory predicts that it will rain between nine and nine and a half inches in Seattle between noon and 1 pm on April 7. If this happens, we are surprised, and take it to provide strong—though not conclusive—support for the theory: it must have something going for it to get something like this right. Other things being equal, predictions that are extremely definite and precise provide a better test of a theory than predictions that are indefinite or vague.

    What we really want, though, is a falsifiable hypothesis. A theory or hypothesis is falsifiable just in case some sort of objective, empirical test could show that it is false. The reason this is preferable is that, if we can run a test that could show our view is false, and the test fails to show that it is false, then we have good reason for thinking the view is true. The same isn’t the case for tests that only yield positive evidence for a view. Positive evidence is pretty easy to come by. No matter how much positive evidence we have, we need to keep in mind that there always might be negative or disconfirming evidence out there. We can even find positive evidence for hypotheses we know are false. There is positive evidence that Santa Claus is real, for instance (there he is, at the mall).

    One especially valuable kind of falsifiable test is a critical test. A critical test is one where two theories are pitted against each other: each makes a different prediction about the same case, so they cannot both come out right. Critical tests are not particularly common, but what is great about them is that at the end of the process we have strong reason to support one hypothesis and strong reason to reject the other.

    It is important to keep in mind that theories can’t be conclusively falsified, because when predictions don’t turn out to be correct, the result can be pinned on one of the auxiliary hypotheses. An auxiliary hypothesis is a background assumption used in testing a theory or hypothesis of interest. Every test of any interesting scientific theory involves auxiliary hypotheses (e.g., about the workings of the measuring devices one employs, the presence or absence of various disturbing influences, and so on). This is not to say that we should never set aside a hypothesis or conclude that it is false. After a failed test, we should go back and check all the auxiliary hypotheses. If they seem to be reasonably supported (the equipment is in good working order, etc.), we will have reason to discard the hypothesis.


    This page titled 20.4: Testing Hypotheses is shared under a CC BY-NC 4.0 license and was authored, remixed, and/or curated by Jason Southworth & Chris Swoyer via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.