Humanities LibreTexts

13.1.5: Statistical Significance


    Frequently, the conclusions of inductive generalizations are simple statistical claims. Our premise is "x percent of the sample is la-de-da." From this we conclude, "The same percent of the population is, too." When the argument is inductively strong, statisticians say the percent is statistically significant because this statistic is one that very probably is not due to chance. The number need not be significant in the sense of being important; that is the non-technical sense of the word significant.

    Suppose you are interested in determining the percentage of left-handers in the world, and you aren’t willing to trust the results of other people who have guessed at this percentage. Unless you have some deep insight into the genetic basis of left-handedness, you will have to obtain your answer from sampling. You will have to take a sample and use the fraction of people in your sample who are left-handed as your guess of the value of the target number. The target number is what statisticians call a parameter. The number you use to guess the parameter is called the statistic. Your statistic will have to meet higher standards the more confident you must be that it is a reliable estimate of the parameter.

    I once told my seven-year-old son Joshua that he was unusual because he was left-handed. That surprised him, so he decided to check out whether I was correct. In the sophisticated terminology of mathematical statistics, we'd say Joshua's goal was to determine whether a certain parameter, the percentage of left-handers in the whole world, is much less than 50 percent. Here is what Joshua did to acquire a statistic to use to estimate the parameter. He said, "You're left-handed, Dad. Mom and my little sister aren't. That is two and two." What Joshua had just done, more or less, was to take a sample of four from the vast population of the Earth, discover that two of the four are left-handed, and then calculate the statistic of 50 percent as his guess of the parameter. A statistician would say that Joshua's statistic is not significant because the sample is too small. If Joshua were to take a larger sample, the resultant statistical claim would be more believable.
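    Joshua's arithmetic can be written out as a minimal sketch. The function name `sample_statistic` is our own invention for illustration. The sketch also lists every value the statistic could possibly take with a sample of four, which is one way to see why so small a sample is such a blunt instrument.

```python
from fractions import Fraction


def sample_statistic(left_handed: int, sample_size: int) -> Fraction:
    """Fraction of the sample that is left-handed: the guess at the parameter."""
    return Fraction(left_handed, sample_size)


# Joshua's family sample: Dad and Joshua are left-handed, Mom and his sister are not.
family = sample_statistic(2, 4)
print(f"statistic = {float(family):.0%}")

# With only four people in the sample, the statistic can take just five values.
possible = [float(Fraction(k, 4)) for k in range(5)]
print(possible)
```

    A sample of four can only ever report 0, 25, 50, 75, or 100 percent, so it cannot come close to a parameter that lies between those values.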

    So Joshua set out to get a bigger sample. He asked all the children in his class at school whether they were left-handed. Two out of twenty-two were. He also went around the neighborhood asking whomever he could. The new result from home, school, and neighborhood was seven left-handers out of thirty-seven. This statistic is more apt to be significant, and it is much less than 50 percent. The moral here is that the bigger the sample size, the more confident you can be that the calculated statistic is statistically significant. The more sampling, the less likely that the result is due to chance. Patterns that appear in small samples might disappear as the sample size grows; they might be shown to be coincidental. Significant patterns and significant statistics are those that are unlikely to be accidental or coincidental; they are likely to be found to hold true on examination of more of the target population.
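    To see the moral at work, here is a small simulation sketch. It assumes, purely for illustration, a make-believe population in which exactly 10 percent of people are left-handed; that figure, the name `ASSUMED_TRUE_RATE`, and the number of trials are our assumptions, not data from Joshua's survey. The simulation draws many random samples of four and many of thirty-seven and compares how widely the resulting statistics scatter.

```python
import random
import statistics

ASSUMED_TRUE_RATE = 0.10  # hypothetical parameter, chosen only for illustration


def simulated_statistics(sample_size, trials, rng):
    """Draw `trials` random samples and return the left-handed fraction of each."""
    results = []
    for _ in range(trials):
        lefties = sum(rng.random() < ASSUMED_TRUE_RATE for _ in range(sample_size))
        results.append(lefties / sample_size)
    return results


rng = random.Random(0)  # fixed seed so the run is repeatable
small = simulated_statistics(4, 2000, rng)
large = simulated_statistics(37, 2000, rng)

# The standard deviation measures how far the statistics stray from sample to sample.
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(f"n=4:  spread of the statistic = {spread_small:.3f}")
print(f"n=37: spread of the statistic = {spread_large:.3f}")
```

    The statistics from the samples of thirty-seven cluster much more tightly around the assumed parameter than those from the samples of four, so a surprising result from the larger sample is harder to blame on chance.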

    We still haven't answered the question of whether Joshua's statistic of 7/37 is statistically significant. Is it? It definitely is a better guess than 2/4, but to compute whether it is significant requires some sophisticated reasoning involving complex formulas about margins of error and levels of confidence, which we won't pursue here. We can, however, sketch three features of the answer.
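    For readers who want a taste of the computation, here is a rough sketch of one standard approach, an exact binomial calculation. It asks: if the parameter really were 50 percent, how often would a random sample turn up this few left-handers? The function name `chance_of_at_most` is ours, and the calculation assumes the people were drawn at random, which Joshua's family, classmates, and neighbors were not, so treat the numbers as illustrative only.

```python
from math import comb


def chance_of_at_most(left_handed, sample_size, rate=0.5):
    """Probability that a random sample has `left_handed` or fewer left-handers
    when the true rate in the population is `rate`."""
    return sum(
        comb(sample_size, k) * rate**k * (1 - rate) ** (sample_size - k)
        for k in range(left_handed + 1)
    )


family = chance_of_at_most(2, 4)      # Joshua's first sample
combined = chance_of_at_most(7, 37)   # home, school, and neighborhood together
print(f"2 of 4:  {family:.4f}")
print(f"7 of 37: {combined:.6f}")
```

    The first probability is about 0.69, so two left-handers out of four is no evidence at all against a parameter of 50 percent. The second is below 0.0001, so under the random-sampling assumption a result like 7/37 would almost never occur by chance if half the world were left-handed.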

    First, the margin of error. We need to decide just how accurate we want our guess to be. Can we be satisfied with an accuracy of plus or minus 10 percent, or do we need a smaller margin, say plus or minus 1 percent? Second, the confidence level. Are we willing to be only 95 percent sure that we have the right answer, even allowing for the margin of error? Or must we be 99 percent sure? All other things being equal, the more confident we need to be, the harder it is for the statistics we have gathered to count as significant. Third, the bias of the sampling. Was the sample random? Was it diverse? Population size is not normally something that needs to be taken into account, provided the population is large compared to the sample size.
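    The first two features trade off against each other, and a short sketch can make the trade-off concrete. It uses the common normal-approximation formula for the margin of error of a proportion, which is only rough for a sample as small as thirty-seven and which presumes unbiased random sampling. The function names are ours, chosen for illustration. Notice that population size appears nowhere in the formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_value(confidence):
    """Two-sided normal critical value, about 1.96 for 95 percent confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(left_handed, sample_size, confidence):
    """Approximate margin of error for the sample fraction (normal approximation)."""
    p = left_handed / sample_size
    return z_value(confidence) * sqrt(p * (1 - p) / sample_size)


def sample_size_needed(margin, confidence, p=0.5):
    """Smallest random sample giving the requested margin; p=0.5 is the worst case."""
    return ceil(z_value(confidence) ** 2 * p * (1 - p) / margin**2)


# Same data, two confidence levels: more confidence costs a wider margin.
for confidence in (0.95, 0.99):
    m = margin_of_error(7, 37, confidence)
    print(f"{confidence:.0%} confidence: 7/37 is 18.9% plus or minus {m:.1%}")

# Same confidence, two margins: a tighter margin costs a much bigger sample.
for margin in (0.10, 0.01):
    print(f"margin of {margin:.0%}: about {sample_size_needed(margin, 0.95)} people")
```

    Under these assumptions, shrinking the margin from plus or minus 10 percent to plus or minus 1 percent at 95 percent confidence raises the required sample from about a hundred people to nearly ten thousand.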


    This page titled 13.1.5: Statistical Significance is shared under a not declared license and was authored, remixed, and/or curated by Bradley H. Dowden.