Humanities LibreTexts

13.1.2: Sample Size


    If you hear a TV commercial say that four out of five doctors recommend the pain reliever in the drug being advertised, you might be impressed with the drug. However, if you learn that only five doctors were interviewed, you would be much less impressed. Sample size is important.

    Why? The answer is that estimates based on sampling are inductive and thus inherently risky. The larger the sample, the better its chance of being free of distortions from unusually bad luck during the selection of the sample. If you want to predict how California voters will vote in the next election, it would be better to have a not-quite-random sample of 10,000 future voters than a perfectly random sample of two future voters.
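
    The comparison between the two California samples can be tried out in a small simulation. The sketch below is only an illustration under invented assumptions: the true share of yes voters (55 percent) and the amount by which the not-quite-random method over-selects yes voters (2 points) are hypothetical numbers, not data from any election.

```python
import random

def estimate_share(sample_size, chance_of_yes, rng):
    """Fraction of 'yes' answers in one simulated sample."""
    yes_count = sum(rng.random() < chance_of_yes for _ in range(sample_size))
    return yes_count / sample_size

def average_miss(sample_size, chance_of_yes, true_share, trials, rng):
    """Average distance between the sample estimate and the true share."""
    total = 0.0
    for _ in range(trials):
        estimate = estimate_share(sample_size, chance_of_yes, rng)
        total += abs(estimate - true_share)
    return total / trials

# Hypothetical numbers chosen only for illustration.
TRUE_SHARE = 0.55    # share of all voters who will vote yes
SKEWED_SHARE = 0.57  # a slightly biased method reaches yes voters a bit too often

rng = random.Random(0)  # fixed seed so the run is repeatable

# Perfectly random, but only two voters per sample.
tiny_random_miss = average_miss(2, TRUE_SHARE, TRUE_SHARE, 100, rng)

# Slightly biased, but 10,000 voters per sample.
large_skewed_miss = average_miss(10_000, SKEWED_SHARE, TRUE_SHARE, 100, rng)

print(f"random sample of 2:        off by {tiny_random_miss:.3f} on average")
print(f"skewed sample of 10,000:   off by {large_skewed_miss:.3f} on average")
```

    A sample of two can only report 0, 50, or 100 percent, so it can never come close to 55 percent except by being 5 points off. The large sample inherits the small bias of its selection method but very little bad luck. A strongly biased method would change this picture, which is why the size of the assumed bias matters.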

    To maximize the information you can get about the population, you will want to increase your sample size. Nevertheless, you usually face practical limits on the size; sampling might be expensive, difficult, or both.

    In creating the government census, it is extremely difficult to contact and count people who are living temporarily on the couch at a friend's apartment, people who live in their cars and have no address, and people who are moving to a new job in a different state. You can make good estimates about these people, but if you're required to disregard anyone you haven't talked to during your census taking, then you'll under-represent these sorts of people in your census results. People who complain that the government census makes an educated guess about how many people live in a city without counting every one of them never seem to complain when their doctor samples their blood rather than taking all of it to examine.

    So, when is your sample size big enough for your purposes? This is a fascinating and difficult question. To illustrate, suppose you are interested in selling mechanical feeding systems to the farmers in your state. You would like to know what percentage of them do not already own a mechanical feeding system, because they will be your potential customers. Knowing that this sort of information has never been collected, you might try to collect it yourself by contacting the farmers. Since it would be both difficult and expensive to contact every single farmer, you would be interested in getting your answer from a small sample. If you don't care whether your estimate of the percentage of farmers without a mechanical feeding system is off by plus or minus 10 percent, you can sample many fewer farmers than if you need your answer to be within 1 percent of the (unknown) correct answer. Statisticians would express this same point by saying that a 10 percent margin of error requires a smaller sample size than a 1 percent margin of error. All other things being equal, you’d prefer to have a small margin of error rather than a large one.

    Let's suppose you can live with the 10 percent margin of error. Now, how sure do you need to be that your estimate will fall into that interval of plus or minus 10 percent? If you need only to be 90 percent sure, then you will need a much smaller sample size than if you need to be 97 percent sure. Statisticians would express this same point by saying that a 90 percent confidence level requires a smaller sample size than a 97 percent confidence level. Exactly how much smaller is a matter of intricate statistical theory that we won't go into here, although we will explore some specific examples later.
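
    Without going into that theory, a common textbook approximation gives a feel for the numbers in the farmer example. The sketch below assumes a simple random sample from a large population and uses the normal approximation for a proportion, n = z² · p(1 − p) / E², with the cautious guess p = 0.5. The function name and the default value are illustrative choices, not part of the original discussion.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence_level, assumed_share=0.5):
    """Approximate sample size for estimating a percentage.

    Assumes a simple random sample from a large population and the
    normal approximation. assumed_share=0.5 is the most cautious guess,
    because it yields the largest required sample.
    """
    # z: how many standard errors the interval extends on each side
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    variance = assumed_share * (1 - assumed_share)
    return math.ceil(z**2 * variance / margin_of_error**2)

# The three situations discussed in the text.
for margin, confidence in [(0.10, 0.90), (0.10, 0.97), (0.01, 0.90)]:
    n = required_sample_size(margin, confidence)
    print(f"margin {margin:.0%}, confidence {confidence:.0%}: about {n} farmers")
```

    Under these assumptions the approximation gives about 68 farmers for a 10 percent margin at 90 percent confidence, about 118 at 97 percent confidence, and about 6,764 for a 1 percent margin at 90 percent confidence. Tightening the margin of error is far more expensive than raising the confidence level, because the margin enters the formula squared.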

    A margin of error is a margin of safety. Sometimes we can be specific and quantify this margin, that is, put a number on it such as 6 percent. We can say that our sampling showed that the percentage of farmers without a mechanical feeding system is 60 percent plus or minus 6 percent. Sometimes we express the idea vaguely by saying that the percentage is about 60 percent. At any rate, whether we can be specific or not, the greater the margin of error we can permit, the smaller the sample size we need.
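
    The trade-off can also be read in the other direction: given a sample size, how wide is the margin? The sketch below again assumes a simple random sample and the normal approximation. The 95 percent confidence level and the three sample sizes are assumptions chosen for illustration; the text does not say what confidence level or sample size lies behind the "plus or minus 6 percent" figure.

```python
import math
from statistics import NormalDist

def margin_of_error(observed_share, sample_size, confidence_level=0.95):
    """Approximate margin of error for a percentage found in a sample.

    Assumes a simple random sample and the normal approximation.
    The 95 percent default is a conventional choice, not taken from the text.
    """
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    standard_error = math.sqrt(observed_share * (1 - observed_share) / sample_size)
    return z * standard_error

# 60 percent of sampled farmers lack a feeding system; vary the sample size.
for n in (50, 256, 1000):
    moe = margin_of_error(0.60, n)
    print(f"n = {n:>4}: 60 percent plus or minus {moe * 100:.1f} percent")
```

    Under these assumptions, a sample of roughly 256 farmers corresponds to a margin of about 6 percent. A smaller sample widens the margin and a larger one narrows it.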

    To appreciate the desirability of a small margin of error, imagine that you are trying to forecast tomorrow’s temperatures in cities around the globe and you claim that you have a great model for doing this, whose only drawback is that it predicts a temperature somewhere between absolute zero and the temperature of the sun, a gigantic margin of error. You use your model and predict that tomorrow’s temperature in New York City will be three thousand degrees. If you claim that your prediction is within your margin of error, you will be correct, but your model will clearly be useless because we want temperature predictions that have a much smaller margin of error.


    This page titled 13.1.2: Sample Size is shared under a not declared license and was authored, remixed, and/or curated by Bradley H. Dowden.
