Humanities LibreTexts

13.2: Obstacles to Collecting Reliable Data


    So far in our discussion of significant statistics, we have worried about how to make decisions using reliable information from a sample of our population. To obtain significant statistics we try to obtain a representative sample by getting one that is diverse, random, and large. A major obstacle to obtaining a representative sample is that unreliable data too easily creep into our sample.

    If you own a radio station and decide that over 80% of your listeners like that song by singer Katy Perry because over 80% of those who texted your station (about whether they like that song) said they liked it, then you’ve made too risky an assumption. Those who texted you weren’t selected at random from your pool of listeners; they selected themselves. Self-selection is a biased selection method that is often a source of unreliable data.
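
    The effect of self-selection can be illustrated with a small sketch. Every number in it is an assumption made up for illustration, not data from any real station: suppose only half of all listeners actually like the song, but listeners who like it text in 8% of the time while those who don't text in only 2% of the time. The hypothetical function liked_share_among_texters computes the share of fans we should expect among the texters, and simulate compares a self-selected sample with a random sample of the same invented population.

```python
import random


def liked_share_among_texters(true_share, p_text_if_like, p_text_if_dislike):
    """Expected fraction of texters who like the song.

    true_share: fraction of ALL listeners who like the song (assumed).
    p_text_if_like / p_text_if_dislike: assumed chance that a listener
    texts the station, depending on whether they like the song.
    """
    likers_texting = true_share * p_text_if_like
    dislikers_texting = (1 - true_share) * p_text_if_dislike
    return likers_texting / (likers_texting + dislikers_texting)


def simulate(n_listeners=100_000, true_share=0.5,
             p_text_if_like=0.08, p_text_if_dislike=0.02,
             sample_size=1000, seed=0):
    """Return (share of fans among texters, share of fans in a random sample)."""
    rng = random.Random(seed)
    # True means "this listener likes the song".
    listeners = [rng.random() < true_share for _ in range(n_listeners)]
    # Self-selected sample: each listener decides whether to text in.
    texters = [
        likes for likes in listeners
        if rng.random() < (p_text_if_like if likes else p_text_if_dislike)
    ]
    # Random sample: the station picks listeners without regard to opinion.
    random_sample = rng.sample(listeners, sample_size)
    return sum(texters) / len(texters), sum(random_sample) / len(random_sample)


if __name__ == "__main__":
    print(liked_share_among_texters(0.5, 0.08, 0.02))
    print(simulate())
```

    Under these invented assumptions, about 80% of the texters like the song even though only 50% of listeners do, while the random sample stays close to the true 50%. The size of the gap depends entirely on how much more eager fans are to text, which is exactly what the station cannot learn from the texts alone.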

    There is the notorious problem of lying to pollsters. The percentage of polled people who say they voted in an election is usually higher than the percentage of people who actually voted. More subtly, people may practice self-deception, honestly responding "yes" to questions such as "Are you sticking to your diet?" when they aren't. Another problem facing us pollsters is that even though we want diversity in our sample, the data from some groups in the population may be easier to obtain than from other groups, and we may be tempted to favor ease over diversity. For example, when counting Christians worldwide, it is easier for us to get data from churches whose members speak some languages rather than others, live in some countries rather than others, and live in modern cities rather than remote villages.

    There are other obstacles to collecting reliable data. Busy people and more private people are less likely to take the time to answer our questions, so they tend to be underrepresented in our sample. Also, pollsters occasionally fail to notice the difference between asking "Do you favor Jones or Smith?" and "Do you favor Smith or Jones?" The order in which the names are presented can influence the answers people give. The moral is that natural obstacles and sloppy methodology combine to produce unreliable data and so to reduce the significance of our statistics.


    This page titled 13.2: Obstacles to Collecting Reliable Data is shared under a not declared license and was authored, remixed, and/or curated by Bradley H. Dowden.
