Humanities LibreTexts

13.1.6: Designing a Paired Comparison Test


    Suppose you own a food business and are considering marketing what your researcher/cook says is a better version of one of your old food products, say, a vegetarian burrito. The main factor in your decision will be whether your customers will like the taste of the new product better than the taste of the old one. You can make your marketing decision by guessing, by letting your cook choose, by asking advice from your friends, or by some other method. You decide to use another method: ask your own customers which of the two vegetarian burritos they like better. Why not? If the customers in your sample prefer the new product, you will believe that the whole population will, too, and you will replace the old product with the new one.

    A good way to do this testing would be to use a procedure called paired comparison. In this kind of test, you remove the identifying labels from the old and new burrito products and then give each taster the pair of products in a random order. That is, some tasters get to taste the new burrito first; some, the old one first. In neither case are they told which product they are tasting. Then ask your taster-judges which product they like better. If a great many of them like the new one better than the old one, you can go with the new product.

    How many tasters do you need in order to get useful results? And if most of the tasters like the new product but many do not, then how much disagreement can you accept and still be sure your customers generally will like the new product better? If three out of five tasters say the new product is better but two out of five disagree, would a conclusion that over half your customers would prefer the new burrito product be a statistically significant result? These are difficult questions, but they have been studied extensively by statisticians, and the answers are clear.

    Before those difficult questions can be answered, you need to settle another issue. How sure do you have to be that your tasters' decision is correct, in the sense of accurately representing the tastes of the general population of your customers? If you need to be 99 percent sure, you will need more tasters than if you need only to be 95 percent sure. Let's suppose you decide on 95 percent. Then, if you have, say, twenty tasters, how many of them would have to prefer the new product before you can be 95 percent sure that your customers will like the new product better, too? If your taster-judges are picked randomly from among your population of customers and aren't professionals in the tasting business, then statistical theory says you would need at least 75 percent (fifteen) of your twenty judges to prefer the new product. However, if you had more judges, you wouldn't need this much agreement. For example, with sixty judges, you would need only 65 percent (thirty-nine) of your judges to give a positive response in order for you to be confident that your customers will prefer the new product. What this statistic of thirty-nine out of sixty means is that even if twenty-one out of your sixty judges were to say that your new burrito is awful, you could be 95 percent sure that most consumers would disagree with them. Yet many businesspeople who are not versed in such statistical reasoning would probably worry unnecessarily about their new burrito if twenty-one of sixty tasters disliked the product.

    Statistical theory also indicates how much agreement among the judges would be required to raise your confidence level from 95 percent to 99 percent. To be 99 percent sure that your customers would prefer the new product to the old, you would need seventeen positive responses from your twenty judges, or forty-one positive responses from sixty judges.
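
    The thresholds quoted above can be checked with a short calculation. The sketch below is one plausible reconstruction, not necessarily the table the figures came from: it assumes an exact two-sided binomial test in which each judge is equally likely to pick either product when there is no real preference. The function names two_sided_p_value and min_agreeing are made up for this illustration.

```python
from math import comb
from typing import Optional


def two_sided_p_value(k: int, n: int) -> float:
    """Chance of a split at least as lopsided as k-of-n (in either
    direction) if every judge were really just flipping a fair coin."""
    upper = max(k, n - k)
    # Probability that 'upper' or more of the n judges agree on one product.
    tail = sum(comb(n, i) for i in range(upper, n + 1)) / 2 ** n
    # Double it to cover lopsided splits in favor of either product.
    return min(1.0, 2 * tail)


def min_agreeing(n: int, confidence: float) -> Optional[int]:
    """Smallest number of agreeing judges (out of n) that reaches the
    requested confidence level, or None if even a unanimous vote does not."""
    alpha = 1 - confidence
    for k in range(n // 2 + 1, n + 1):
        if two_sided_p_value(k, n) <= alpha:
            return k
    return None


for judges in (20, 60):
    for level in (0.95, 0.99):
        print(judges, "judges at", level, "->", min_agreeing(judges, level))
```

    Under these assumptions the loop reports fifteen and seventeen for twenty judges, and thirty-nine and forty-one for sixty judges, which agrees with the numbers in the text.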

    Let’s try another example. You recently purchased a new service station (gas station) and have decided on an advertising campaign both to increase your visibility in the community and to encourage new customers to use the station. You plan to advertise a free gift to every customer purchasing $10 or more of gasoline any time during the next two weeks. The problem now is to select the gift. You have business connections enabling you to make an inexpensive purchase of a large supply of either six-packs of Pepsi or engraved ballpoint pens with the name of a local sports team. You could advertise that you will give away free Pepsi, or else you could advertise that you will give away the pens. The cost to you would be the same. You decide to choose between the two on the basis of what you predict your potential customers would prefer. To do this, you could, and should, use a paired comparison test. You decide you would like to be 95 percent sure of the result before you select the gift. You randomly choose twenty potential customers and offer them their choice of free Pepsi or a free ballpoint pen. Ten are told they can have the Pepsi or the pen; ten are told they can have the pen or the Pepsi. You analyze the results. Three customers say they don't care which gift they get. Five say that they strongly prefer Pepsi to the pen because they don't like the sports team. Six say they would be happy with either gift but would barely prefer the Pepsi. Four customers choose Pepsi because they have enough pens. The rest choose pens with no comment. From this result, can you be confident that it would be a mistake to go with the ballpoint pen?

    Yes, you can be confident it would be a mistake. Your paired comparison test shows that fifteen of the twenty prefer Pepsi; only two chose the pen, and three had no preference. Fifteen of twenty is the number you need at the 95 percent confidence level, so you can conclude that over 50 percent of your customers would prefer the Pepsi. By the way, this information about numbers is for illustrative purposes. You as a student aren’t in a statistics class, so you won’t be quizzed on making these calculations. But if you did own that service station, you should use a paired comparison test and get advice about the numbers by looking up the information on the Internet or by asking somebody who has taken a statistics class.
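
    For readers who want to see the arithmetic behind the Pepsi result, here is a minimal sketch. It assumes the same kind of exact two-sided binomial test that reproduces the burrito thresholds. The function name preference_p_value is hypothetical, and the choice of how to treat the three customers with no preference is an assumption, so both options are shown.

```python
from math import comb


def preference_p_value(k: int, n: int) -> float:
    """Chance of a split at least as lopsided as k-of-n (either way)
    if customers had no real preference between the two gifts."""
    upper = max(k, n - k)
    tail = sum(comb(n, i) for i in range(upper, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


pepsi, pen, no_preference = 15, 2, 3

# Option 1 (assumption): count the three indifferent customers as
# "not Pepsi", which gives fifteen of twenty.
p_all = preference_p_value(pepsi, pepsi + pen + no_preference)

# Option 2 (assumption): drop the indifferent customers, which gives
# fifteen of seventeen.
p_decided = preference_p_value(pepsi, pepsi + pen)

print(round(p_all, 3))      # about 0.041, below the 0.05 cutoff
print(round(p_decided, 4))  # about 0.0023
```

    Either way the chance of seeing so lopsided a result by luck alone is under 5 percent, which is what being 95 percent sure amounts to here. By contrast, preference_p_value(3, 5) comes out to 1.0, so three tasters out of five tells you nothing.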

    Suppose you learn that your favorite TV program was canceled because the A. C. Nielsen Corporation reported to CBS that only 25 percent of the viewers were tuned to your program last week. CBS wanted a 30 percent program in that time slot. You then learn more about the Nielsen test. Nielsen polled 400 viewers, 100 of whom said they were watching your program. Knowing that the United States has 100 million TV sets, you might be shocked by CBS's making a major financial decision based on the simple statistical claim that 100 out of 400 viewers were watching your program. Can this statistic really tell CBS anything significant about your program? Yes, it can, provided the 400 viewers were chosen randomly and CBS can live with an error of about 4 percent. Nielsen and CBS can be 95 percent confident that a result of 25 percent from a random sample of 400 will have an error of only about plus or minus 4 percent, so even at the top of that range your program falls short of 30 percent.
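
    The size of the sampling error can be estimated with the standard normal-approximation formula for a proportion. The sketch below assumes the 400 viewers form a simple random sample; the helper name margin_of_error is made up for this illustration. With 100 of 400 viewers, the 95 percent margin comes out a little over 4 percentage points, and one standard error is about 2 points.

```python
from math import sqrt

Z_95 = 1.96  # standard normal critical value for 95 percent confidence


def margin_of_error(p_hat: float, n: int, z: float = Z_95) -> float:
    """Approximate margin of error for a sample proportion p_hat
    measured on a simple random sample of size n."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


sample_size = 400
watching = 100
p_hat = watching / sample_size  # 0.25

moe = margin_of_error(p_hat, sample_size)
low, high = p_hat - moe, p_hat + moe

print(f"share watching: {p_hat:.0%}, give or take {moe:.1%}")
print(f"95 percent interval: {low:.1%} to {high:.1%}")
```

    Notice that the formula depends on the sample size, not on the 100 million TV sets in the population. That is why a poll of only 400 viewers can be informative, and why the top of the interval still falls short of the 30 percent CBS wanted.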


    This page titled 13.1.6: Designing a Paired Comparison Test is shared under a not declared license and was authored, remixed, and/or curated by Bradley H. Dowden.
