13.4: Don't trust AI- sometimes it makes things up


Lawyer Steven Schwartz got caught. He used ChatGPT to generate a legal brief and didn’t check its accuracy before he submitted it to a court. It turned out that ChatGPT had stuffed the brief with references to cases that didn’t exist. “I did not comprehend that ChatGPT could fabricate cases,” Schwartz admitted to the judge. “I continued to be duped by ChatGPT. It’s embarrassing.” Schwartz was fined $5,000 and severely reprimanded.

Why would ChatGPT include fabricated cases? You could answer by asking, “Why not?” As we have seen, chatbots are based on text prediction. They “guess” what humans might write next. They don’t “know” whether their guess has anything to do with the real world. They are trained on lots of text, some of it untrue. And even if they were trained only on fact-checked text, they are doing something like averaging the patterns in that text. As Jon Ippolito of the University of Maine puts it, “the average of two facts isn't always a fact.” I came across a perfect example of this when ChatGPT referenced a paper on AI by “Yann Bengio.” It had combined the first name of one famous computer scientist, Yann LeCun, with the last name of another, Yoshua Bengio, who had worked with LeCun.
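To make the idea of ungrounded text prediction concrete, here is a minimal, hypothetical sketch in Python. It builds a tiny “what word comes next?” table from two true sentences and then generates text by sampling. Real chatbots learn from vastly more text and look at far more context, but the basic move, picking a statistically plausible next word with no check against reality, is the same:

```python
import random
from collections import defaultdict

# Toy training text: every sentence here is true.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Record which words follow which (a "bigram" model; real chatbots
# condition on much more context, but they also predict the next token).
following = defaultdict(list)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        following[current_word].append(next_word)

# Generate text by repeatedly sampling a plausible next word.
text = ["paris"]
while text[-1] in following:
    text.append(random.choice(following[text[-1]]))

# Roughly half the time this prints "paris is the capital of italy":
# fluent and statistically plausible given the training text, but false.
# Nothing in the procedure ever checks the claim against the world.
print(" ".join(text))
```

The toy model splices together patterns from two true sentences and can land on a false one, which is exactly the “average of two facts” problem behind “Yann Bengio.”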

    What should we call it when AI makes things up?

Philosopher Harry Frankfurt famously came up with a not-so-technical term for speech whose source, human or otherwise, doesn’t care whether it’s true or not: “bullshit.” Writing in the Harvard Business Review, Ian P. McCarthy, Timothy R. Hannigan, and André Spicer apply this concept to chatbots and call their made-up output “botshit.”

Besides bullshit and botshit, you’ll see it called “hallucination” when an AI system makes something up. But AI doesn’t know what’s true. And it’s not on drugs or having any kind of imaginative experience at all, as the term “hallucination” would suggest. Nate Angell and I have argued for the alternative term “AI mirage,” which puts the focus on the humans who mistakenly believe the AI output.

    Botshit is easy to miss

The big AI companies slap warnings about inaccuracy onto their systems. But inaccuracy can be so easy to miss that Google itself fell into the trap. When it launched its ChatGPT competitor, Bard, it proudly shared a sample chat session:

Prompt: “what new discoveries from the James Webb Space Telescope can I tell my 9 year old about?”

Bard’s response: “Your 9-year old might like these recent discoveries made by the James Webb Space Telescope (JWST):

1. In 2023, the JWST spotted a number of galaxies nicknamed “green peas.” They were given this name because they are small, round, and green, like peas.

2. The telescope captured images of galaxies that are over 13 billion years old. This means that the light from these galaxies has been traveling for over 13 billion years to reach us.

3. JWST took the very first pictures of a planet outside of our own solar system. These distant worlds are called “exoplanets.” Exo means “from outside.”

These discoveries can spark a child’s imagination about the infinite wonders of the universe.”

Image credit: Screenshot of a Google blog post dated February 6, 2023.

The third point wasn’t true. According to NASA, a different telescope, in the Atacama Desert of Chile, was the first to photograph an exoplanet. The day after the made-up “fact” was pointed out, Google’s market valuation dropped by $100 billion.

    Will chatbots ever get it right?

Engineers are trying to stop AI systems from making things up, and newer chatbots seem to hallucinate at lower rates than earlier ones. In the pros and cons of AI for research, we’ll look at systems that are designed to reduce the problem by grounding their answers in real sources and linking to those sources to help us fact-check. We’ll see that even those systems still make things up.
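To give a rough idea of how such source-grounded systems work, here is a minimal, hypothetical sketch in Python. The passages, URLs, and the crude word-overlap scoring are invented for illustration and do not describe any particular product:

```python
# Hypothetical mini "library" of real source passages (URLs invented).
SOURCES = [
    ("https://example.edu/jwst",
     "The James Webb Space Telescope observed galaxies whose light "
     "has traveled more than 13 billion years."),
    ("https://example.edu/first-exoplanet-image",
     "A telescope in Chile's Atacama Desert captured the first image "
     "of an exoplanet in 2004."),
]

def retrieve(question):
    """Pick the passage sharing the most words with the question.
    (A crude stand-in for the search step real systems perform.)"""
    q = set(question.lower().split())
    return max(SOURCES, key=lambda src: len(q & set(src[1].lower().split())))

url, passage = retrieve("which telescope took the first image of an exoplanet")
# A real system would now hand the passage to the chatbot as context and
# cite the URL in its answer. Even then, the model can misread or embellish
# the passage, which is why the links matter for fact-checking.
print(f"{passage} (source: {url})")
```

The point of the design is not that retrieval makes the chatbot truthful; it is that the cited links give us, the readers, something real to check the answer against.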

Most experts do not expect to see this problem fixed. Ziwei Xu, Sanjay Jain, and Mohan Kankanhalli, machine-learning researchers at the National University of Singapore, explained to Scientific American that “For any LLM, there is a part of the real world that it cannot learn, where it will inevitably hallucinate.”

    How can we use systems that make things up?

You can probably guess what I’m going to say. We can choose not to use them. Or we can check every plausible statement that comes out of them against one or more credible sources on the same subject. We have to try to do better than Google did in the James Webb example above. Teachers and students alike will be tempted again and again to skip or shortcut that time-consuming fact-checking. The more we make a habit of fact-checking the bots from the start, and the more we rein in our expectations about how much time the bots will save us, the easier it should be to resist that temptation.

    Acknowledgment

    The use of the Google example was inspired by Reed Hepler’s adaptation of a course licensed by the Center for Teaching Excellence and Innovation (CTEI, Rush University) under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


This page titled 13.4: Don't trust AI- sometimes it makes things up is shared under a CC BY-NC 4.0 license and was authored, remixed, and/or curated by Anna Mills (ASCCC Open Educational Resources Initiative).