Humanities LibreTexts

5.7: The Five-Paragraph Theme Teaches “Beyond the Test”


    Author: Bruce Bowles, Jr., English, Texas A&M University–Central Texas

    “Tell them what you are going to say, say it, tell them what you said.” I remember learning this strategy from my English teacher my senior year of high school. While mentioning this might seem like a cheap shot at a former teacher, it is quite the contrary—he was one of the best teachers I have ever had. This strategy was taught to us as a general format to follow, yet modify, for a variety of writing tasks. However, with the increase in high-stakes testing that education has seen over the last 10–15 years, this strategy frequently becomes a rather rigid, prescriptive formula: the five-paragraph theme (FPT).

    Most of us are familiar with this structure, even if we do not refer to it as the five-paragraph theme. Traditionally, the FPT contains an introductory paragraph that moves from a general overview of a topic to an explicit thesis statement that highlights three main points. The three supporting paragraphs each take up one of these three main points, beginning with a topic sentence and then moving into more detailed description. Finally, the FPT ends with a standard conclusion that is, oftentimes, merely a restatement of the thesis statement and reiteration of the three main points. (Yes, this theme is referred to as a five-paragraph essay, FPE, elsewhere in this book, but the concept is exactly the same. That it goes by different names while having the same outcome shows its ubiquity in school-based writing situations.)

    Advocates for the FPT contend that it is a phenomenal tool that teaches all students a basic organizational structure that can be built upon in the future and, in addition, is especially useful for teaching students who struggle with basic organization when writing. While such a view may indeed have a small degree of merit, the persistence and popularity of the FPT have a more sinister source—standardized testing. With its mechanical formula, the FPT is the perfect vehicle to ensure inter-rater reliability (consistency amongst test graders), allowing for efficient and economical scoring sessions for testing companies. Essentially, the FPT influences writing instruction as a result of what is referred to as assessment washback, with standardized testing indirectly dictating curriculum. Thus, the FPT has become the primary genre not because of its educational merit or real-world applicability, but as a result of its pragmatic benefits for testing companies. Even worse, as a result of its rigidity and the manner in which testing companies assess it, the FPT imparts a hollow, formulaic notion of writing to students, one that emphasizes adherence to generic features rather than quality of content, informed research practices, effective persuasive techniques, and attention to the specific contexts in which students will compose.

    To understand why the FPT is immensely beneficial for testing companies, it is essential to understand the concept of inter-rater reliability. The original attempts at assessing writing ability on large-scale, standardized assessments relied on multiple-choice questions that dealt primarily with grammar and stylistics. However, such assessment methods were scrutinized since they did not actually have students compose (and, as other chapters in this collection discuss, the acontextual grammar instruction that multiple-choice tests rely on does not improve writing). Such methods were critiqued as lacking construct validity; in essence, they were not measuring the actual construct (writing ability) they were purported to measure. Instead, they merely focused on specific skills that did not reflect one’s overall ability to write.

    As a result, testing companies were forced to transition to holistic scoring—evaluating writing as a whole as opposed to its isolated components. Yet, this presented quite a dilemma. At its core, the evaluation of writing is a subjective, interpretative endeavor. After all, we have all disagreed at one time or another with a friend or colleague about the quality of a particular book, newspaper article, and so on.

    However, such disagreement is especially problematic for the standardized testing community. How can a standardized writing assessment be an accurate reflection of students’ writing ability if the people scoring the students’ writing disagree wildly about its quality? Thus, it is necessary for these testing companies to produce consistency amongst scores. Inter-rater reliability is meant to create precisely that consistency among raters, and it is paramount for testing companies seeking to justify the accuracy and fairness of their writing assessments. At its core, inter-rater reliability is a measure of how often readers’ scores agree on a specific piece of writing. An inter-rater reliability of 1 indicates perfect agreement and, for standardized testing purposes, an inter-rater reliability of .8 (80% agreement) is usually seen as the benchmark. However, obtaining a .8 inter-rater reliability is not as easy as it may seem. Even on a holistic scale of one to six (one poorest, six highest), raters will frequently disagree.
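    The agreement measure described above can be sketched in a few lines of code. The scores below are hypothetical, invented purely for illustration; real testing programs typically use more sophisticated statistics (such as Cohen’s kappa, or counting adjacent scores as agreement), but simple percent agreement captures the basic idea.

```python
# Illustrative sketch (hypothetical data): inter-rater reliability as
# simple percent agreement between two raters scoring the same essays
# on a 1-6 holistic scale.

def percent_agreement(rater_a, rater_b):
    """Fraction of essays on which the two raters gave identical scores."""
    assert len(rater_a) == len(rater_b)
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

# Made-up holistic scores (1 = poorest, 6 = highest) for ten essays.
rater_a = [4, 3, 5, 2, 6, 4, 3, 5, 4, 2]
rater_b = [4, 3, 5, 3, 6, 4, 3, 4, 4, 2]

agreement = percent_agreement(rater_a, rater_b)
print(f"Inter-rater agreement: {agreement:.2f}")  # 8 of 10 scores match -> 0.80
```

    In this invented example the raters agree on eight of ten essays, exactly the .8 benchmark; note how the two disagreements are only one point apart, which is why narrowing the criteria raters attend to makes hitting the benchmark easier.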

    As a general rule, the more rigid and precise the criteria for evaluation of a piece of writing, the more likely a high inter-rater reliability will be achieved. This is why the FPT is so efficient in scoring standardized writing assessments; with its prescriptive formula and distinct features, raters can be normed (i.e., trained to agree) on the presence and quality of these rather specific features. Is there a clear and concise thesis statement with three main points? Check. Does each supporting paragraph have a topic sentence and move into a more detailed description? Check. Did the conclusion effectively restate the argument? Check.

    Although the scoring session for a standardized writing assessment may not necessarily be as mechanical in nature, the general premise still holds. If raters can be trained to identify specific features or qualities in writing, they will be more likely to agree on an overall score; this agreement saves testing companies time and money since they do not need to resolve disagreements between raters. While such a practice may seem to be merely a practical solution to a troubling assessment problem, it actually has profound consequences for writing instruction at the elementary, secondary, and collegiate levels.

    Ideally, assessments should reflect the curriculum taught in schools. Yet, when high-stakes testing tethers students’ scores to school funding, teacher bonuses, students’ acceptance to colleges, and so on, the reverse frequently happens—the curriculum taught in schools begins to align with the assessments. The state of Florida provides an illustrative example. Surveying students across four Florida high schools, Lisa Scherff and Carolyn Piazza found that persuasive and expository writing (styles frequently associated with the FPT and standardized testing) was heavily emphasized in the 9th and 10th grades, not surprisingly, the grades in which the students took the state’s standardized writing test. Instead of receiving instruction on composing in a variety of genres and for a variety of purposes, students are rigorously drilled on how to compose effectively for the genre featured on the state assessment, which—to no one’s surprise—is a five-paragraph persuasive or expository essay.

    Beyond restricting writing instruction to a formulaic genre, this assessment washback has other negative effects on students’ writing development as well. Prominent composition scholars Chris Anson and Les Perelman (featured in this collection) have found that students can be coached to perform better on these standardized writing assessments by following a few general guidelines: Follow the structure of the FPT, write more words (length of essay tends to directly correlate with score), use big words (the higher the vocabulary the better, regardless of whether the words are used correctly), use multiple examples (whether they are relevant to the overall argument or not), and provide a lot of supporting details and evidence, whether they are factually correct or not, since raters are trained not to account for factual accuracy. (One of the students Perelman coached wrote that the Great Depression was primarily a result of American competition with Communist Russia, admitting that he made something up since he could not remember the specific details.) These standardized writing assessments reinforce notions about quantity over quality in writing. I believe it is fair to surmise that most English teachers would not support these practices as methods for improving writing!

    The simple solution to all of these problems would appear to be merely reducing our reliance on, or removing, the FPT from our curricula. However, as long as policy makers rely on standardized writing tests, and those writing assessments rely on the FPT, such a change in curriculum will not be possible. The manner in which we assess writing will always exert a tremendous influence over how we teach writing. Since, for inter-rater reliability and economic purposes, standardized testing relies on the FPT, the only sure-fire way to reduce—or eradicate—the use of the FPT in our curricula is to reduce or eradicate our reliance on standardized testing.

    Over the last two decades, we have consistently been fed a lie: that teacher evaluation is biased and that, as a result, standardized testing is necessary to hold schools accountable for student learning. This cunning ruse has deceived us into believing that standardized assessment evaluates student learning better and predicts future growth and performance more accurately. The logic is that teachers and administrators are biased; standardized testing provides a level playing field for everyone involved.

    And yet, surprisingly (or not surprisingly), the reverse is true, at least when it comes to predicting future academic success. Pop quiz: Which measure is the most accurate predictor of high-school students’ success at the collegiate level? If you answered SAT scores or performance on state-wide assessments, you would be wrong. Time and time again, studies show that a student’s high school GPA is the most accurate predictor of collegiate success! Essentially, the supposedly biased and poorly trained local educators are the most adept at assessing students’ growth, learning, and future performance. The expertise and localized knowledge of our teachers are rendered irrelevant by standardized testing; in an effort to remove the purported bias of local educators, standardized testing removes a wealth of local knowledge and expertise from the process of assessing writing.

    Transitioning to localized writing assessments would not only take advantage of educators’ local knowledge and expertise, it would enable more authentic, valid forms of writing assessment. Students could produce capstone projects that require them to compose in genres and media that adequately reflect the composing challenges they will face in college or their future professions. Writing portfolios would enable local educators to assess how students perform on a multitude of writing tasks across a variety of contexts. Electronic portfolios would even allow students to practice technological literacy skills. Local educators could work collaboratively with state and federal agencies to create challenging writing assessments that would accurately reflect the composing challenges students will face in their futures, while ensuring oversight to prevent any possible bias or padding of the results of these assessments.

    As long as we remain tethered to standardized testing as our primary method for assessing students’ writing proficiency, the FPT will exert a prominent influence over curricula. However, by allowing local educators—who work diligently with our students and children throughout the school year and know students’ abilities and needs best—to play a prominent role in developing and administering such localized assessments, more valid writing assessments can be developed that will influence curriculum in a positive, educationally productive fashion.

    Further Reading

    If you are interested in learning more about the negative influences of standardized testing on curriculum and instruction, and the benefits of localized assessment, Chris Gallagher’s “Being There: (Re)Making the Assessment Scene” (College Composition and Communication) provides profound insights into the dangers of drawing upon business practices in an educational context, critiquing the idea of using accountability as the driving logic behind educational practices. Diane Ravitch’s The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education (Basic Books) also provides a scathing critique of standardized testing from the perspective of someone who initially advocated for testing and school choice.

    In regard to the damaging effects of the FPT and standardized testing, Chris Anson’s “Closed Systems and Standardized Writing Tests” and Les Perelman’s “Information Illiteracy and Mass Market Writing Assessments” (both found in College Composition and Communication) discuss the adverse consequences of these assessments on students’ development as writers.

    Finally, if you are interested in alternatives to standardized testing, Bob Broad’s What We Really Value: Beyond Rubrics in Teaching and Assessing Writing (Utah State University Press), Brian Huot’s (Re) Articulating Writing Assessment for Teaching and Learning (Utah State University Press), and Darren Cambridge, Barbara Cambridge, and Kathleen Yancey’s Electronic Portfolios 2.0: Emergent Research on Implementation and Impact (Stylus Publishing) provide valuable models of more localized, context-sensitive writing assessment practices.


    Keywords: assessment washback, curriculum, five-paragraph theme, inter-rater reliability, localized assessment, portfolio assessment, standardized testing, validity

    Author Bio

    Bruce Bowles, Jr. is an assistant professor of English and director of the writing center at Texas A&M University–Central Texas. His research focuses on formative assessment practices, in particular responding to student writing, as well as methodologies for writing assessment, the ethics of writing assessment, and the influence of writing assessment on curriculum and pedagogy. He has taught writing and rhetoric at the college level for several years and, unfortunately, has seen the negative influence the five-paragraph theme has on students’ writing development.