Humanities LibreTexts

1.1: The Sound System of Mandarin


    Mandarin Chinese, as spoken in Mainland China, can be written using a system of spelling called pinyin, which uses letters of the Latin alphabet together with diacritical tone marks.  Written Chinese uses characters (for example: 你 or 好), and each character represents a syllable with an accompanying tone.  A few characters have more than one pronunciation.

    A syllable in Chinese consists of one of the following:

    • an initial sound + a final; or
    • a final without an initial

    An initial is always a consonant.

    A final can be:

    • a simple vowel, e.g. a, e, i, o, u;
    • a compound vowel, e.g. ao, uo, ou; or
    • a vowel followed by "n" or "ng," e.g. in, en, ang, ong.

    Most rules given here in terms of English pronunciation are approximations, as several of these sounds do not correspond directly to sounds in English.

    Pronunciation of Mandarin Initials

    Pinyin IPA Explanation Examples
    (audio files courtesy of Project Shtooka)
    b [p] unaspirated p, as in spit
    帮 bāng, to help
    p [pʰ] as in English
    炮 pào, gun; cannon
    m [m] as in English
    马 mǎ, horse
    f [f] as in English
    风 fēng, wind
    d [t] unaspirated t, as in stand
    大 dà, big
    t [tʰ] as in English
    头 tóu, head
    n [n] as in English
    男 nán, male
    l [l] as in English
    老 lǎo, old
    g [k] unaspirated k, as in skill
    格 gé, grid
    歌 gē, song
    k [kʰ] as in English
    看 kàn, to see
    h [x] like the English h if followed by "a"; otherwise it is pronounced more roughly (not unlike the Scots ch)
    好 hǎo, good
    喝 hē, to drink
    画 huà, to draw
    j [tɕ] like q, but unaspirated. (To get this sound, first take the sound halfway between joke and check, and then slowly pass it backwards along the tongue until it is entirely clear of the tongue tip.) While this exact sound is not used in English, the closest match is the j in ajar, not the s in Asia; this means that "Beijing" is pronounced like "bay-jing", not like "beige-ing".
    叫 jiào, to call
    家 jiā, home, family
    近 jìn, close
    尖 jiān, sharp
    q [tɕʰ] like j above, but with strong aspiration. Similar to church; pass it backwards along the tongue until it is free of the tongue tip
    气 qì, air, gas
    桥 qiáo, bridge
    x [ɕ] like sh, but take the sound and pass it backwards along the tongue until it is clear of the tongue tip; very similar to the final sound in German ich, Portuguese enxada, luxo, xícara, puxa, and to huge or Hugh in some English dialects
    小 xiǎo, little, small
    心 xīn, heart
    想 xiǎng, to think; to want
    zh [tʂ] ch with no aspiration (take the sound halfway between joke and church and curl it upwards); very similar to merger in American English, but not voiced
    长 zhǎng, to grow
    中 zhōng, center, middle
    重 zhòng, heavy
    ch [tʂʰ] Like zh above, but with strong aspiration. Similar to chin, but with the tongue curled upwards; very similar to nurture in American English, but strongly aspirated
    吃 chī, to eat
    茶 chá, tea
    sh [ʂ] as in shinbone, but with the tongue curled upwards; very similar to undershirt in American English
    沙 shā, sand
    手 shǒu, hand
    上 shàng, up, on
    r [ɻ] similar to the English r in rank, but with the lips spread and with the tongue curled upwards
    日 rì, sun
    热 rè, hot
    z [ts] unaspirated c, halfway between beds and bets (a more common example is suds)
    紫 zǐ, purple
    c [tsʰ] like ts, aspirated (a more common example is cats)
    草 cǎo, grass
    次 cì, time(s)
    s [s] as in sun
    送 sòng, to send
    y [j], [ɥ] as in English. If followed by a u, pronounce it with rounded lips
    月 yuè, moon
    音 yīn, tone
    w [w] as in English
    外 wài, outside
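    The syllable structure described earlier (an optional initial followed by a final) can be sketched in code. The following is a minimal illustration, assuming lowercase pinyin input without tone marks; the function name `split_syllable` is hypothetical, and the list of initials simply follows the table above, which counts y and w as initials.

```python
# Initials taken from the table above. Two-letter initials come first
# so that "zh", "ch" and "sh" are matched before "z", "c" and "s".
INITIALS = [
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
]

def split_syllable(syllable: str) -> tuple[str, str]:
    """Split a toneless, lowercase pinyin syllable into (initial, final).

    A syllable with no initial returns an empty string as its initial.
    This sketch does not check that the final is a valid Mandarin final.
    """
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable

print(split_syllable("zhong"))  # initial "zh", final "ong"
print(split_syllable("an"))     # no initial, final "an"
```

    Ordering the two-letter initials first is the only design decision here; without it, "zhong" would be split after "z".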

    Pronunciation of Mandarin Finals

    Pinyin IPA Final-only form Description
    Simple finals
    a [a:] a as in "father"
    o [ɔ:] o as in "got"
    e [ɤə] e a back, unrounded vowel: first place the tongue between [ŋ] and [ə] to produce [ɤ], and then lower the tongue to slide to [ə]; a bit like English "duh", but not as "open"

    (ê) [e] ê as in "get"
    i [i:] yi as in "he"
    (-i) [ɻ̩], [ɹ̩]   i is a buzzed continuation of the consonant when it appears after these initials: z-, c-, s-, zh-, ch-, sh- and r-
    u [u:] wu as in "who"
    ü [y:] yu as in German "üben" or French "lune" (to get this sound, say "ee" with rounded lips)
    Complex finals
    ai [ai̯] ai like "eye", but a bit lighter
    ei [ei̯] ei as in "say"
    ui [uei̯] wei like "way", but a bit lighter
    ao [au̯] ao like "cow", the a is much more audible than the o
    ou [ou̯] ou as in "so", "dough"
    iu [iəu̯] you as in "Leo"
    ie [i̯e] ye like "yet"
    üe [y̯e] yue as pinyin ü + ê
    er [aɚ̯] er as in "bar" in American English (the r is always pronounced; this final doesn't combine with any initials)
    an [an] an as in "stun", "fun"
    en [ən] en as in "taken"
    in [in] yin as in "in"
    un [u̯ən] wen as pinyin u + en
    ün [yn] yun as pinyin ü + n
    ang [aŋ] ang as in "young", like "song" in American English
    eng [əŋ] eng replace the [n] in en with [ŋ]
    ing [iŋ] ying as in "thing"
    ong [ɔŋ]   replace the [n] in "yawn" with [ŋ]

    Rolled finals

    Rolled finals (儿化音) are a feature of spoken Mandarin. Speakers from northern China often roll the tongue when saying certain words, usually particular nouns and verbs, in everyday speech, whereas speakers from southern China rarely do so. Non-native learners of Chinese need not learn to produce this pronunciation, as it is sometimes considered a northern Chinese accent rather than standard Mandarin. The purpose of this table is to help learners recognize and understand these finals when they hear them.

    Pinyin IPA Explanation
    e'r [ɤ˞] as e + er (not to be confused with the final er on its own; e'r occurs only after an initial)

    air, anr [aɚ̯] as ai + er, an + er
    aor [au̯˞] as ao + er
    our [ou̯˞] as ou + er
    angr [ãɚ̯̃] as ang + er
    iar, ianr [i̯aɚ̯] as ia + er, ian + er
    inr, ir [i̯ɚ] as in + er, i + er
    ingr [i̯ɚ̃] as ing + er
    ur [u˞] as u + er
    uor [u̯o˞] as uo + er
    uir [u̯ɚ] as ui + er
    ongr [ʊ̃˞] as ong + er
    ür [y̯ɚ] as ü + er

    Mandarin Tones

    Every syllable in Chinese has a clearly defined pitch associated with it, which distinguishes characters that otherwise sound the same. Unfortunately, a character gives no indication of its tone, so the tone of each word must be memorized individually. To help with this, pinyin uses four easily remembered diacritical marks to show the tone of each syllable; the neutral tone is left unmarked. The four tones and the neutral tone are:

    • First tone ( ˉ ), high level.
    • Second tone (ˊ), middle rising.
    • Third tone ( ˇ ), low dipping.
    • Fourth tone (ˋ), high falling.
    • Neutral tone (without any marks), low level unstressed syllable.
    The diagram below shows the pitch changes of the four tones on a five-level scale, from lowest (1) to highest (5).

    Relative pitch changes of the four tones


    Tone marks are always placed over vowels, never consonants. If there is more than one vowel in the syllable, the mark placement is determined by three simple rules.

    1. If there is an a or an e, the tone goes on the a or the e. No pinyin syllable contains both an a and an e.
    2. In the ou combination, the o takes the tone mark.
    3. In all other cases, the final vowel gets the tone mark.
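    The three placement rules above can be sketched as a short function. This is a minimal illustration, assuming lowercase input with the tone given as a number (1 to 4, anything else meaning neutral); the names `mark_index` and `add_tone_mark` are hypothetical, and the sketch relies on Unicode combining marks and normalization to build the accented letter.

```python
import unicodedata

# Combining diacritics for tones 1-4; the neutral tone gets no mark.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"

def mark_index(syllable: str) -> int:
    """Return the index of the vowel that carries the tone mark."""
    for vowel in "ae":                 # rule 1: an a or an e takes the mark
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:               # rule 2: in "ou", the o takes the mark
        return syllable.index("ou")
    # rule 3: in all other cases, the final (last) vowel takes the mark
    positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if not positions:
        raise ValueError(f"no vowel in {syllable!r}")
    return positions[-1]

def add_tone_mark(syllable: str, tone: int) -> str:
    """Place the diacritic for `tone` on the correct vowel of `syllable`."""
    if tone not in TONE_MARKS:         # neutral tone: leave unmarked
        return syllable
    i = mark_index(syllable)
    marked = syllable[:i + 1] + TONE_MARKS[tone] + syllable[i + 1:]
    # NFC merges the letter and the combining mark into one character.
    return unicodedata.normalize("NFC", marked)

for syllable, tone in [("hao", 3), ("tou", 2), ("gui", 4), ("ma", 5)]:
    print(add_tone_mark(syllable, tone))
```

    Because rule 1 is checked first and no pinyin syllable contains both an a and an e, the order in which the two letters are tested does not matter.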

    Pronouncing the tones

    Each bar of this musical staff represents the relative pitch changes when saying tones 1, 2, 3 and 4

    Say the first tone as if you were singing a high note. The second tone is pronounced like a question in English, with your pitch rising at the end of the syllable. Third tones are low and extended, noticeably longer than the other tones because of the dip. The fourth tone is said abruptly and forcefully, like a curt command in English. The neutral tone's pitch depends on the tone that precedes it. It is described more fully below, but in general, it is pronounced quickly and softly. The classic example used to show the difference tones make is:

     妈 (mā)    麻 (má)    马 (mǎ)    骂 (mà)    吗 (ma)

    (Being "mother", "hemp", "horse", "scold" and a question particle, respectively.)

    mā má mǎ mà
    A sound sample of the four tones

    In many cases, several characters share exactly the same syllable and tone. For example, the characters 码 and 蚂 are pronounced exactly the same as 马 (mǎ). 马 can be used alone to mean the animal "horse." It can also be combined with other characters to form new meanings, as in 马上 (mǎshàng 'immediately'), 马球 (mǎqiú 'polo'), and 马路 (mǎlù 'street'). Other characters with the same pronunciation are used differently, as in 数码相机 (shùmǎ xiàngjī 'digital camera') and 蚂蚁 (mǎyǐ 'ant'). Since these characters sound exactly the same on their own in conversation, the only way to distinguish them is through context.

    Tone changes

    The third tone, with its dip-and-rebound, is hard to fit into a continuous sentence. This is why the third tone changes depending on its environment. There are two rules:

    1. If a third tone comes before another third tone, then it is pronounced as a second tone.
    2. If a third tone comes before any other tone, then it only dips and doesn't rebound; this is called a half-third tone (see image).
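    The two rules above can be sketched as a function over a sequence of written tones. This is a minimal illustration with several assumptions: the function name `spoken_third_tones` and the label "half-3" are hypothetical, rule 2 is applied only before first, second and fourth tones (a third tone before a neutral tone is left unchanged), and each syllable looks only at the written tone of the syllable that follows. Runs of three or more third tones depend on how the words group together, which this sketch does not model.

```python
def spoken_third_tones(tones: list[int]) -> list[str]:
    """Label how each syllable is spoken after the third-tone rules.

    `tones` holds the written tone of each syllable: 1-4, or 5 for neutral.
    """
    spoken = []
    for i, tone in enumerate(tones):
        following = tones[i + 1] if i + 1 < len(tones) else None
        if tone == 3 and following == 3:
            spoken.append("2")        # rule 1: third before third -> second
        elif tone == 3 and following in (1, 2, 4):
            spoken.append("half-3")   # rule 2: only dips, no rebound
        else:
            spoken.append(str(tone))  # unchanged, e.g. a phrase-final third
    return spoken

print(spoken_third_tones([3, 3]))  # ni3 hao3
print(spoken_third_tones([3, 4]))  # ma3 shang4
```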

    Because of these broad rules, the majority of third tones you encounter will be spoken as second tones or half-third tones. Be mindful of this because the written tone marks remain unchanged despite the differences in actual pronunciation.  The following diagram shows how the third tone changes:

    The shape of the 3rd tone when before 1st, 2nd and 4th tones
    Aside from the third tone, a couple of specific characters also change tone:
    • The high-frequency character 一 (yi, "one") can be pronounced in the first, second, or fourth tone, depending on the tone that follows it.  It is pronounced in the second tone when it is followed by a fourth tone or a neutral tone; e.g. yí jiàn; yí ge.  It is pronounced in the fourth tone when it is followed by a first, second, or third tone; e.g. yì zhī; yì qún; yì zǔ.  It can also be pronounced in the first tone no matter what tone follows it; e.g. yī zhī; yī qún; yī zǔ; yī jiàn; yī ge.
    • The character 不 (bu) can be pronounced in either the second tone or the fourth tone, depending on the tone that follows it.  It is pronounced in the fourth tone when it is followed by a first, second, or third tone; e.g. bù shuō, bù néng, bùgǎn.  It is pronounced in the second tone when it is followed by a fourth tone; e.g. bú yào.  It is also pronounced in the fourth tone when nothing follows it; e.g. nà kě bù.
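    The tone changes for 一 and 不 can be summarized in two small lookup functions. This is a minimal sketch; the names `yi_tone` and `bu_tone` are hypothetical. Two cases are not stated in the text and are assumptions here: 一 with nothing after it is given its first tone, and 不 before a neutral tone is given its fourth tone. The sketch also ignores the option of always pronouncing 一 in the first tone.

```python
from typing import Optional

def yi_tone(next_tone: Optional[int]) -> int:
    """Tone of 一 given the tone that follows (1-4, 5 = neutral, None = nothing)."""
    if next_tone in (4, 5):
        return 2          # yí jiàn, yí ge
    if next_tone in (1, 2, 3):
        return 4          # yì zhī, yì qún, yì zǔ
    return 1              # assumption: first tone when nothing follows

def bu_tone(next_tone: Optional[int]) -> int:
    """Tone of 不 given the tone that follows (1-4, 5 = neutral, None = nothing)."""
    if next_tone == 4:
        return 2          # bú yào
    # Fourth tone before tones 1-3 and when nothing follows (nà kě bù);
    # assumption: also fourth tone before a neutral tone.
    return 4

print(yi_tone(4), yi_tone(1))  # 2 4
print(bu_tone(4), bu_tone(2))  # 2 4
```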

    Neutral Tones

    Some syllables don't have a tone of their own and carry no tone mark. They are not stressed, and they take their pitch from the syllable before them:

    • If it follows a first- or second-tone syllable, then the toneless syllable is mid-range.
    • If it follows a third-tone syllable, then the toneless syllable is high, as if the dip-and-rebound of the third-tone continues right into it.
    • If it follows a fourth-tone syllable, then the toneless syllable is low, as if the fall of the fourth-tone continues right into it.
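    The three cases above amount to a simple lookup keyed by the preceding tone. The following is a minimal sketch; the names `NEUTRAL_PITCH` and `neutral_pitch` are hypothetical, and a neutral syllable that follows another neutral syllable is not covered by the text, so the function raises an error for it.

```python
# Approximate pitch of a neutral-tone syllable, keyed by the preceding tone.
NEUTRAL_PITCH = {1: "mid", 2: "mid", 3: "high", 4: "low"}

def neutral_pitch(preceding_tone: int) -> str:
    """Return the rough pitch of a toneless syllable after `preceding_tone`."""
    if preceding_tone not in NEUTRAL_PITCH:
        raise ValueError(f"no rule for preceding tone {preceding_tone}")
    return NEUTRAL_PITCH[preceding_tone]

print(neutral_pitch(3))  # high
```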

    Writing in Pinyin

    There are certain rules for how to write in pinyin.  You don't really need to focus much on these rules for the purpose of this course, but it might be helpful for you to know that an apostrophe (’) is used to separate two syllables in a single word where the lack of an apostrophe would lead to ambiguity; e.g. Xī'ān, nǚ'ér.  The first letter of a proper name should be capitalized, as in Zhōngguó (China), Běijīng, and Shànghǎi.  Other than that, just learn the pinyin for vocabulary as presented over the course of this semester, and you'll be fine.

    When typing or writing pinyin, you must always remember:

    • Always include tone marks.  Pinyin without tone marks is not correct pinyin!

    There are two different ways to write tone marks in pinyin.  The first is with diacritical marks, which are lines over certain vowels.  This is how you will generally find pinyin presented in lessons for this course.  For example:

    • nǐ hǎo
    • suǒyǐ
    • yí jiàn
    • yí ge
    • yì zhī
    • bù néng
    • bú yào

    Alternatively, you could instead use numeric tone marks, using numbers to represent the tones, as follows:

    • ni3 hao3
    • suo3yi3
    • yi2 jian4
    • yi2 ge or yi2 ge5
    • yi4 zhi1
    • bu4 neng2
    • bu2 yao4

    I recommend that you use numeric tone marks when completing the quizzes, tests and exams on Laulima for this course, as they will be easier to type.  The numbers should go at the end of each syllable.  Thus, ni3 hao3 is correct, but ni3 ha3o is not correct.
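    The relationship between the two notations can be sketched as a conversion from diacritical to numeric form. This is a minimal illustration that works on one syllable at a time, so a multi-syllable word such as suǒyǐ would need to be split into syllables first; the function name `to_numeric` and its `neutral` parameter are hypothetical. It relies on Unicode normalization to separate each vowel from its tone mark.

```python
import unicodedata

# Combining diacritics mapped to numeric tone marks.
MARK_TO_TONE = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}

def to_numeric(syllable: str, neutral: str = "") -> str:
    """Convert one pinyin syllable with a diacritic to numeric form.

    `neutral` is appended to unmarked syllables ("" or "5", for example).
    """
    tone = neutral
    letters = []
    # NFD splits an accented vowel into the plain letter plus its marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in MARK_TO_TONE:
            tone = MARK_TO_TONE[ch]
        else:
            letters.append(ch)
    # NFC recomposes ü, whose two dots are not a tone mark.
    # The number goes at the end of the syllable, never in the middle.
    return unicodedata.normalize("NFC", "".join(letters)) + tone

print(" ".join(to_numeric(s) for s in "nǐ hǎo".split()))  # ni3 hao3
print(to_numeric("ge", neutral="5"))                      # ge5
```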

    This page titled 1.1: The Sound System of Mandarin is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Carl Polley (裴凯).
