Humanities LibreTexts

6.6: Form of Characters


    Traditionally, Chinese characters are subdivided into six categories according to the way they are thought to have been formed. These categories are called the 六書 liù shū ‘six scripts’, and include graphs that are derived from drawings (like 馬 ‘horse’, the earliest versions of which look quite like a horse), those that are formed as indications (like 上 and 下, which represent meaning diagrammatically), and those that are borrowed (like the graph 不, which was borrowed to represent a word of nearly identical sound, like 4 in the shorthand 4U).

    Though the ‘six scripts’ are sometimes claimed to be descriptive, in fact it requires considerable historical knowledge to decide to which type a graph belongs. For the beginner, seeking a way to gain a foothold on the sheer face of the [written] language by trying to rationalize the relationship between the sound/meaning of a word and the form of its character, there are only two useful kinds of relationship. One is pictorial, or representational: the shape of the character suggests its meaning; 上 ‘on’, 下 ‘under’, 中 ‘middle’, 心 ‘heart’. The other is relational: the character resembles another of the same or similar sound: 嗎 ma ‘Q’, sounds like 馬 ‘horse’ and 媽 ‘mother’. These two types can be labeled ‘representational’ and ‘phonosemantic’, respectively. The former are often cited for their pictorial qualities; but it is the latter, the phonosemantic, that are the most common. New characters are almost always created on the phonosemantic model.

    Representational characters

    As noted earlier, compound characters are those that can be decomposed into constituents that are themselves characters (or combining versions of characters). Non-compound characters, such as 中, 馬 or 王 (or the parts of compound characters such as 女, 生, 木 and 日) can be called ‘simplex’. It is probably true that most simplex characters derive ultimately from drawings or indications that relate to the original meaning of the graph. The following characters all have forms that can be rationalized fairly easily in terms of their meaning:

    一      二      三      上      下      中
    yī      èr      sān     shàng   xià     zhōng
    one     two     three   on      below   middle

    心      必      火      雨      米      木
    xīn     bì      huǒ     yǔ      mǐ      mù
    heart   must    fire    rain    rice    (tree), wood

    月      山      凸      叉      弓
    yuè     shān    tū      chā     gōng
    moon    hill    convex  fork    bow

    鱼/魚   鸟/鳥   伞/傘
    yú      niǎo    sǎn
    fish    bird    umbrella

    A particular graph can be viewed as representational regardless of whether the historical data supports the notion. Thus, if you agree that 伞/傘 sǎn looks [vaguely] like an umbrella, then you are regarding the graphs as representational, and that image can help you to remember them. Similarly, once the graph for xīn ‘heart’ is known, ie 心, then 必 ‘must; have to’ can be viewed as representing the notion of obligation as ‘a line crossing the heart’. Conversely, the pictorial origins of some graphs may have been obscured by historical change. The graph 象 used for xiàng ‘elephant’ may not look like an elephant until someone makes the case either by citing a more realistic earlier graph, or by drawing attention to a trunk, head, body, tail, in the modern character.

    Beginning students show great skill at creating nonsense etymologies (even for compound characters). Thus the character 哭 ‘to cry’ is seen as ‘two eyes and a tear’; or 電/电 diàn ‘electricity’ is seen as ‘an appliance with an electrical cord running out the bottom’. Or – to cite a more extreme case – 會 (会 in simplified form) ‘to be able; capable’ (among other meanings) is seen as Darth Vader, complete with helmet and breathing equipment – a man of impressive capabilities. But while it is useful to find representational elements in complex characters, it is often not possible even with a high degree of creative license. There is not much to be said for, say, 皮 ‘skin’, 衣 ‘clothes’, or 豆 dòu ‘beans’. They are simplex (and may well derive directly from representations) but their forms are difficult to account for without historical research – or a very creative imagination.

    Additive characters – or blends

    A small set of compound graphs can be interpreted as semantic blends, in which the meaning of the whole seems to be related to both its parts. Occasionally, as in the (b) examples, both meaning and sound are involved.

    a) Semantic blends

    尖 jiān ‘sharp’, made up of 小 xiǎo ‘small’ and 大 dà ‘big’, ie ‘wedge shaped’;

    忠 zhōng ‘loyal’, made up of 中 zhōng ‘middle’ and 心 xīn ‘heart’;

    信 xìn ‘believe; letter’, made up of 亻 rén ‘person’ and 言 yán ‘language’;

    孕 yùn ‘be pregnant’, made up of 乃 nǎi ‘exist’ and 子 zǐ ‘child’;

    好 hǎo ‘be good; well’, made up of 女 nǚ ‘woman’ and 子 zǐ ‘child’, ie ‘goodness’;

    尿 niào ‘urine’, made up of 尸 shī ‘body’ and 水 shuǐ ‘water’;

    屎 shǐ ‘shit’, made up of 尸 shī ‘body’ and 米 mǐ ‘rice [grain]’.

    b) Blends of sound and meaning (rare)

    甭 béng ‘no need to’, made up of 不 bù ‘not’ and 用 yòng ‘use’.

    乒乓 pīngpāng ‘pingpong’, whose graphs suggest a pingpong table, but which also take their sound from the graph, 兵 bīng ‘soldier’.

    Blends are one of the traditional character types (one of the liùshū), but most cases represent more of an apparent than a real historical process of character creation. As with the simplex characters, students and teachers frequently ignore the historical facts and enlarge the category of blends with their own etymologies: 名 míng ‘name’ from 夕 ‘evening’ and 口 kǒu ‘mouth’, explained as ‘at dusk, you have to call out names to identify people’; or 東 dōng ‘east’, made up of 日 ‘sun’ superimposed on 木 ‘wood’ (originally ‘tree’) and explained as ‘sunrise through the eastern trees’; or 杯 bēi ‘cup’, made up of 木 ‘wood’ and 不 bù ‘not’, because ‘cups aren’t made of wood’.

    Phonosemantic characters

    Once the repertoire of characters begins to grow, it becomes more effective to relate characters not to things (their referents), but to each other. Thus, as noted earlier, once 馬 ‘horse’ is learned, then it is easy to relate it to 嗎 ma ‘Q’, or 媽 ‘mother’ – or eventually to 螞 ‘ant’ and 碼 ‘number’. The historical process that gives rise to such ‘phonetic sets’ is borrowing followed by specification: 馬 is borrowed to write words similar in sound (mother, ant, number, etc.); then, to prevent confusion, the graph is specified by the addition of a classifying character (口, 女, 虫 or 石, etc.).

    Many phonetic sets are quite regular, like the 馬 set, or the following set based on 青 qīng (which, as a free form, means ‘green’ or ‘young’):

    請       情         晴      清      氰         蜻          鯖
    qǐng     qíng       qíng    qīng    qíng       qīng        qīng
    invite   feelings   clear   clean   cyanogen   dragonfly   mackerel

    In some cases, phonetic correspondences that were once regular have been obscured by historical changes in the language; such is the case for 饿 and 我, or 陳 and 東, where the pronunciation of members of the set (è and wǒ in the first case, chén and dōng in the second) remains close but is no longer identical. But even the ‘irregular’ sets show patterns of correspondence, as illustrated by the set based on 重 below, whose members begin either with zh or with d (initials that differ only slightly in their place of articulation).

    重      種         踵      腫      動     懂           董
    zhòng   zhǒng      zhǒng   zhǒng   dòng   dǒng         dǒng
    heavy   category   heel    swell   move   understand   to lead

    The common sound elements, the phonetics, are called shēngpáng in Chinese; the specifying elements, the radicals, are bùshǒu. As shown at the beginning of this lesson, radicals do have concrete meanings (言 ‘speech’, 心 ‘heart’, 日 ‘sun’, 水 ‘water’ etc.), and initially the selection of a particular radical to form a compound character would have been inspired by meaning. But in many cases, the original impetus has been obscured by linguistic and cultural change. The presence of the water radical in 海 ‘sea’, 河 ‘river’ and 洗 ‘wash’ reflects a connection with water; but its presence in 漢 Hàn ‘Chinese’, 溫 wēn ‘warm’ and 活 huó ‘to live’ is harder to explain. Ultimately, the function of radicals in compound characters is one of differentiation (活 is not 适 or 括; 漢 is not 難, 嘆 or 艱) and classification (活 and 漢 are found under the water radical).
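    The two roles described above, phonetic as a hint to sound and radical as a classifier, can be illustrated with a minimal sketch. The table below is hand-entered from characters mentioned in this section; the function name `group_by` and the (radical, phonetic) layout are assumptions made for the illustration, not a standard data format.

```python
from collections import defaultdict

# Hand-entered sample taken from the characters discussed above.
# Each compound maps to (radical, phonetic). Radicals are given in their
# full form (心, 水) even where the compound uses a combining version.
COMPOUNDS = {
    "嗎": ("口", "馬"),
    "媽": ("女", "馬"),
    "螞": ("虫", "馬"),
    "碼": ("石", "馬"),
    "請": ("言", "青"),
    "情": ("心", "青"),
    "晴": ("日", "青"),
    "清": ("水", "青"),
    "蜻": ("虫", "青"),
}


def group_by(compounds, part):
    """Group compound characters by their 'radical' or their 'phonetic'."""
    index = {"radical": 0, "phonetic": 1}[part]
    groups = defaultdict(list)
    for char, parts in compounds.items():
        groups[parts[index]].append(char)
    return dict(groups)


# Grouping by phonetic recovers the sound-based sets.
phonetic_sets = group_by(COMPOUNDS, "phonetic")
# Grouping by radical gives the dictionary-style classification.
radical_classes = group_by(COMPOUNDS, "radical")

for phonetic, members in phonetic_sets.items():
    print(phonetic, "->", " ".join(members))
```

    Grouping the same table by radical instead puts 螞 and 蜻 together under 虫, which is how a radical-ordered dictionary would file them, even though they belong to different phonetic sets.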

    Character retrieval

    Alphabetic writing systems, regardless of the regularity of their spelling, make use of relatively few symbols, so ordering titles in filing systems or words in dictionaries is a matter of alphabetization – establishing an order for the symbols and remembering it. For character writing systems, in which the number of symbols runs into the thousands, retrieval is much more problematic.

    The most common method of ordering characters (and ultimately, retrieving them) was suggested by the large number of compound characters that arose from the processes of borrowing and specification described above. Compound characters could be grouped by radical, and then subgrouped by number of additional strokes (the second of the figures written under each large-format character introduced in the sets of characters in each lesson). Thus 請 could be found under the speech-radical, 言, amongst those characters with 8 (additional) strokes; 蜻 would be under the insect-radical, 虫, 8 strokes, etc. Simplex characters that were themselves radicals (such as 言, 日, 气, 魚) would be listed at the head of their own set. Other simplex characters were brought into the same system by designating parts of their graphs – sometimes rather arbitrarily – to be radicals. Thus 中, 北, 甲 (all simplex) are assigned the radical | (the vertical stroke called shù); 也 is assigned the radical 乙 (even though the character does not contain a stroke of that shape); 元 is assigned 儿, and so on.
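    The ordering just described (first by radical, then by number of additional strokes) amounts to sorting on a two-part key. The sketch below is only an illustration under stated assumptions: the entries are hand-entered, the radical numbers (72 for 日, 142 for 虫, 149 for 言) are positions in the traditional 214-radical chart and are not given in the text, and the names `Entry`, `radical_order` and `look_up` are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    radical: str
    radical_number: int  # position in the 214-radical chart (assumed, not from the text)
    extra_strokes: int   # strokes in addition to the radical


# Hand-entered sample. A radical standing alone has 0 additional strokes,
# so it sorts to the head of its own set.
ENTRIES = [
    Entry("請", "言", 149, 8),
    Entry("蜻", "虫", 142, 8),
    Entry("言", "言", 149, 0),
    Entry("晴", "日", 72, 8),
    Entry("日", "日", 72, 0),
]


def radical_order(entries):
    """Sort by radical first, then by number of additional strokes."""
    return sorted(entries, key=lambda e: (e.radical_number, e.extra_strokes))


def look_up(entries, radical, extra_strokes):
    """Return the characters filed under a radical with a given stroke count."""
    return [
        e.char
        for e in entries
        if e.radical == radical and e.extra_strokes == extra_strokes
    ]


ordered = [e.char for e in radical_order(ENTRIES)]
print(ordered)
print(look_up(ENTRIES, "言", 8))
```

    A real index would hold many characters under each (radical, stroke count) pair, so the last step of a lookup is still a scan through a short list.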

    Eventually, by Qing times, with the publication of the great Kangxi dictionary, the number of radicals was settled at 214, ordered by numbers of strokes in each. Students of the language, like literate Chinese, who had to be able to look up characters efficiently or search through indexes ordered by radical, came to know the radical chart virtually by heart. Because of their important classificatory role, and because they are stable (each character having one radical assigned to it) and of fixed number, introductory textbooks have tended to focus on radicals (noting general meanings where possible) rather than phonetic sets. Yet both are useful, and in fact, the information on pronunciation obtained from phonetic elements is probably more useful to the learner (in allowing dictionary searches by pronunciation, for example) than the information on meaning provided by radicals, which is often too general to be of much use.

    The radical system of retrieval is not the only one in use, but it remains one of the more popular systems for looking up characters in dictionaries or other reference works in cases where the pronunciation is not known. Adoption of the simplified set of characters was accompanied by some changes in the assignment of radicals, and altered the arrangement and number of radicals in the chart. The new system has 189 rather than the traditional 214.

    The main difficulty in using the radical system is identifying the radical – particularly in simplex characters which are not themselves radicals and which were assigned a radical to make them conform to the system. Nowadays, most dictionaries are organized alphabetically by the pinyin pronunciation of the first character, but they also contain lists organized by radicals that allow a user to look up characters when the pronunciation is unknown. Only one dictionary, The ABC Chinese-English Dictionary (cited in the bibliography) is organized by pinyin and word (rather than character), so that words are ordered uniquely, irrespective of the particular character of the first syllable.

    An illustration

    The couplet pictured on the next page was observed on a shop door in the city of Zhenjiang, not far downstream from Nanjing. It provides some good examples of phonosemantic characters. Despite being a product of the Mainland, the ‘scroll’ reads vertically in the traditional fashion, right to left, ie Jùn jì ào chí, etc. Each character contains the now familiar element 馬, but this time, not as a phonetic, but as a radical, so that the set of characters shows no particular commonality of sound. Rather, they all refer to types of horses or to attributes of horses.

    The word-for-word glosses below are only very rough indications of meaning. Each set of 4 characters in a column forms a sentence consisting of an adjective and a noun, followed by an adverb and a verb. The sense is one of aspiration and hope.

           
            left line               right line
    Adj     驤 xiāng  galloping     駿 jùn  outstanding
    N       駒 jū     foal          驥 jì   fleet+horse
    Adv     驩 huān   joyously      驁 ào   proudly
    V       騰 téng   soars         馳 chí  races

    The saying is not a well known one; in fact, though they would get the gist of the meaning, many Chinese would be hard pressed to say precisely what the difference was between a 驥 jì and a 駒 jū (the second characters of each [vertical] line).

    Chinese encountering rare characters, such as [some of] those in the couplet, are quite likely to make use of radical and phonetic to remind them of meaning and pronunciation, respectively. Students of the language need the hints even more. With some allowance for 馳, which needs to be referred to other compounds (池 chí, 弛 chí) rather than just the right-hand element (也 yě), the pronunciation of the phonetic element alone matches that of the compound (except in tone). Thus 驥 and 冀 are both pronounced jì; 驁 is ào, 敖 is áo; 驤 and 襄 are both xiāng, etc.
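    The claim that phonetic and compound match ‘except in tone’ can be checked mechanically by removing the tone mark from each pinyin syllable before comparing. The sketch below is a minimal illustration using only the three pairs cited above; `strip_tone` and `PAIRS` are names introduced here for the example, and the approach assumes tones are written as the four standard pinyin diacritics.

```python
import unicodedata

# Combining marks for the four pinyin tones: macron, acute, caron, grave.
# The diaeresis of ü (U+0308) is deliberately not in this set.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone(pinyin):
    """Return the syllable with its tone mark removed (ào -> ao)."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


# (compound, its pinyin, phonetic element, its pinyin), from the text above.
PAIRS = [
    ("驥", "jì", "冀", "jì"),
    ("驁", "ào", "敖", "áo"),
    ("驤", "xiāng", "襄", "xiāng"),
]

# True where compound and phonetic share a syllable once tone is ignored.
same_syllable = [strip_tone(cp) == strip_tone(pp) for _, cp, _, pp in PAIRS]

# True where they also agree in tone.
same_tone = [
    unicodedata.normalize("NFC", cp) == unicodedata.normalize("NFC", pp)
    for _, cp, _, pp in PAIRS
]

for (compound, cp, phonetic, pp), syl, tone in zip(PAIRS, same_syllable, same_tone):
    print(compound, cp, phonetic, pp, "syllable match:", syl, "tone match:", tone)
```

    All three pairs share a syllable, and only 驁 ào and 敖 áo differ in tone, which is the pattern the paragraph describes.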


    This page titled 6.6: Form of Characters is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Julian K. Wheatley (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform.