Labelling of voice quality

4. Labelling of voice quality


As in any phonetic description, the description of voice needs appropriate, unambiguous and distinctive labels. Abercrombie (1967) and later Laver (1991:173) use three strands, all simultaneously and continuously (permanently) present, that describe the segmental features of voice, the features of voice quality and the features of voice dynamics. Thus, the description of voice involves the corresponding labelling of any of these strands. Laver distinguished impressionistic and phonetic labels of voice (Laver, ibid.). The former require an audible demonstration of the type of voice referred to before the listener can construct an accurate interpretation of the label (for example "flat", "thin", "bird-like" or "velvety", or other so-called "imitation labels"). The latter should be part of a well-organized vocabulary and should have an exact and agreed-upon definition which can be assigned to a label by a group of trained phoneticians. Phonetic labels of voices consist of sets of labels that cover all aspects of voice production, assuming standard anatomy and physiology. In fact, they act as instructions for achieving a certain articulation with a certain voice quality (e.g. loud, slow, nasalized, harsh, whispery, creaky, falsetto). Unfortunately, no standardized labelling system of voice quality exists as yet, and phonetic labels are not mutually exclusive and are sometimes ambiguous.

In the linguistic literature voice quality is generally looked at from two perspectives: is it phonemic or non-phonemic?

Phonemic voice quality has a contrastive function in the phonological system of a language. In most languages a contrast between segments is achieved on an articulatory basis rather than by different phonation types (defined in section 5.1), although, for example, breathiness is phonemic for vowels in Gujarati and for stops in Igbo (Ladefoged & Maddieson, 1996:47, 304). The languages using phonation contrasts are summarized in Table 1.

Table 1: Examples of languages which use phonation types distinctively (after: Ladefoged & Maddieson, 1996)


Language	Contrastive phonation types
most languages	voiced vs. voiceless: contrast among obstruents only
Icelandic	voiced vs. voiceless: contrast between nasals¹
Ik, Dafla, Amerindian languages of the Plains and Rockies, Bantu languages of the Congo basin, Indo-Iranian languages of the border region	voiced vs. voiceless: contrast between vowels
Gujarati, !Xóõ	modal vs. breathy voice: contrast between vowels
Indo-Aryan languages	modal vs. breathy voice: contrast between voiced stops
Mpi	modal vs. stiff (slightly creaky) voice: contrast between tones
Parauk	slightly breathy vs. slightly stiff voice: contrast between tones
Jalapa Mazatec	modal vs. breathy vs. creaky: contrast between vowels
Korean	stiff vs. modal voice: contrast for voiced stops
Javanese	stiff vs. slack voice: contrast for voiced stops

¹Jessen & Pétursson (1997)

Changes in voice source behavior may be associated with segmental or suprasegmental elements on the linguistic layer of communication. Of the different phonation types (see section 5.1), modal, creaky (laryngealized), breathy and harsh voice (Ní Chasaide & Gobl, 1997:452) are used linguistically. It is rather striking that the tense/lax voice opposition (in the sense of the degree of overall muscular tension) is used linguistically (Maddieson & Ladefoged, 1985). In a segmental context voice quality is used contrastively for vowels and consonants in South African, South East Asian and native North American languages, as shown in Table 1 (Ladefoged & Maddieson, 1996; Ní Chasaide & Gobl, 1997). Although the laryngeal differences are associated with voice quality distinctions between consonants, they are primarily located at the onset or offset of a vowel. For example, in the breathy nasals of Tsonga the acoustic effects mostly affect the vowel onset, and vocal fold abduction for the breathy voiced nasal begins during the nasal consonant (Ní Chasaide & Gobl, 1997:454). Suprasegmental properties such as intonation, tone or stress also affect the production of voice. In this regard the respective characteristics are perceived to be dependent on the language used. Studies have shown that listeners with different native languages judge voice quality differently (Hurme & Sonninen, 1986). In other words, judgements of voice quality are affected by a listener's phonological system (Lin, 1995:18).

An interesting but as yet little-researched function of voice quality is that it is perceived unconsciously. Independently of what is said, a voice can be perceived as friendly, curious, vicious, off-putting etc. Helmholtz (1863) named this direct perception of emotions based on voice quality "unbewußtes Schließen" (unconscious inference).

Another issue concerning voice quality is its contribution to what is commonly called pathological voice. As already mentioned above, the labelling of different voices is not unambiguous and the perception of voice quality is not universal, as it depends on both cultural differences in general and the phonological system of a listener's native language. The description of pathological voice, however, attempts to be universal and is based primarily on more abstract laryngeal functions.

Among the various systems of pathological voice description, the most common ones concentrate on the degree of "hoarseness" (Hirano, 1981; Nawka & Anders, 1996). Hoarseness is a term used to explain the perceived voice abnormality as originating at the voice source rather than resulting from abnormalities in vocal tract configuration, and it is perceptually related to noise generation during phonation. The perception of voice abnormality through hoarseness can be graded if we provide a detailed and language-independent description of a voice quality. Hirano (ibid.) proposes a scale of voice judgements which includes quantifiable perceptual dimensions related to a set of descriptive parameters for acoustic phenomena (Lin, 1995:20). The factors involved in the classification include:

- the degree of hoarseness (G): the amount of noise in the produced sound,
- the grade of roughness (R): related to the irregular fluctuation of the fundamental frequency,
- the grade of breathiness (B): the fraction of non-modulated turbulence noise in the produced sound,
- asthenicity (A): the overall weakness of the voice, and
- "strained quality" (S): the tenseness of the voice, i.e. overall muscular tension.

Each of these factors can be graded from 0 to 3. This labelling system is known as the GRBAS classification (Isshiki & Takeuchi, 1970; Hirano, 1981, 1989).
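As an illustration, a single GRBAS judgement can be held in a small record that enforces the 0 to 3 range for each of the five factors. The following is only a minimal sketch: the class name `GRBAS`, the method `label()` and the compact string form (e.g. "G2R1B2A0S1") are assumptions made for this example and are not prescribed by the cited authors.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GRBAS:
    """One listener's rating (hypothetical helper); each factor is graded 0 to 3."""
    G: int  # degree of hoarseness
    R: int  # roughness
    B: int  # breathiness
    A: int  # asthenicity
    S: int  # strained quality

    def __post_init__(self):
        # Reject any grade outside the 0..3 scale described in the text.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not 0 <= value <= 3:
                raise ValueError(f"{f.name} must be an integer from 0 to 3, got {value!r}")

    def label(self) -> str:
        # Compact notation chosen for this sketch only.
        return "".join(f"{f.name}{getattr(self, f.name)}" for f in fields(self))


rating = GRBAS(G=2, R=1, B=2, A=0, S=1)
print(rating.label())  # G2R1B2A0S1
```

Storing the grades as separate fields rather than as a string keeps each perceptual dimension available for comparison across listeners.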

It is widely used in the US and Japan. In Europe the labelling of asthenicity (A) has been criticized as highly correlated with breathiness. Also, judgments of the tenseness of voice diverge considerably. For this reason a simpler system, the so-called RBH system (Wendler et al., 1986; Nawka & Anders, 1996:8), which is based on only three perceptual dimensions (roughness, breathiness and hoarseness), has come into use.

Listen to a voice graded to R3B2H3 (WAV file, 100 kB)
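A label such as R3B2H3 can be split back into its three perceptual dimensions mechanically. The sketch below assumes that each RBH dimension is graded from 0 to 3, like the GRBAS factors, and that the label is written in the fixed order R, B, H; the function name `parse_rbh` is hypothetical.

```python
import re

# Assumption: each dimension carries a single grade from 0 to 3, in the order R, B, H.
RBH_PATTERN = re.compile(r"R([0-3])B([0-3])H([0-3])")


def parse_rbh(label: str) -> dict:
    """Split an RBH label such as 'R3B2H3' into its three grades."""
    match = RBH_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a valid RBH label: {label!r}")
    r, b, h = (int(g) for g in match.groups())
    return {"roughness": r, "breathiness": b, "hoarseness": h}


print(parse_rbh("R3B2H3"))
# {'roughness': 3, 'breathiness': 2, 'hoarseness': 3}
```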

In Laver's (1991) framework it is possible to describe non-pathological voice qualities in a relatively objective manner.

Perceived voice quality can be described using phonetic settings (Table 2). The settings are grouped into:

- laryngeal settings,
- supralaryngeal settings,
- overall muscular tension settings.

The description of a particular setting is usually given in terms of the degree of deviation from a neutral setting. The neutral setting is defined as a normal position relative to possible adjustments (Laver, 1991:186). Within this description voice quality is regarded as a superposition of a setting and an "organic component" which, to a wide extent, characterizes the baseline of the speaker's voice, i.e. its neutral setting.

Table 2: Phonetic settings of voice quality (from Laver, 1991:227)
Supralaryngeal settings	Laryngeal settings
Longitudinal axis settings: labial (labial protrusion, labiodentalization); laryngeal (raised larynx, lowered larynx)	Simple phonation types: modal voice, falsetto, whisper, creak
Latitudinal axis settings: labial (close rounding, open rounding, lip-spreading); lingual tip/blade (tip articulation, blade articulation, retroflex articulation); tongue-body (dentalized, palato-alveolarized, palatalized, velarized, pharyngealized, laryngopharyngealized); mandibular (close jaw position, open jaw position, protruded jaw position, retracted jaw position)	Compound phonation types: whispery voice, whispery falsetto, creaky voice, creaky falsetto, whispery creak, whispery creaky voice, whispery creaky falsetto, breathy voice, harsh voice, harsh falsetto, harsh whispery voice, harsh whispery falsetto, harsh creaky voice, harsh creaky falsetto, harsh whispery creaky voice, harsh whispery creaky falsetto
Velopharyngeal settings: nasal, denasal	Overall muscular tension settings: tense voice, lax voice
