5.1. Types of phonation

voiceless   nil     whisper
voiced      modal   creak   breathy   harsh   falsetto

The basic features of the laryngeal adjustments to the different phonatory settings can be summarized, as proposed by Hirose (1996:127), as follows:

  1. abduction or adduction of the vocal folds;
  2. constriction of supraglottal structures;
  3. adjustment of length, stiffness and thickness of the vocal folds;
  4. elevation and lowering of the larynx.

The tension and adjustment forces acting on the vocal folds are depicted in Fig.5.

Figure 5. The tension and adjustment forces (from: Ní Chasaide & Gobl, 1997:444).

The active longitudinal tension of the vocal folds is achieved through contraction of the vocalis muscle, whereas the passive longitudinal tension is achieved through contraction of the cricothyroid muscle. The medial tension (compression) is obtained by contracting the lateral thyroarytenoid muscles. The adductive tension is caused by contraction of the interarytenoid muscles and the lateral cricoarytenoid muscles. Each phonatory class has a different specification in terms of these physiological parameters. Below, the influence of the tensions and adjustments of the vocal folds on the phonation process and on voice quality is described briefly (after Eckert & Laver, 1994).
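
The correspondence between tension parameters and muscles stated above can be written down as a small lookup table. This is only a sketch: the dictionary layout, the key names and the helper `tensions_involving` are illustrative assumptions, and the entries simply restate the paragraph above.

```python
# Tension parameters and the muscles named in the text for each of them.
# Layout and names are illustrative assumptions, not an established scheme.
TENSION_MUSCLES = {
    "active longitudinal tension": ["vocalis"],
    "passive longitudinal tension": ["cricothyroid"],
    "medial compression": ["lateral thyroarytenoid"],
    "adductive tension": ["interarytenoid", "lateral cricoarytenoid"],
}


def tensions_involving(muscle):
    """Return the tension parameters to which the given muscle contributes."""
    return [
        tension
        for tension, muscles in TENSION_MUSCLES.items()
        if muscle in muscles
    ]


if __name__ == "__main__":
    for tension, muscles in TENSION_MUSCLES.items():
        print(f"{tension}: {', '.join(muscles)}")
```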

Voicelessness (nil phonation) is realized in one of two ways: either the airflow from the lungs is blocked by fully adducted vocal folds, or the vocal folds are widely abducted and the glottis is wide open, so that the airflow is laminar. In both cases no sound is generated and no acoustic energy is injected into the vocal tract.

At higher flow rates, voiceless airflow becomes turbulent even with widely abducted vocal folds. This type of phonation is called breath. An obvious example is the pronunciation of [h] at the beginning of a word like German [h a n t], where the volume velocity can reach about 1000 cc/s (cubic centimetres per second).

Figure 6. Voiceless whisper phonation. Arrows mark the tensions
according to Fig.5 (see Eckert & Laver, 1994:80)


Whisper phonation (Fig.6) is characterized by a triangular opening of the cartilaginous glottis (the shape of an inverted Y). Adductive tension is very low, while medial compression and longitudinal tension are moderately high.

Whisper sound quality is produced by turbulence generated by the friction of the air in and above the larynx, with the vocal folds not vibrating.

 (WAV file,  10 kB)

Apart from its rather rare linguistic uses, whisper is widely used paralinguistically to signal secrecy and confidentiality.

The phonation types that involve vibration of the vocal folds are quite different and much more varied. The aerodynamic aspects of vocal fold movements have already been addressed above, so the following description concentrates on the effects of muscular settings on vocal fold movements.

The neutral mode of phonation is modal voiced phonation. In the normal case the vibration of the vocal folds is periodic with full closure of the glottis, so no audible friction noise is produced when air flows through the glottis. All muscular adjustments are at a moderate level, and the frequency of vibration and the loudness are in the lower to middle part of the range normally used in conversation. The modal phonation of a male speaker occurs at an average of 120 Hz, while for a female speaker it is approx. 220 Hz. For voiced sounds the glottis is closed or nearly closed, whereas for voiceless sounds it is wide open; in fact, the distance between the folds amounts to only a fraction of a millimetre. The degree of opening and its timing are relative to the articulatory gestures and depend on the phonetic environment of the generated sound. The average flow rate is between 100 and 350 cc/s.
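
To make the quoted average frequencies more tangible, they can be converted into the duration of a single vibratory cycle, T = 1/F0. The following is a minimal sketch; the function name `period_ms` and the dictionary `MODAL_F0_HZ` are assumptions introduced for illustration, and the two frequency values are the averages given above.

```python
def period_ms(f0_hz):
    """Duration of one vibratory cycle in milliseconds (T = 1000 / F0)."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    return 1000.0 / f0_hz


# Average modal fundamental frequencies quoted in the text (Hz).
MODAL_F0_HZ = {"male": 120.0, "female": 220.0}

if __name__ == "__main__":
    for speaker, f0 in MODAL_F0_HZ.items():
        print(f"{speaker}: {f0:.0f} Hz -> {period_ms(f0):.2f} ms per cycle")
```

With these values one cycle lasts roughly 8.3 ms for the male and 4.5 ms for the female average.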

 (WAV file,  16 kB)

One of the characteristics of modal phonation is the build-up of the contact between the vocal folds. During the open phase of vibration the glottis has a triangular form with a wider opening at the arytenoids. As the vocal folds close, they do not make contact everywhere at the same time but with a vertical phase difference (in accordance with the body-cover model of vocal fold vibration), with the lower parts of the edges closing and opening before the upper edges⁶. For this reason the contact area is triangular during the opening and closing of the vocal folds, and, consequently, the glottis takes on the shape of a tetrahedron, as depicted in Fig.7.

Figure 7. Triangular shape of the contact between the vocal folds.


Figure 8. Creaky voice phonation. Arrows mark the tensions
according to Fig.5 (see Eckert & Laver, 1994:77)


Creak phonation (also called vocal fry) is likewise produced with vibrating vocal folds, but at a very low frequency. The vocal folds (Fig.8) are strongly adducted and have weak longitudinal tension. Both these factors cause the vocal folds to thicken. Additionally, they may come into contact with the false folds, creating an unusually thick and slack structure.

The resulting low tension and heavy vibrating mass are responsible for the slower and irregular vibration. Both the subglottal pressure and the glottal airflow are lower than in modal phonation. Creak is produced at a flow rate of 12-20 cc/s, with pulses in a frequency range from 25 to 50 Hz.
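
The flow rates and frequencies quoted for creak and for modal voice can be combined into a rough estimate of the mean air volume passing the glottis per vibratory cycle (flow rate divided by frequency). This is a back-of-the-envelope sketch only: the function names are hypothetical, and pairing the modal flow range of 100-350 cc/s with the male average of 120 Hz is an assumption made for the comparison, not something stated in the sources.

```python
def volume_per_cycle_cc(flow_cc_per_s, f0_hz):
    """Mean air volume per glottal cycle in cc: flow rate / frequency."""
    return flow_cc_per_s / f0_hz


def volume_range_cc(flow_range, f0_range):
    """Smallest and largest per-cycle volume for the given ranges.

    The minimum pairs the lowest flow with the highest frequency,
    the maximum pairs the highest flow with the lowest frequency.
    """
    lowest = volume_per_cycle_cc(min(flow_range), max(f0_range))
    highest = volume_per_cycle_cc(max(flow_range), min(f0_range))
    return lowest, highest


# Values quoted in the text: creak 12-20 cc/s at 25-50 Hz.
CREAK = volume_range_cc((12.0, 20.0), (25.0, 50.0))
# Assumption: modal flow range combined with the male average of 120 Hz.
MODAL_MALE = volume_range_cc((100.0, 350.0), (120.0, 120.0))

if __name__ == "__main__":
    print(f"creak: {CREAK[0]:.2f}-{CREAK[1]:.2f} cc per pulse")
    print(f"modal: {MODAL_MALE[0]:.2f}-{MODAL_MALE[1]:.2f} cc per cycle")
```

Under these assumptions a creak pulse passes about 0.24-0.80 cc of air, against about 0.83-2.92 cc per cycle in modal voice.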

 (WAV file,  16 kB)

Figure 9. Breathy voice phonation. Arrows mark the tensions
according to Fig.5 (see Eckert & Laver, 1994:77)


Breathy voice is normally regarded as a compound phonation type (voiceless+modal), but I have decided to view it as an independent phonation type because its adjustments of the laryngeal structures differ from those of the other phonation modes. Muscular tension is low, with minimal adductive tension, weak medial compression and medium longitudinal tension of the vocal folds (Fig. 9).

Vocal fold vibration is inefficient and, because of the incomplete closure of the glottis, there is constant glottal leakage, which produces audible friction noise. Air flows between the vocal folds at a high rate. The frequency of vibration is just below the value typical of modal voice.

 (WAV file,  17 kB)

Breathy voice differs from voiced whisper because of the weaker medial compression and the smaller degree of voicing effort. However, as pointed out by Laver (1980), there is no clear perceptual boundary between whispery and breathy voice.

Figure 10. Harsh voice phonation. Arrows mark the tensions
according to Fig.5 (see Eckert & Laver, 1994:88)


Harsh voice (Fig.10) is due to the very strong tension of the vocal folds (especially medial compression and adductive tension), which results in an excessive approximation of the vocal folds. When the whole larynx is subjected to this extremely high tension, the upper larynx becomes highly constricted with the ventricular folds pressing on the upper surfaces of the vocal folds, making their vibration ineffective.

Harsh phonation is therefore irregular in both cycle duration and amplitude. The characteristic fundamental frequency is above 100 Hz.

 (WAV file,  18 kB)

A lighter degree of tension is sometimes described as tense voice⁷.

Figure 11. Falsetto phonation. Arrows mark the tensions
according to Fig.5 (see Eckert & Laver, 1994:84)


The frequency of vibrations in falsetto phonation is noticeably higher than in modal voice. The vocal folds are stretched longitudinally, thus becoming relatively thin. Consequently, the vibrating mass is smaller and the generated tone higher (eq.(1)). The adduction of the folds is high and the medial compression is also strong (Fig.11).

The glottis often remains slightly open, resulting in low subglottal pressure (due to constant glottal leakage) and an audible friction noise component.

 (WAV file,  8 kB)
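
The muscular settings described for the individual phonation types can be collected in one place for comparison. The sketch below is only a restatement of the wording used in this section: the table layout, the key names and the helper `types_with` are illustrative assumptions, `None` marks a parameter for which the text gives no level, and the "high" longitudinal tension for falsetto is my reading of "stretched longitudinally".

```python
# Tension levels per phonation type, in the wording of the text.
# None = no level is stated in the text for that parameter.
SETTINGS = {
    "whisper": {"adductive": "very low", "medial": "moderately high",
                "longitudinal": "moderately high"},
    "modal": {"adductive": "moderate", "medial": "moderate",
              "longitudinal": "moderate"},
    "creak": {"adductive": "strong", "medial": None,
              "longitudinal": "weak"},
    "breathy": {"adductive": "minimal", "medial": "weak",
                "longitudinal": "medium"},
    "harsh": {"adductive": "very strong", "medial": "very strong",
              "longitudinal": None},
    # Assumption: "stretched longitudinally" is read as high tension.
    "falsetto": {"adductive": "high", "medial": "strong",
                 "longitudinal": "high"},
}


def types_with(parameter, level):
    """Phonation types whose given tension parameter has the given level."""
    return [name for name, s in SETTINGS.items() if s[parameter] == level]


if __name__ == "__main__":
    header = f"{'type':<10}{'adductive':<14}{'medial':<18}{'longitudinal'}"
    print(header)
    for name, s in SETTINGS.items():
        cells = [s[p] or "not stated"
                 for p in ("adductive", "medial", "longitudinal")]
        print(f"{name:<10}{cells[0]:<14}{cells[1]:<18}{cells[2]}")
```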

Not all phonation types are mutually exclusive; on the contrary, some of them combine to modify phonation. Only modal and falsetto are incompatible, because they use the structure of the larynx differently. The possible combinations of phonation types are given in Tab. 2. The compound phonation types are used solely on the para- and extralinguistic layers of communication.

7. The term tense voice is also used to describe a higher degree of tension in the entire vocal tract (Ní Chasaide & Gobl, 1997:451).