The visual methods

In earlier years, direct laryngoscopy was performed by looking into a tube inserted directly into the pharynx. This requires anaesthesia, disturbs articulation and is highly uncomfortable for the speaker. The use of fiberoptics (Sawashima & Hirose, 1968) permits a better view of laryngeal behavior. The flexible glass fibres are introduced through the nose and positioned near the tip of the epiglottis. One light guide conducts the light needed to illuminate the larynx, while a second guide transmits the image from the objective to the external scope or a screen. The nasal muscles and the walls of the epiglottis are anaesthetized (Hirose, 1996:118). Although articulation remains possible with fiberoptics, both methods are invasive and hence uncomfortable for speakers.

Visual observation of the vocal folds and other laryngeal structures makes it possible to substantiate the auditory evaluation. In indirect laryngoscopy, the larynx is viewed via a mirror inserted into the back of the mouth.

Davis (1978:135) noted that "The gross structure and movements of the vocal folds are observed; however, the amount of detail made available by indirect laryngoscopy is limited because the field of view is small and the distance from the larynx is relatively long". The method was first developed by Manuel Garcia in the mid-19th century. Although the rapid vibratory motions of the vocal folds cannot be observed directly, the use of a stroboscope flashing at a high rate yields a slowly moving image of the vocal folds (stroboscopic laryngoscopy). The image, which is composed of many vibration cycles, does not show the fine details of the movements.
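The stroboscopic effect described above can be illustrated numerically. The following is a minimal sketch, not a description of any particular stroboscope: it assumes a strictly periodic vibration and a flash rate slightly offset from the fundamental frequency. The function names and the example values (120 Hz voice, 118.5 Hz flashes) are hypothetical and chosen only for illustration.

```python
def apparent_frequency(f0_hz, flash_hz):
    """Rate (in Hz) of the apparent slow motion seen when a vibration
    at f0_hz is lit by flashes at flash_hz.

    Each flash samples one instant of the cycle, so the visible motion
    repeats at the distance between f0 and the nearest integer multiple
    of the flash rate (ordinary sampling aliasing).
    """
    k = round(f0_hz / flash_hz)
    return abs(f0_hz - k * flash_hz)


def flash_phases(f0_hz, flash_hz, n_flashes):
    """Phase (0..1) of the vibratory cycle illuminated by each flash."""
    return [(i * f0_hz / flash_hz) % 1.0 for i in range(n_flashes)]


if __name__ == "__main__":
    f0 = 120.0     # assumed fundamental frequency of the voice, Hz
    flash = 118.5  # assumed flash rate, slightly below f0, Hz
    print("apparent frequency (Hz):", apparent_frequency(f0, flash))
    print("phases of first flashes:", flash_phases(f0, flash, 5))
```

Because consecutive flashes fall in different vibration cycles, the slow-motion picture is assembled from many cycles, which is why cycle-to-cycle irregularities are averaged out of the image.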

Vocal fold movements can be analysed more precisely using the technique of high-speed filming, which was pioneered by Farnsworth in 1940 (also Moore, 1968; Koike & Takanashi, 1971). In these first attempts a high-speed camera, a special light source and a laryngeal mirror were used, with an exposure approximately every 0.5 ms. Further developments include simultaneous acoustic recordings and the use of a laryngeal tele-endoscope (Hirose, 1988) with higher picture resolution and dynamics (brightness). The main disadvantage of the method is its high cost. It is also invasive and uncomfortable, and the insertion of the tele-endoscope limits the ability to observe natural articulatory gestures.
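To give a feel for what an exposure every 0.5 ms means, the short sketch below converts the exposure interval into a frame rate and into the number of frames captured per vibratory cycle. The fundamental frequencies of 100 Hz and 200 Hz are assumed example values, not figures from the studies cited.

```python
def frames_per_cycle(exposure_interval_ms, f0_hz):
    """Number of film frames that fall within one vibratory cycle."""
    frame_rate_hz = 1000.0 / exposure_interval_ms  # frames per second
    return frame_rate_hz / f0_hz


if __name__ == "__main__":
    interval_ms = 0.5  # exposure interval given in the text
    for f0 in (100.0, 200.0):  # assumed example fundamental frequencies
        print(f0, "Hz:", frames_per_cycle(interval_ms, f0), "frames per cycle")
```

At this rate (2000 frames per second) a 100 Hz vibration is covered by 20 frames per cycle and a 200 Hz vibration by 10, so the temporal detail available within a single cycle decreases as pitch rises.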

Movie: stroboscopic view of the movement of the vocal folds (SGI movie file, 1.4 MB)


The independent work of Zemlin (1959, see: Zemlin, 1988) and Sonesson (1959) led to the development of the transillumination technique. A beam of light is directed into the speaker's neck just below the cricoid cartilage. As the vocal folds open and close, a varying amount of light passes through the glottis to a photoelectric sensor situated in the pharynx, so the strength of the modulation of the light beam depends on the size of the glottal area. Anaesthesia is also required. The method was further modified by various researchers: Frokjaer-Jensen (1967) introduced a small photosensor which was inserted through the nasal cavity. Sawashima (1968) reversed the process by inserting a fiberoscope for illumination and mounting a photosensor at the anterior neck. This modification facilitates the study of laryngeal behavior during articulation. The methods are called photoglottography (PGG) and transillumination, respectively. The validity of the technique was confirmed by various researchers, who concluded that "PGG reveals essentially the same information on glottal area function as that provided by ultra high-speed photography" (Harden, 1975).

Other methods of visual inspection of laryngeal behavior include radiography and computer tomography. In the former, later modified by Hollien (1964; Hollien & Coleman, 1970) and called laminography, a frontal section of the larynx is obtained by moving the film during exposure proportionally in the direction opposite to the simultaneous movement of the x-ray source. A view of the larynx is possible even during phonation (Zemlin, 1988:140). Additionally, combining stroboscopy with laminography (the X-ray beam is pulsed at the rate of the vocal fold vibrations) permits slow-motion filming of the vibration (strobolaminography, STROL). The latter (computer tomography) currently allows only a static view of the laryngeal structures. However, it is to be expected that in the near future the examination of motion will also be possible.

Ultrasonography is not widely used to observe the functions of the larynx, primarily due to the poor quality of the produced pictures. The technique is used to observe articulatory movements (Stone, 1997:20-24) rather than laryngeal behavior.


The latest technique uses a laser to measure the excursion of the vocal folds during phonation. As in endoscopy, the light source and the fiberscope are inserted into the larynx. An infrared laser diode (wavelength 904 nm) is fitted in the head of the fiberscope, and as the vocal folds move, the reflected light is modulated by their vibration (Igielski et al., 1995; Bresinska, 1996). This method requires anaesthesia, and although it delivers very detailed information about phonation (one can even observe the behavior of selected parts of the vocal fold), normal articulation is still not possible due to the insertion of the tele-endoscope.

Direct measurements of the transglottal pressure

The aeroacoustic measurements of laryngeal behavior include the direct and indirect measurement of transglottal pressure. For the latter method a technique of inverse filtering is used.

The direct measurement of transglottal pressure involves placing miniature pressure transducers, fitted into small catheters, below and above the vocal folds (Kitzing et al., 1982). Because it is invasive, this method is rarely employed. Although local anaesthesia is needed to insert the catheters (which may affect the phonation process), the results are important for other methods, especially for the modelling of vocal fold movement (see section 25). Based on the results of their investigation, Kitzing et al. stated that:

1. The maximum subglottal pressure during normal vowel phonation (/a/ and /e/) by a male speaker is about 14-18 cm H2O, while minimum pressure depends on the vowel and lies between 8.8 and 9.5 cm H2O.

2. The mean subglottal pressure for creaky voice is smaller and comes close to 6 cm H2O. For creaky voice, the pressure decreases and increases rapidly during the opening and closing of the glottis, while remaining constant during the closed portion of the cycle.

3. The subglottal pressure for breathy voice is comparable to that of creaky phonation.

4. For rising pitch (up to falsetto) the subglottal pressure remains unchanged compared to normal phonation.

5. For a falling pitch pattern the pressure drops from 13.9 cm H2O to 12.6 cm H2O.

These results are used in the modelling of the glottal source (section 27.1).
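Since modelling work is often carried out in SI units, the reported pressures can be converted from cm H2O to pascals. The sketch below is only an illustration: it uses the conventional factor 1 cm H2O = 98.0665 Pa, and the dictionary labels are hypothetical shorthand for the findings listed above, not terms taken from Kitzing et al.

```python
CMH2O_TO_PA = 98.0665  # pascals per cm H2O (conventional value)

# Values in cm H2O as quoted in the list above; labels are illustrative.
REPORTED_CMH2O = {
    "normal phonation, maximum (range)": (14.0, 18.0),
    "normal phonation, minimum (range)": (8.8, 9.5),
    "falling pitch, start and end": (13.9, 12.6),
}


def cmh2o_to_pa(pressure_cmh2o):
    """Convert a pressure from cm H2O to pascals."""
    return pressure_cmh2o * CMH2O_TO_PA


if __name__ == "__main__":
    for label, (first, second) in REPORTED_CMH2O.items():
        print(f"{label}: {cmh2o_to_pa(first):.0f} and {cmh2o_to_pa(second):.0f} Pa")
    # Creaky voice: mean subglottal pressure close to 6 cm H2O.
    print(f"creaky voice, mean: about {cmh2o_to_pa(6.0):.0f} Pa")
```

The values work out to roughly 0.6 to 1.8 kPa, which is a convenient order of magnitude to keep in mind when choosing the driving pressure for a source model.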
