13. Electroglottographic measures of paralinguistic factors.

The use of electroglottography on the paralinguistic layer of communication can be divided into two areas of interest:

Both areas are usually researched in combination and the description here will also follow this tradition.

The classification of voice quality using electroglottography has a long history. The first qualitative descriptions of the Lx waveform for various voice qualities were given by Fourcin and Abberton in the 1970s (Fourcin & Abberton, 1971; Fourcin, 1974). The relation between the Lx waveform and stroboscopically observed glottal opening was established by, among others, Fourcin & Abberton in 1977. In their investigations the Gx component of the signal was used to find evidence of movements of the whole larynx (Abberton, 1972).

The use of the EGG for studying pathological voice quality was pioneered by Wechsler (1977), who observed both closure deficiencies and irregularities in patients. The primary application of the EGG device lay in the reliable measurement of pitch and its deviations. Wechsler (ibid.) suggested that the EGG might be useful in detecting anomalous laryngeal behavior even when the voice appears normal. Neil, Wechsler and Robinson (1977) give an example of a laryngeal disorder which improved after therapy: although the voice was auditorily normal, the EGG waveform was still peculiar in shape.

Fourcin (1993) studied normal, creaky and breathy voices using the laryngograph. He claims that three main factors contribute to the perception of voice quality: the regularity of vocal fold vibration, the rapidity of closure, and the closed phase duration (which is only partially responsible for voice quality). These claims are supported by examples in which differences in perceived voice quality are related to distinct closure rapidities and closed phase durations observed in the EGG domain. All three factors are measurable in the EGG waveform and "[...] they have the further (and more important) advantage of being obtained directly from the source of excitation in the larynx, rather than being inferred from measurements of the acoustic output from the vocal tract" (Fourcin, 1993:44). Fourcin also argues for examining long samples of speech in the investigation of laryngeal diseases.
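The first of these factors, regularity of vibration, can be illustrated with a small numerical sketch. The function below is not Fourcin's own procedure. It assumes that the durations of successive vocal fold cycles have already been extracted from the EGG signal, and it computes a generic relative period perturbation (the mean absolute difference between neighbouring periods divided by the mean period). The name `period_irregularity` and the example values are hypothetical.

```python
def period_irregularity(periods):
    """Relative cycle-to-cycle period perturbation (0.0 = perfectly regular).

    periods: durations of consecutive vocal fold cycles, e.g. in ms,
             assumed to have been measured from the EGG waveform beforehand.
    """
    if len(periods) < 2:
        raise ValueError("need at least two cycles")
    # Absolute change in duration between each pair of neighbouring cycles.
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


# Hypothetical period sequences (ms): steady vs. alternating long/short cycles.
regular = [8.0, 8.0, 8.0, 8.0]
irregular = [8.0, 12.0, 8.0, 12.0]
print(period_irregularity(regular))
print(period_irregularity(irregular))
```

A value near zero corresponds to a steady vibration; larger values indicate the kind of cycle-to-cycle irregularity that the text associates with deviating voice qualities. The other two factors (closure rapidity and closed phase duration) would need the waveform itself rather than the period sequence.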

Hanson et al. (1988, 1990) provide an elaborate study of laryngeal paralysis, primarily comparing photoglottography (PGG) waveforms, although they also use the EGG. Despite their conclusion that electroglottography alone cannot provide the information necessary to calculate the OQ and the SQ, which were used as primary indices of the type of paralysis, they provide evidence from EGG waveforms for all types of paralysis. The results are based on recordings of 49 patients producing a sustained /i:/. At least 50 representative cycles of phonation were analysed. The EGG waveforms of the patients suffering from recurrent nerve paralysis were less flat (irregular) in the open phase and had a shorter contact phase than those of normal voices. The pattern of isolated superior laryngeal nerve paralysis differed from that of recurrent nerve paralysis and from normal voices in the shorter plateau of the no-contact phase. The EGG signals of idiopathic paralysis patients indicated an even shorter closed phase and a flat open phase. Although the relation of the EGG to the glottal area waveform varies, the closure is consistently delayed compared to the PGG. Hanson et al. conclude that the EGG signal primarily reflects the degree of contact between the upper edges of the vocal folds, and they caution that when the folds fail to approximate normally, the signal-to-noise ratio of the EGG signal decreases substantially, rendering the EGG less useful.

The study of Motta et al. (1990) documents EGG waveform patterns in functional and organic dysphonia. In the experiment, 50 modal speakers, 151 patients with functional dysphonia (hypo- and hyperkinetic) and 231 patients with organic dysphonia (nodules, polyps and Reinke's edemas) were examined while producing a sustained /e:/ at a constant fundamental frequency. The recordings were repeated after surgical or logopedic treatment, and the data were inspected visually. A sharp peak in the waveform was determined to be the primary correlate of functional dysphonia, whereas for hyperkinetic dysphonia the contact phase of the EGG was broad-peaked (with a "plateau" in the closed phase). These waveform deviations disappeared after logopedic treatment. In organic dysphonia the waveform was also altered. A single notch (an additional, accidental and rather broad peak) occurred in the adductory phase of the vocal fold movements of patients suffering from nodules, polyps or Reinke's edemas. Sometimes a double notch was registered in the ascending portion of the glottal wave for polyps and Reinke's edemas. For the latter disorder irregular traces were observed. Motta et al. note, however, that for 28% of the patients with nodules, 7% of those with polyps and 4% of those with Reinke's edemas, the EGG waveform was indistinguishable from that of modal speakers. In cases of benign neoplasms the notches in the EGG waveform disappeared after removal of the neoplasms; a complete normalization of the EGG waveform, however, was achieved only after logopedic therapy. Overall statistics show that the diagnosis based on EGG evaluation was false in only 17.75% of the cases of organic dysphonia and in 5.3% of the cases of functional dysphonia. In summary, the authors recognize the importance and applicability of electroglottography in clinical practice.

Esling (1984) posits a definite relation between the glottal waveform and the phonation type. He relates creaky, modal, ventricular, harsh, whispery, breathy and falsetto voices to the Speed Quotient (compare section 7.5) and to the fundamental frequency of the same speaker, and he assigns specific intervals of F0 and SQ to each type of voice. This quantitative classification was compared with the visual inspection of high-speed films of the different vocal fold vibration patterns. Esling hypothesizes that the phonation type reflects an articulatory-based model in which vibratory behavior depends on two types of laryngeal stricture:

1) a continuum of anterior-posterior stricture for changing pitch

2) a continuum of lateral stricture for glottal openness

The phonation types in column (1) of Table 7 relate rising SkewnessEGG to rising pitch and to rising latitude of anterior-posterior stricture; column (2) contains the voices for which there is no relation between increasing skewness and pitch.

Table 7: Classification of voice quality based on antero-posterior and lateral larynx stricture (see Esling, 1984)
                      (1) antero-posterior stricture    (2) lateral stricture
growing stricture  ^  creaky voice                      ventricular voice
                   |                                    harsh voice
                   |  modal voice                       modal voice
                   |  falsetto                          whispery voice
                   |                                    breathy voice

Holmberg et al. (1995) compare the flow adduction quotient (the ratio of the closed phase duration in the glottal airflow signal to the pitch period duration), obtained from an inverse-filtered speech signal, with the EGG-based adduction quotient (the ratio of the vocal fold contact duration to the pitch period duration). The quotients are calculated using amplitude criteria of 30% and 65%, respectively. The authors are generally critical of the use of the EGG: they point out both difficulties in recording signals, especially for female speakers, and sporadic disruptions of the signal due to gross larynx movements. However, they found that for clear and strong EGG signals both measures (EGG and airflow) are highly correlated (r > 0.85; see footnote 9). They argue for a correlation between the Open Quotient of the glottal flow and the one computed in the EGG domain.
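An amplitude criterion of this kind can be sketched as follows. This is a minimal illustration rather than the procedure of Holmberg et al.: it assumes that a single pitch period has already been segmented, that larger sample values mean more vocal fold contact (or, for the flow signal, a more closed glottis), and that counting samples above the threshold is a sufficient approximation of duration. The function name `contact_quotient` and the triangular test cycle are hypothetical.

```python
def contact_quotient(cycle, level):
    """Fraction of one period spent above `level` of the peak-to-peak amplitude.

    cycle: samples of a single pitch period; larger value is assumed to
           mean more contact (signal orientation is an assumption).
    level: amplitude criterion as a fraction of peak-to-peak, e.g. 0.65.
    """
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat cycle: no peak-to-peak amplitude")
    # Absolute threshold corresponding to the relative criterion.
    threshold = lo + level * (hi - lo)
    # Number of samples above the threshold approximates contact duration.
    above = sum(1 for sample in cycle if sample > threshold)
    return above / len(cycle)


# Hypothetical triangular cycle: rises from 0 to 45, then falls back to 1.
cycle = list(range(0, 45)) + list(range(45, 0, -1))
print(contact_quotient(cycle, 0.30))
print(contact_quotient(cycle, 0.65))
```

The sketch makes one property of such quotients visible: a higher criterion level yields a smaller quotient for the same waveform, so values obtained with different criteria are not directly comparable.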

The study of Higgins and Saxman (1991) deals with gender and age differences in phonatory behavior. Among other measurements, they used EGG signals as well as airflow signals to calculate duty ratios in both domains. The EGG duty ratio was computed using a 60% peak-to-peak amplitude criterion (see footnote 10). Balanced groups of young and elderly males and females (ca. 10 subjects per age and gender group) who were in good health and showed no evidence of laryngeal disease took part in the experiment. The prolonged vowel /æ:/ and the syllable /bæp/ served as stimuli. Combinations of normal/high pitch and normal/soft/high loudness (6 possibilities) were used. Statistically significant results (using ANOVA and t-tests) were found. Aged women exhibited a smaller EGG duty cycle (OQ) during vowel prolongation and a lower F0 during syllable repetition than young women, whereas aged men had a significantly greater EGG duty cycle during vowel prolongation and syllable repetition than young men. A significant gender effect was found in the high-loudness condition, with males displaying smaller EGG duty cycles than females. The results showed that the duty cycles depend on age but that the direction of change is opposite for males and females. The study generally supports the use of EGG in para- (or rather extra-) linguistic research: the results obtained with the EGG measures were in line with those obtained from airflow evaluations.


9 r denotes the correlation coefficient.

10 with inverted signal orientation.