It has already been mentioned that the phonation type strongly influences the properties of a generated sound. The differences are caused by a change in the excitation pulse. Thus, the glottal waveform is different for every individual phonation type.
The specific characteristics of the glottal waveform for each phonation type are described in comparison to modal phonation, following the approach of Ní Chasaide and Gobl (1997), Stevens (1994), Trask (1996) and Zemlin (1988):
Current descriptions of pathological voice draw on both dedicated methods and common measurements that are also applied to healthy speakers. A broad class of parameters describes the "roughness" of the voice, that is, its fluctuations in the time and amplitude domains. The stability of the fundamental frequency in particular has been investigated, and various indicators have been proposed in the literature (Koike, 1973; Kitajima et al., 1975; Gubrynowicz et al., 1980; Davis, 1978; Pinto & Titze, 1990; Titze & Liang, 1993; Baken, 1987). They include pitch perturbation factors over long and short periods of time (jitter), as well as the excess, which describes the distribution of the pitch period lengths in relation to the normal distribution (Hays, 1988). Pathological voices are characterized by higher values of these parameters. Other parametrizations use the autocorrelation function to characterize the RMS fluctuation (shimmer), averaged over pitch periods (Davis, 1978), or rely on cepstral measurements (Gerull et al., 1992).
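The perturbation measures named above can be illustrated with a minimal sketch. It assumes that the pitch period lengths and one amplitude value per period (for example the RMS) have already been extracted. The function names, the choice of the relative mean absolute difference as the perturbation measure, and the sample values are illustrative assumptions, not the definitions used in the cited studies.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean.

    Applied to pitch period lengths this gives a simple jitter estimate;
    applied to per-period amplitudes it gives a simple shimmer estimate.
    (Assumed definition, chosen for illustration.)
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def excess(values):
    """Excess (kurtosis minus 3): zero for a normal distribution."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])  # second central moment
    m4 = mean([(v - m) ** 4 for v in values])  # fourth central moment
    if m2 == 0:
        raise ValueError("excess is undefined for constant data")
    return m4 / m2 ** 2 - 3.0


# Hypothetical measurements: period lengths in ms and per-period RMS amplitudes.
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0, 8.3]
rms_amplitudes = [0.50, 0.48, 0.52, 0.49, 0.51, 0.47]

jitter = relative_perturbation(periods_ms)
shimmer = relative_perturbation(rms_amplitudes)
period_excess = excess(periods_ms)
```

A perfectly periodic signal yields a jitter and shimmer of zero in this sketch; larger cycle-to-cycle variation raises both values.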
In the frequency domain, "hoarseness" is investigated. It is generally perceived as the level of noise in the produced speech (see also section 4). It is often assessed by visual inspection of spectrograms or by means of a long-term average spectrum (LTAS) (Frokjaer-Jensen & Prytz, 1976; Gauffin & Sundberg, 1977, 1989). The latter technique distinguishes between certain types of voices in particular frequency bands. The spectral flatness of the residue signal (Markel & Gray, 1978) also depends on the spectral noise level. More sophisticated methods enable researchers to compare the energy generated by voicing (harmonic component) with that generated by turbulent flow (friction or breath noise), for example by means of the Teager Energy Operator (Gavidia-Ceballos et al., 1996; Cairns et al., 1996). The harmonic-to-noise ratio (Davis, 1978) is often used to characterize pathological voices.
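Two of the measures mentioned above can be sketched briefly. The first function uses the common discrete form of the Teager Energy Operator, x(n)^2 - x(n-1)x(n+1). The second computes spectral flatness as the ratio of the geometric to the arithmetic mean of a power spectrum. Both function names are illustrative, and the sketch assumes that the signal samples or power spectrum values are already available; it is not the processing chain of the cited studies.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample (the first and last are skipped).
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of the power values.

    Close to 1 for a flat (noise-like) spectrum, close to 0 for a
    strongly peaked (harmonic) spectrum.
    """
    if not power_spectrum or any(p <= 0 for p in power_spectrum):
        raise ValueError("power values must be positive and non-empty")
    n = len(power_spectrum)
    log_mean = sum(math.log(p) for p in power_spectrum) / n
    return math.exp(log_mean) / (sum(power_spectrum) / n)


# Hypothetical input: a pure cosine with normalized angular frequency 0.3.
omega = 0.3
tone = [math.cos(omega * n) for n in range(50)]
tone_energy = teager_energy(tone)

flat = spectral_flatness([2.0] * 8)                    # flat spectrum
peaked = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])    # single dominant peak
```

For a pure sinusoid of amplitude A and normalized angular frequency Ω, the operator returns the constant A² sin²(Ω), which is why deviations from a constant output can point to additional, non-harmonic components.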
It should be mentioned, however, that the existing parametrizations of pathological voices often fail in practice, primarily because of the high complexity and multidimensionality of pathological voice quality.
All methods presented above are either invasive or relatively complex. Hardly any of them allows an objective and robust description of the crucial parameters of the glottal waveform (the Open and Speed Quotients), which appear to correlate best with the various types of voice quality.
A method which attempts to overcome these problems will be described in the following chapter.