24. Modelling of the vocal tract

The main goal of modelling the vocal tract is to find the anatomical cues to the acoustic and phonetic features of a produced sound. Numerous models of the vocal tract have been constructed (Wakita & Fant, 1978; Flanagan, 1972; Maeda, 1982; Kröger, 1993; Nowakowska & Zarnecki, 1989), most of them based on numerical computer simulation.

In the first attempts the vocal tract was modelled as a chain of homogeneous tubes. The segments are cylindrical, and their number depends on the changing physiological structures of the vocal tract. Because the anatomical structures involved in speech production are complex, models are generally limited to those segments which are significant to the articulation process.
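The tube-chain idea can be sketched in a few lines of Python. The sketch below is not one of the cited models: it uses lossless plane-wave transmission (ABCD) matrices, assumes zero sound pressure at the lips, and takes the air density and sound velocity as assumed constants. The function names and the uniform 17.5 cm test tube are illustrative choices.

```python
import numpy as np

RHO = 1.14e-3      # air density in g/cm^3 (assumed value)
C_SOUND = 3.5e4    # sound velocity in cm/s (assumed value)


def segment_matrix(length, area, freq):
    """Lossless transmission (ABCD) matrix of one cylindrical segment."""
    kl = 2.0 * np.pi * freq / C_SOUND * length
    z0 = RHO * C_SOUND / area  # characteristic acoustic impedance
    return np.array([[np.cos(kl), 1j * z0 * np.sin(kl)],
                     [1j * np.sin(kl) / z0, np.cos(kl)]])


def glottis_to_lips_ratio(lengths, areas, freq):
    """Return |U_glottis / U_lips| for a chain of segments, glottis first.

    Assumes zero sound pressure at the lips (no radiation load), so the
    ratio is the magnitude of the D element of the chained matrix.
    """
    chain = np.eye(2, dtype=complex)
    for length, area in zip(lengths, areas):
        chain = chain @ segment_matrix(length, area, freq)
    return abs(chain[1, 1])


def resonances(lengths, areas, freqs):
    """Frequencies on the grid where |U_glottis / U_lips| has a local minimum."""
    ratio = [glottis_to_lips_ratio(lengths, areas, f) for f in freqs]
    return [float(freqs[i]) for i in range(1, len(freqs) - 1)
            if ratio[i] < ratio[i - 1] and ratio[i] < ratio[i + 1]]


# Uniform tube: 20 segments of 0.875 cm (17.5 cm in total), area 5 cm^2.
lengths = [0.875] * 20
areas = [5.0] * 20
freqs = np.arange(50, 3001, 10)  # frequency grid in Hz
print(resonances(lengths, areas, freqs))
```

For this uniform tube the resonances fall at odd multiples of c/(4 x 17.5 cm) = 500 Hz, that is at 500, 1500 and 2500 Hz on the chosen grid. Replacing the constant `areas` list with a measured area function would give the resonances of a specific articulation under the same lossless assumptions.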

The vocal tract is made up of three cavities: the pharyngeal, oral and nasal cavities. Their contribution to the acoustic properties of the vocal tract depends on their configuration and on their connection to the whole articulatory system.

Sagittal view of the vocal apparatus (from Laver, 1994:120)

The coupling to the nasal cavity is controlled by the position of the soft palate. When raised, the soft palate presses against the back wall of the pharyngeal cavity and blocks airflow through the nasal cavity. In nasals or nasalized sounds the soft palate is lowered, which results in airflow through the nasal cavity. The acoustical properties of the vocal tract are then determined by the parallel connection of two resonators: the nasal cavity and the pharyngeal-oral cavity.

A considerable part of the surface of the pharyngeal-oral tract walls is formed by the moving articulators: tongue, soft palate, lips and jaw. Thus, the geometrical configuration of the vocal tract is subject to substantial variations during the articulation process. The structure and geometry of the nasal channel, on the other hand, vary only from speaker to speaker and are almost independent of articulation (Nowakowska & Zarnecki, 1989).

These insights have led to the simulation of the vocal tract (as depicted in Fig. 42).

Figure 42. The approximated articulatory model of the vocal tract: a) an elementary segment of the vocal tract represented by a cylindrical tube of length l and cross-sectional area A, b) electrical equivalent of the elementary segment, c) the system of the pharyngeal-oral and the nasal cavity (see Nowakowska & Zarnecki, 1989:72)

The acoustic properties of the elementary uniform cylindrical tube segment (of length l and cross-sectional area A) are equivalent to those of the electrical impedance network (Fig. 42b) made up of the following elements (Nowakowska & Zarnecki, 1989):

(36)                               

The symbols used in eq. (36) denote the following constants:

ρ - air density, µ - air viscosity coefficient, η - adiabatic constant of the air, λ - thermal conductivity of the air, ξ - specific heat of the air. Furthermore, l is the length of a single segment, c is the sound velocity, A is the cross-sectional area of the tube and S is the circumference of the tube.
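For orientation, the classical lossy-tube expressions (Flanagan, 1972) can be written with the symbols listed above as follows. It is an assumption that eq. (36) agrees with this form term by term. The angular frequency ω is not among the symbols listed above and is introduced here. L and C are the acoustic inertance and compliance of the segment, R is the viscous loss resistance and G is the heat-conduction loss conductance.

```latex
\begin{aligned}
L &= \frac{\rho\, l}{A}, &
C &= \frac{A\, l}{\rho c^{2}}, \\
R &= \frac{l\, S}{A^{2}} \sqrt{\frac{\omega \rho \mu}{2}}, &
G &= \frac{S\, l\,(\eta - 1)}{\rho c^{2}} \sqrt{\frac{\lambda \omega}{2 \xi \rho}}
\end{aligned}
```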

In more sophisticated models the tissue impedance is also included. This adds terms describing the co-vibration of the walls and the air, which changes the definitions of G and C. The typical length of a segment is l = 0.5 to 1 cm (the number of segments varies between 15 and 30).
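As a rough numerical illustration, the element values of one hard-walled segment can be computed as sketched below. The sketch assumes the classical lossy-tube expressions of Flanagan (1972) for R, L, G and C (whether they coincide exactly with eq. (36) is an assumption), a circular cross-section, and typical literature values for the air constants in CGS units. The function name `segment_elements` is hypothetical, and wall co-vibration is not included.

```python
import math

# Assumed air constants (CGS units, moist air at body temperature).
RHO = 1.14e-3       # air density, g/cm^3
C_SOUND = 3.5e4     # sound velocity, cm/s
MU = 1.86e-4        # viscosity coefficient, dyne*s/cm^2
ETA = 1.4           # adiabatic constant
LAMBDA = 0.055e-3   # thermal conductivity, cal/(cm*s*K)
XI = 0.24           # specific heat, cal/(g*K)


def segment_elements(length, area, freq):
    """Element values of one cylindrical segment at frequency freq (Hz).

    length is in cm and area in cm^2. The circumference S is derived
    from the area under the assumption of a circular cross-section.
    """
    omega = 2.0 * math.pi * freq
    circumference = 2.0 * math.sqrt(math.pi * area)
    return {
        # Inertance of the air mass in the segment.
        "L": RHO * length / area,
        # Compliance of the air volume in the segment.
        "C": area * length / (RHO * C_SOUND ** 2),
        # Viscous loss at the tube wall.
        "R": length * circumference / area ** 2
             * math.sqrt(omega * RHO * MU / 2.0),
        # Heat-conduction loss at the tube wall.
        "G": circumference * length * (ETA - 1.0) / (RHO * C_SOUND ** 2)
             * math.sqrt(LAMBDA * omega / (2.0 * XI * RHO)),
    }


# One segment of 1 cm length and 5 cm^2 area, evaluated at 1 kHz.
elements = segment_elements(1.0, 5.0, 1000.0)
for name, value in elements.items():
    print(name, value)
```

A quick consistency check is that the product of L and C equals (l/c)^2 regardless of the area, and that both loss terms grow with the square root of the frequency.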

For the purpose of this study it is sufficient to model the articulation of the non-nasalized vowels only, which leads to the following simplifications:

where An is the area of the mouth.