2.1. The segmental structure of speech

All the communication layers defined above are perceived in four domains: quality, duration, pitch and loudness (Laver 1994:27). To analyse voice quality in those domains, sounds must be described in a well-organized manner, and it is thus necessary to distinguish a set of phonetic units. Following Laver's description, those units are: feature, segment, syllable, setting, utterance and speaking-turn (ibid.:110).

The set of phonetic features constitutes the minimum set of descriptive parameters needed to account for the phonological differences between the phonetic units of a language (Laver 1994:110-112, Clements & Hume, 1995). The features may be articulatory features (defined in terms of the action of the organs of speech), acoustic features (defined in terms of the physical properties of the speech sound relevant to the feature) or perceptual features (defined in terms of the perception of the given sound by the ear and the brain) (Clark & Yallop, 1995:ch.10). For example, the feature "voice" (also called "glottal stricture") in Ladefoged's feature system describes glottal activity and has five values: glottal stop, laryngalized, voice, murmur and voiceless (Ladefoged, 1975). In the Lindau (1978) feature system "voice" refers to different shapes of the glottis (with the values of "glottal stop", "creaky voice", "voice", "murmur" and "voiceless"). The set of all features forms a model of a language and its structure.

The portion of speech with relatively constant phonetic features is called a phonetic segment. A given feature may be limited to a particular segment but may also be longer (as a suprasegmental feature) or shorter (as a subsegmental feature). Segments, usually phonological units of the language, such as vowels and consonants, are of very short duration.

Utterances are then built of linear sequences of segments. Typically, a speech segment lasts approximately 30 to 300 milliseconds.

Phonetic segments form a syllable. The syllable can also be defined in phonological terms and itself represents a level of higher linguistic organization.

Another category of phonetic description is the phonetic setting. It was explicitly proposed and defined in the following way by Laver (1991:184):

"There is an alternative, wider approach to the task of articulatory description, that concerns itself with both differences and similarities in vocal performance in speech and sees individual segments as momentary actions superimposed on a long-term SETTING of the vocal apparatus. The setting accounts for the similarities and the segments for the differences, as it were. A setting gives a background, auditory colouring to sequences of short-term segmental articulations."

A setting could be as long as a whole utterance but it could also describe only a short part of an utterance, as short as one phonetic segment. Since the seminal works of Laver (1991, 1994) it has become a convention to describe voice quality using the features of setting. Laver (1994:396) defines phonetic settings more stringently as the coordinating tendency underlying the production of the segments in the chain of speech while maintaining a particular configuration or a state of the vocal apparatus. Settings in this sense share common features across successive segments or syllables; they give an impression of a feature characteristic of a particular speaker or of his/her behavior during conversation. Phonetic settings are very useful in describing human voices because they capture similarities of speech production across longer portions of speech. Settings are used at every stage of the description of speech production.

To complete the framework of speech segments, the utterance may be defined as a stretch of speech by a single speaker, delimited by silence and containing no internal pauses, whereas the speaking-turn consists of one or more utterances and denotes one speaker's contribution to a conversation (ibid.:116).
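For readers who wish to handle such descriptions computationally, the hierarchy of units outlined above might be represented roughly as follows. This is only an illustrative sketch and not part of Laver's framework: the class names, the field names, the segment labels and durations, and the setting labels are all assumptions chosen for the example. A setting is modelled as a label attached to a span of segment indices, so that it may cover a whole utterance or a single segment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    label: str                 # illustrative symbol for a vowel or consonant
    duration_ms: float         # the text gives roughly 30 to 300 ms as typical
    features: Dict[str, str] = field(default_factory=dict)  # feature -> value


@dataclass
class Syllable:
    segments: List[Segment]


@dataclass
class Setting:
    name: str                  # hypothetical label for a long-term tendency
    start: int                 # index of the first segment covered
    end: int                   # index of the last segment covered (inclusive)


@dataclass
class Utterance:
    syllables: List[Syllable]
    settings: List[Setting] = field(default_factory=list)

    def segments(self) -> List[Segment]:
        # An utterance is a linear sequence of segments.
        return [seg for syl in self.syllables for seg in syl.segments]

    def settings_at(self, index: int) -> List[str]:
        # Names of all settings whose span includes the given segment index.
        return [s.name for s in self.settings if s.start <= index <= s.end]


@dataclass
class SpeakingTurn:
    speaker: str
    utterances: List[Utterance]  # one or more utterances by the same speaker


# Invented example data: two syllables, one setting spanning the whole
# utterance and one limited to a single segment.
utt = Utterance(
    syllables=[
        Syllable([Segment("m", 70, {"voice": "voice"}),
                  Segment("a", 120, {"voice": "voice"})]),
        Syllable([Segment("t", 60, {"voice": "voiceless"}),
                  Segment("a", 150, {"voice": "voice"})]),
    ],
    settings=[Setting("creaky voice", 0, 3), Setting("nasal", 1, 1)],
)
turn = SpeakingTurn("A", [utt])
total_ms = sum(seg.duration_ms for seg in utt.segments())
```

In this representation the segments carry the short-term differences (here the value of the "voice" feature), while the settings carry what is shared across successive segments, which mirrors the division of labour described in the quotation from Laver (1991:184).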

Linguistic features and phonetic settings are brought together in the process of speech production.