15.1. Word stress in German

In a phonological sense, word stress captures a relation of prominence between the syllables of a phonological word. The phonetic manifestation of the most prominent syllable depends to a large extent on whether or not it is associated with a pitch accent (of a particular intonational contour) at the level of the intonational phrase. From this perspective, word stress can be seen as the abstract structure with which a phrasal accent can be associated. However, since stressed syllables are not necessarily associated with a pitch accent, pitch is not the most invariant correlate of word stress. Duration and spectral energy distribution are more likely candidates for reliable correlates in Germanic languages (Sluijter, 1995).

In German, word stress appears to have several functions, including word demarcation and rhythm and, to a quite limited extent, the distinction of word meanings. The location of stress within a word, especially of main stress (longer words may also have secondary stress), cannot be predicted on phonological grounds alone but also involves lexical and morphological factors. It is therefore quite difficult to predict for German (see Jessen, 1993, 1995 for an overview; Schulz, 1996 for further experimental evidence).

At the level of word stress, the difference between stressed and unstressed syllables is realized phonetically through differences in the acoustic realization of vowels (the main elements carrying the stressed/unstressed distinction) inside and outside the stressed syllable. Differences in both the quality and the quantity of a vowel (e.g. the "tense-lax" distinction) manifest themselves fully under main stress, although the acoustic correlates of tenseness and stress are partially the same. The experimentally established acoustic cues to word stress in German include vowel duration, pitch and intensity changes, and laryngeal features (Jessen et al., 1995; Claßen et al., 1996).

The duration of the vowel (as well as the overall syllable duration and the durations of other parts of the syllable) is regarded as the primary cue to word stress in German. Dogil (1995) statistically validates the duration of the syllable as the main correlate of word stress among the measured parameters. This result was confirmed by a classification of the same data using a machine-learning paradigm developed by Rapp (1994). Jessen et al. (1995) (see also Jessen, 1993) state that, in the stressed syllable, both the duration of the vowel and the closure duration of the plosive preceding the vowel are significantly longer than in unstressed syllables.

The phonetic realization of word stress also involves a change in fundamental frequency, almost always realized in German as an increase of F0 (Jessen et al., 1995; Dogil, 1995). However, pitch changes may also be caused by sentence intonation, which can influence the outcomes of experiments in which these two prominence categories are not examined independently (Sluijter, 1995; Möhler & Dogil, 1995).

The intensity (loudness) of a stressed vowel in German is higher than that of an unstressed vowel, but the contrast is weaker than for other cues (Dogil, 1995; Rapp, 1994; Jessen et al., 1995; Isatschenko & Schädlich, 1966). It has also been confirmed for other Germanic languages (Sluijter, 1995:41; Fry, 1995 for English) that the intensity correlates of stress are weaker than those of duration and pitch.

Stressed and unstressed syllables in German also differ in articulatory organisation. Dogil (1995) describes reduced coarticulation within stressed syllables. The formants of the stressed vowel are decentralized on the F1-F2 plane.

The main goal of this study is to validate recent findings that word stress also manifests itself in voice quality (especially in the phonation process). Sluijter (1995:42) hypothesizes that stress is related to higher vocal effort. This means that the overall intensity of a produced sound is not only a correlate of loudness but also affects the shape of its spectrum. Sluijter suggests that the acoustic correlate of greater effort (and thus of stress) is a decrease in negative spectral tilt, i.e. a more gradual fall of the spectrum. This can be described as a change in spectral balance: "A stressed syllable might be perceived as louder, and therefore more prominent, than an unstressed one due to the increased intensity levels in the higher part of spectrum" (ibid.:68). This supposition was confirmed in a perception experiment (ibid.:79f) as well as through measurements for Dutch (ibid.:39f). Sluijter et al. (1995) for American English and Dutch, and Campbell (1995) for English, confirm the relation of spectral tilt to linguistic prominence.

The reason for this is greater effort in voice production, which results not only in a higher amplitude of the glottal volume velocity pulse but also in a greater skewness of the pulse. Pulses in stressed syllables should have shorter closing phases, so that their falling flanks are much steeper than in unstressed syllables.
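A change in spectral balance of the kind described above can be quantified, for instance, by comparing the signal level in a low and a high frequency band. The following Python sketch illustrates the idea only; the band edges, the function names and the synthetic test tones are assumptions made for this example and do not reproduce the measurement procedure of the cited studies.

```python
import numpy as np

# Illustrative analysis bands in Hz (an assumption for this sketch).
BANDS_HZ = [(0, 500), (500, 1000), (1000, 2000), (2000, 4000)]

def band_levels_db(signal, fs, bands=BANDS_HZ):
    """Return the energy (in dB) of `signal` in each frequency band."""
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    levels = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        # Small constant avoids log(0) for empty or silent bands.
        levels.append(10.0 * np.log10(power[mask].sum() + 1e-12))
    return np.array(levels)

def spectral_balance_db(signal, fs, bands=BANDS_HZ):
    """Level of the highest band minus level of the lowest band (dB)."""
    levels = band_levels_db(signal, fs, bands)
    return levels[-1] - levels[0]

def harmonic_tone(f0, fs, dur, rolloff):
    """Toy voiced sound: harmonics of f0 with amplitude 1 / k**rolloff."""
    t = np.arange(int(fs * dur)) / fs
    # Harmonic numbers strictly below the Nyquist frequency.
    k = np.arange(1, int(np.ceil(fs / 2 / f0))).astype(float)
    partials = np.sin(2 * np.pi * f0 * np.outer(k, t)) / k[:, None] ** rolloff
    return partials.sum(axis=0)

fs = 16000
steep = harmonic_tone(120.0, fs, 0.5, rolloff=2.0)    # steeply falling spectrum
shallow = harmonic_tone(120.0, fs, 0.5, rolloff=1.0)  # more gradual fall
print("steep tilt:   %.1f dB" % spectral_balance_db(steep, fs))
print("shallow tilt: %.1f dB" % spectral_balance_db(shallow, fs))
```

In this toy setting, the tone with the more gradual spectral fall yields the larger (less negative) high-minus-low band difference, which is the direction of change that Sluijter associates with greater vocal effort.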

Fig. 24 depicts the effects of changing the skewness of the glottal pulse. Glottal pulses were synthesized with the ratio of closing to opening duration (Speed Quotient, SQ) ranging from 100% to 30%. The spectral envelopes of the pulses differ primarily in spectral tilt. At 2000 Hz, the magnitude of the spectrum is about 12 dB higher for the pulse with SQ=60% than for the pulse with SQ=100%. For the pulse generated with SQ=30%, the magnitude at 2000 Hz is higher still, again by about 12 dB.

Figure 24. Effect of changing glottal pulse skewness on the spectral tilt. Speed Quotients (closing to opening duration ratio) are equal to 100%, 60% and 30% for the first, second and third pulse, respectively. Spectral envelopes are estimated using a 4th-order autocovariance model.
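The qualitative effect shown in Fig. 24 can be explored with a small simulation. The Python sketch below is not the synthesis procedure used for the figure: the raised-cosine pulse shape, the fixed 5 ms open phase, the 16 kHz sampling rate and the function names are all assumptions chosen for illustration. Instead of estimating a spectral envelope, it averages the power over a band around 2000 Hz, because the spectrum of an isolated pulse can have zeros at individual frequencies.

```python
import numpy as np

def glottal_pulse(closing_to_opening, open_dur=0.005, fs=16000):
    """One flow pulse with raised-cosine opening and closing phases.

    closing_to_opening: ratio of closing to opening duration (1.0 = symmetric).
    open_dur: total duration of the open phase in seconds (assumed value).
    """
    t_open = open_dur / (1.0 + closing_to_opening)
    t_close = open_dur - t_open
    t = np.arange(int(round(open_dur * fs)) + 1) / fs
    rise = 0.5 * (1.0 - np.cos(np.pi * t / t_open))
    fall = 0.5 * (1.0 + np.cos(np.pi * (t - t_open) / t_close))
    return np.where(t <= t_open, rise, fall)

def band_level_db(pulse, fs=16000, f_lo=1000.0, f_hi=3000.0, n_fft=4096):
    """Mean power in [f_lo, f_hi) relative to the power at 0 Hz, in dB."""
    power = np.abs(np.fft.rfft(pulse, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return 10.0 * np.log10(power[band].mean() / power[0])

for ratio in (1.0, 0.6, 0.3):
    level = band_level_db(glottal_pulse(ratio))
    print(f"closing/opening = {ratio:.1f}: {level:6.1f} dB re 0 Hz")
```

Under these assumptions, shortening the closing phase raises the level in the band around 2000 Hz relative to the low-frequency level. The size of the increase depends on the pulse model and the analysis method, so it need not match the values read from Fig. 24.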