Processing of the database
from speech and text to a prosodic database
automatic segmentation
- uses HMM speech recognition technology
- produces
  - word label files
  - syllable label files
  - phoneme label files
- from
  - speech
  - orthographic text
demo about automatic segmentation
details about automatic segmentation
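The alignment idea behind HMM-based segmentation can be sketched as follows. This is a minimal illustration, not the IMS aligner: it assumes one HMM state per phoneme, no transition probabilities, and a precomputed table of per-frame log-likelihoods. The names `force_align`, `to_labels` and the 10 ms frame shift are hypothetical.

```python
from typing import List, Tuple


def force_align(loglik: List[List[float]]) -> List[Tuple[int, int]]:
    """Viterbi alignment of frames to a known left-to-right phoneme sequence.

    loglik[t][s] is the log-likelihood of frame t under the s-th phoneme of
    the transcription. Returns (start_frame, end_frame_exclusive) per phoneme.
    """
    n_frames, n_states = len(loglik), len(loglik[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per phoneme")
    neg_inf = float("-inf")
    best = [[neg_inf] * n_states for _ in range(n_frames)]
    entered = [[False] * n_states for _ in range(n_frames)]  # state s starts at t
    best[0][0] = loglik[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s] = advance + loglik[t][s]
                entered[t][s] = True
            else:
                best[t][s] = stay + loglik[t][s]
    # Backtrace from the last phoneme at the last frame.
    bounds = [n_frames]
    s = n_states - 1
    for t in range(n_frames - 1, 0, -1):
        if entered[t][s]:
            bounds.append(t)
            s -= 1
    bounds.append(0)
    bounds.reverse()
    return [(bounds[i], bounds[i + 1]) for i in range(n_states)]


def to_labels(phonemes: List[str], segments: List[Tuple[int, int]],
              frame_shift: float = 0.01) -> List[Tuple[float, float, str]]:
    """Turn frame segments into (start_s, end_s, phoneme) labels."""
    return [(a * frame_shift, b * frame_shift, p)
            for p, (a, b) in zip(phonemes, segments)]
```

Word and syllable labels would follow from the same alignment by merging the phoneme segments that belong to one syllable or word.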
automatic part-of-speech tagging
- Tagger developed at IMS (Helmut Schmid)
- a wrapper produces POS tag label files from word label files and orthography
details about Helmut Schmid's tagger
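What such a wrapper does can be sketched roughly as below. The tagger is passed in as a plain function, `tag_words`, which is a hypothetical stand-in and not the interface of Helmut Schmid's tagger. The pause symbol `<p>` and the in-memory label tuples are assumptions as well, and the sketch takes the words straight from the word labels rather than from the orthographic text.

```python
from typing import Callable, List, Tuple

Label = Tuple[float, float, str]  # (start_s, end_s, text)


def pos_labels(word_labels: List[Label],
               tag_words: Callable[[List[str]], List[str]],
               pause: str = "<p>") -> List[Label]:
    """Copy the time stamps of the word labels onto the POS tags.

    Pause labels are not sent to the tagger and are passed through unchanged.
    """
    words = [text for _, _, text in word_labels if text != pause]
    tags = tag_words(words)
    if len(tags) != len(words):
        raise ValueError("tagger must return exactly one tag per word")
    tag_iter = iter(tags)
    result = []
    for start, end, text in word_labels:
        result.append((start, end, pause if text == pause else next(tag_iter)))
    return result
```

The POS tier produced this way shares its boundaries with the word tier, so both can be displayed or searched together.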
automatic ToBI-labelling
- automatic length extraction
  - pauses, syllables and phonemes
  - speaker- and phoneme-dependent normalization
- automatic intensity extraction
  - RMS values for syllable nuclei
  - various sub-band RMS values to capture spectral tilt
  - speaker- and phoneme-dependent normalization
- automatic F0-parametrization
  - 7 parameters per syllable
  - parameters have phonetic interpretation
  - tailored for ToBI labelling
- prediction of the phonological prosodic labels (ToBI) (still work in progress)
  - symbolic attribute-value pair learners
  - first-order predicate learners
details & demo about F0-parametrization
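The speaker- and phoneme-dependent normalization of lengths and intensities could look roughly like this. The outline does not say which normalization is used. The sketch assumes a z-score within each (speaker, phoneme) group, and the function name `normalize` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Observation = Tuple[str, str, float]  # (speaker, phoneme, raw value)


def normalize(observations: List[Observation]) -> List[float]:
    """Z-score each value against all values of the same speaker and phoneme.

    Assumption: z-scoring stands in for whatever normalization is really used.
    Groups with zero spread (e.g. a single token) are mapped to 0.0.
    """
    groups: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for speaker, phoneme, value in observations:
        groups[(speaker, phoneme)].append(value)
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}
    result = []
    for speaker, phoneme, value in observations:
        mu, sigma = stats[(speaker, phoneme)]
        result.append((value - mu) / sigma if sigma > 0 else 0.0)
    return result
```

After this step a long vowel from a slow speaker and a long vowel from a fast speaker get comparable values, which is what a label predictor needs.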
manual ToBI-labelling
- labelling guidelines
- training data for automatic labelling
- correction of errors of automatic procedure
see the labeling guide for our version of ToBI
selection and search utilities
news broadcast segmentation
- segments a news broadcast into news stories
- relies on prosodic and lexical marking
reoccurring news detection
- two methods of finding reoccurring news stories
- helps establish a corpus of repeated speech for research on prosodic variation
ToBI overview with wordontones
- analyzes word and ToBI label files
- shows prosodic variation
- ToBI labels of different spoken versions appear under the orthography
- produces LaTeX and PostScript files ready for inclusion in publications
- also produces an HTML version for easy worldwide collaboration
see a clickable HTML output of wordontones
corpus search
- finds all occurrences of a word in the corpus
- pops up an xwaves signal display with label files
- integration with xkwic planned for the future
see the corpus tools pages at IMS
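The lookup step of the corpus search can be sketched as below. This only illustrates finding a word in word label data and is not the actual tool: the in-memory corpus layout, the utterance ids and the name `find_word` are assumptions, and the xwaves display is left out.

```python
from typing import Dict, List, Tuple

Label = Tuple[float, float, str]  # (start_s, end_s, word)


def find_word(corpus: Dict[str, List[Label]],
              word: str) -> List[Tuple[str, float, float]]:
    """Return (utterance_id, start_s, end_s) for every occurrence of a word.

    The comparison ignores case; utterances are visited in sorted order.
    """
    target = word.lower()
    hits = []
    for utterance_id in sorted(corpus):
        for start, end, text in corpus[utterance_id]:
            if text.lower() == target:
                hits.append((utterance_id, start, end))
    return hits
```

Each hit carries the time span needed to open the matching stretch of signal together with its label files.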