Processing of the database

from speech and text to a prosodic database

automatic segmentation

  • uses HMM speech recognition technology
  • produces
    • word label files
    • syllable label files
    • phoneme label files
  • from
    • speech
    • orthographic text

demo about automatic segmentation
details about automatic segmentation

automatic part-of-speech tagging

  • Tagger developed at IMS (Helmut Schmid)
  • a wrapper produces POS tag label files from word label files and orthography

details about Helmut Schmid's tagger

automatic ToBI-labelling

  • automatic length extraction
    • pauses, syllables and phonemes
    • speaker- and phoneme-dependent normalization
  • automatic intensity extraction
    • rms values for syllable nuclei
    • various sub band rms values to capture spectral tilt
    • speaker- and phoneme-dependent normalization
  • automatic F0-parametrization
    • 7 parameters per syllable
    • parameters have phonetic interpretation
    • tailored for ToBI labelling
  • prediction of the phonological prosodic labels (ToBI) (still work in progress)
    • symbolic attribute value pair learners
    • first order predicate learners
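
The length and intensity steps above can be illustrated with a small sketch. The page does not say how the normalization is computed, so the z-score per (speaker, phoneme) group used here is an assumption, as are the function names and the input layout. The rms helper simply takes the root mean square of the samples inside a syllable nucleus.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def rms(samples):
    """Root mean square of the samples in one segment (e.g. a syllable nucleus)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_by_speaker_and_phoneme(segments):
    """Z-score each value within its (speaker, phoneme) group.

    segments: list of (speaker, phoneme, value) tuples, where value is a
    duration or an rms value. This layout is hypothetical.
    Returns one z-score per input segment, in the same order.
    """
    groups = defaultdict(list)
    for speaker, phoneme, value in segments:
        groups[(speaker, phoneme)].append(value)

    # Mean and population standard deviation per group.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    normalized = []
    for speaker, phoneme, value in segments:
        mu, sigma = stats[(speaker, phoneme)]
        # A group with no spread carries no information; map it to 0.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized
```

With this kind of normalization, a long vowel from a slow speaker and a long vowel from a fast speaker end up on a comparable scale, which is what a label predictor needs.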

details & demo about F0-parametrization

manual ToBI-labelling

  • labelling guidelines
  • training data for automatic labelling
  • correction of errors made by the automatic procedure

see the labelling guide for our version of ToBI

selection and search utilities

news broadcast segmentation

  • segments a news broadcast into news stories
  • relies on prosodic and lexical marking

recurring news detection

  • two methods of finding recurring news stories
  • helps establish a corpus of repeated speech
    for research on prosodic variation
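
The page does not say which two methods are used to find recurring stories. Purely as an illustration of the idea, the sketch below flags pairs of stories whose word sets overlap strongly (Jaccard similarity). The measure, the threshold and the function names are all assumptions and not the methods of the original tools.

```python
def jaccard(text_a, text_b):
    """Word-set overlap between two story transcripts, in the range 0..1."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def find_recurring(stories, threshold=0.5):
    """Return index pairs (i, j), i < j, of stories that look like repeats.

    stories: list of orthographic transcripts, one string per news story.
    threshold: hypothetical cut-off; it would need tuning on real data.
    """
    pairs = []
    for i in range(len(stories)):
        for j in range(i + 1, len(stories)):
            if jaccard(stories[i], stories[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Pairs found this way are candidates for the corpus of repeated speech: the same text spoken again, possibly with different prosody.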

ToBI overview with wordontones

  • analyzes word and ToBI label files
  • shows prosodic variation
  • ToBI labels of different spoken versions appear under the orthography
  • produces LaTeX and PostScript files ready for inclusion in publications
  • also produces an HTML version for easy worldwide collaboration

see a clickable HTML output of wordontones
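
A minimal sketch of the kind of overview described above, in plain text rather than LaTeX or HTML. It is not the wordontones implementation. The in-memory layout of the label data, the function names and the label strings in the usage note are assumptions for illustration only.

```python
def tones_per_word(words, tones):
    """Attach each tone label to the word whose time span contains it.

    words: list of (start, end, word) from a word label file.
    tones: list of (time, label) from a ToBI label file.
    Returns a list of label lists, one per word.
    """
    return [
        [label for time, label in tones if start <= time < end]
        for start, end, _word in words
    ]


def render_overview(orthography, versions):
    """Print-ready text: the words on top, one row of labels per spoken version.

    orthography: list of words.
    versions: list of per-word label lists (output of tones_per_word).
    """
    cells = [[" ".join(labels) for labels in version] for version in versions]
    # Each column is as wide as its widest entry, word or labels.
    widths = [
        max([len(word)] + [len(row[i]) for row in cells])
        for i, word in enumerate(orthography)
    ]
    rows = [orthography] + cells
    return "\n".join(
        "  ".join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    )
```

Usage, with made-up times and labels: calling `tones_per_word` once per spoken version and passing the results to `render_overview` gives one line of orthography followed by one line of labels per version, so differences between versions can be read column by column.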

corpus search

  • finds all occurrences of a word in the corpus
  • pops up an xwaves signal display with label files
  • integration with xkwic planned for the future

see the corpus tools pages at IMS
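
The lookup step of the corpus search can be sketched as follows. The sketch assumes the word label files have already been read into memory as (start, end, word) entries per utterance. That layout and the function name are hypothetical, and the step of opening a signal display for each hit is left out.

```python
def find_word(corpus, query):
    """Find every occurrence of a word in the corpus.

    corpus: dict mapping an utterance id to a list of (start, end, word)
            entries taken from that utterance's word label file.
    query:  the word to look for; matching ignores case.
    Returns a list of (utterance_id, start, end), sorted by utterance id.
    """
    query = query.lower()
    hits = []
    for utterance_id in sorted(corpus):
        for start, end, word in corpus[utterance_id]:
            if word.lower() == query:
                hits.append((utterance_id, start, end))
    return hits
```

Each hit carries the utterance and the time span, which is the information a signal viewer would need to display the matching stretch of speech together with its label files.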


IMS Stuttgart / WWW@IMS.Uni-Stuttgart.DE / Mon May 4 11:05:45 1998 (hofmanaa)