EMA topic fields and courses in detail

Theoretical Linguistics

Course No.: 5 Introduction to linguistics
This course provides a general overview of the field.

Course No.: 10 Syntax 1
The course has two major concerns. Its first part offers an overview of central issues of German syntax. Categories and functions, tests for evaluation of syntactic structure and the peculiarities of clause structure in German (topologische Satzanalyse) are the main topics of this part. The presentation is not devoted to any specific theory of syntax but descriptively oriented.

The second part of the course gives an introduction to Lexical Functional Grammar (LFG), one of the generative theories of syntax which has also given rise to substantial implementations.

LFG provides two parallel representations of syntactic structure, namely constituent structure (c-structure) and functional structure (f-structure). In the course, the syntactic and lexical rules that constitute these structures are presented, as well as the principles that determine the well-formedness of given structures.

Along with the formal apparatus, the course covers LFG analyses of common grammatical phenomena such as passive, functional control, and long-distance dependencies.

Course No.: 18 Syntax 2
This course is based on Syntax I. The topics are recent developments in Lexical Functional Grammar and their application to phenomena of German syntax. The theoretical concerns comprise Lexical Mapping Theory which determines the association of the (semantic) arguments of a predicate with grammatical functions; constraints on phrase structure including conditions on head mobility; principles of structure-function association that constrain possible functional annotation of phrase structures. With respect to German syntax the course focusses on a consistent explanation of the clause structure.

Course No.: 9 Morphology
Provides an overview of the field, focussing on inflectional and derivational morphology:
  • Study of the basic concepts of morphology like morpheme, allomorphy, derivation, compounding etc.
  • Thereafter, the focus is on the syntax-morphology interface and the morphology-lexical semantics interface.
  • Discussion of a number of approaches in computational morphology, such as Two-Level Morphology. We will also look at the computational morphology database that has been developed at the IMS.
Practical exercises using this and other tools will help deepen and extend our knowledge about morphology.

References:
Some chapters from: Spencer, A. (1991): Morphological Theory. An Introduction to Word Structure in Generative Grammar. Blackwell, Oxford

The section on morphology from: Grewendorf, G.; Hamm, F. and Sternefeld, W. (1988): Sprachliches Wissen. Eine Einführung in moderne Theorien der grammatischen Beschreibung. Suhrkamp, Frankfurt

Fleischer, W. and Barz, I. (1992): Wortbildung der deutschen Gegenwartssprache. Max Niemeyer Verlag, Tübingen

Course No.: 6 Introduction to computational linguistics
Theoretical and practical introduction to CL techniques: top-down, bottom-up, and chart parsing, the use of lexicalist grammars, generation techniques, and the basics of computational semantics are dealt with. Extensive exercises in encoding the respective algorithms in Prolog are required. The use of CUF (Comprehensive Unification Formalism) for grammar design is introduced. Among the texts used is Natural Language Processing in Prolog by Gazdar/Mellish.

Course No.: 17 Semantics 1
Lambda calculus and type theory are introduced, and the topics of (non-intensional) Montague Grammar, generalized quantifiers, and Discourse Representation Theory (DRT) are covered. From Discourse to Logic by Kamp/Reyle and Gamut's Logic, Language and Meaning, Vol. 1 form the basis of the course material.
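
As an illustration of how lambda terms and generalized quantifiers fit together, the following minimal sketch models determiners as curried functions over a small finite domain. The domain, the predicate extensions and all names are hypothetical and chosen only for this example; it is not taken from the course material.

```python
# Toy model: a finite domain of individuals and two one-place predicates.
# All names and extensions are hypothetical, for illustration only.
DOMAIN = {"ann", "bob", "cleo"}
STUDENT = {"ann", "bob"}
SLEEPS = {"ann", "bob", "cleo"}


def pred(extension):
    """Turn a set into its characteristic function (type <e,t>)."""
    return lambda x: x in extension


# Determiners as curried functions of type <<e,t>,<<e,t>,t>>:
# they take a restrictor p, then a scope q, and return a truth value.
every = lambda p: lambda q: all(q(x) for x in DOMAIN if p(x))
some = lambda p: lambda q: any(q(x) for x in DOMAIN if p(x))
no = lambda p: lambda q: not some(p)(q)

student, sleeps = pred(STUDENT), pred(SLEEPS)

every_student_sleeps = every(student)(sleeps)
no_student_sleeps = no(student)(sleeps)
```

Applying a determiner to its restrictor first, e.g. `every(student)`, yields a generalized quantifier, i.e. a function from predicates to truth values.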

Course No.: 32 Semantics 2
This course is concerned with intensional (modal and temporal) logic, and again with Montague Grammar (the PTQ fragment) as well as with DRT (tense and aspect). The course uses Gamut's Logic, Language and Meaning, Vol. 2 and again Kamp and Reyle 1993.

Course No.: 22 Pragmatics
We usually make a distinction between semantic and pragmatic aspects of meaning. In communication, and thus also in computational approaches to language, an interaction of these two levels is an obvious prerequisite.

Our seminar aims at identifying the questions and goals of pragmatics on the basis of concrete examples and will discuss an integrated treatment of semantic and pragmatic aspects of meaning on the basis of the semantic representations of discourse representation theory. Prominence will be given to the classical problems of deixis, presupposition, implicature and speech acts. But discourse relations, conversational analysis and topic/focus will also be discussed.

Reference:
General survey: Levinson, Pragmatics, CUP [German edition: Pragmatik, Niemeyer]

Cognitive Models

Course No.: 15 Speech Recognition 2
Methods of signal generation and manipulation for the purpose of speech perception experiments (time-domain editing, filtering, PSOLA resynthesis, Klatt formant synthesis) are introduced and theoretical issues in classical and current speech perception research discussed.

Small experiments are designed and carried out for hands-on experience. A WWW-based tool is used as a convenient environment for tasks like stimulus randomization and graphic presentation of the results.

Prerequisite: Introductory-level phonetics; Credits through: Design and implementation of a small experiment.

Literature for preparation: Chapter "Speech Perception" in J. Clark & C. Yallop 1995 (2nd ed.): An introduction to phonetics and phonology, Blackwell.

During the course further literature (journal articles) is recommended.

Course No.: 16 Speech Recognition 3
This course consists of a theoretical part and a practical computing part. The theoretical part is an introduction to automatic speech recognition with emphasis on hidden Markov models. Topics of the practical part are: optimal subword units, articulation variants, speaker recognition, etc.

Prerequisites: Introduction to phonetics and signal processing (recommended)

Course No.: 20 Language and Brain (Cognitive Science)
Contents:
  • Necessity of interdisciplinarity
  • Methodology: modern diagnostic procedures of neuroradiology (PET, fMRI)
  • Discussion of selected studies
  • Comparison with results from classical clinical neurolinguistics
References:
Deacon, T. 1998. The Symbolic Species: The Co-Evolution of Language and the Brain. W. W. Norton & Co. (paperback, 528 pages, DM 26.70)

Poeppel, D. 1996. A Critical Review of PET Studies of Phonological Processing. Brain and Language 55: 317-351.

Demonet, J. et al 1996. A Critical Reply to Poeppel. Brain and Language 55: 352-379.

Poeppel, D. 1996. Response to Demonet et al. Brain and Language 55: 380-385.

Jaeger, J. et al 1996. A PET study of regular and irregular verb morphology in English. Language 72: 451-497.

Seidenberg, M. and J. H. Hoeffner. 1998. Evaluating behavioural and neuroimaging data on past tense processing. Language 74: 104-122.

Jaeger, J., van Valin, R., Lockwood, A. 1998. Response to Seidenberg and Hoeffner. Language 74: 123-128.

Course No.: 25 Speech Synthesis 1
Speech synthesis is a significant link in the man-machine interface. Speech synthesis methods are also used in phonetic research to gain insight into speech production or into the acoustic properties of speech. This seminar is an introduction to speech synthesis methods: formant synthesis, unit selection, etc.

Practical exercises with the institute's synthesis software are a significant part of the course.

References:
Hess W. (1992): "Speech synthesis - a solved problem?" In: J. Vandewalle et al. (Eds.), Signal processing VI: Theories and Applications (Elsevier, Amsterdam), pp. 37-46.

Sproat R. (Ed.) (1998): Multilingual Text-to-Speech Synthesis, Ch. 7. (Kluwer, Dordrecht).

Dutoit T. (1997): An Introduction to Text-to-Speech Synthesis, Ch. 7. (Kluwer, Dordrecht).

Prerequisites: Introduction to phonetics and signal processing (recommended)

Course No.: 33 Knowledge Representation
Courses in Knowledge Representation are regularly taught at the Computer Science faculty (department of Artificial Intelligence); students of Computational Linguistics are expected to take one or more of these courses.

Phonetics and Phonology

Course No.: 11 Introduction to Phonetics and Phonology
Mostly articulatory and acoustic phonetics and speech processing are dealt with. An overview of contemporary approaches in phonology is given.

Course No.: 7 Software Laboratory 1
This is a practically oriented course that provides a thorough grounding in Prolog and shows basic applications in the fields of syntax, phonology, and morphology. Furthermore, the basics of a Unix environment and the use of relevant software tools are introduced.

Significant programming exercises are to be completed, for which tutoring is available.

  • Part A: Unix I, Unix II, Speech recording - Didactic method: laboratory exercises
  • Part B: Prolog - Didactic method: lecture, programming project

Course No.: 8 Software Laboratory 2
This course is an introduction to practical computer application in the fields of acoustic phonetics, phonology and (corpus) linguistics.

Laboratory exercises:

  • speech analysis and speech synthesis;
  • automatic speech recognition;
  • text corpora;
  • multimedia presentation.

Course No.: 14 Speech Recognition 1
Focusses on aspects of speech perception and signal analysis. Spectrogram interpretation is taught and several experiments (measuring of voice onset times or formant frequencies) are carried out by the students.

Signal Processing

Course No.: 13 Signal Processing 1
Contents: signals, samples, models (deterministic, stochastic), filters (linear time-invariant systems), Fourier transform, source-filter model, linear prediction, spectrogram.
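
To make the Fourier transform concrete, the sketch below computes a naive discrete Fourier transform (DFT) of a sampled cosine. It is only a direct, quadratic-time illustration of the definition (not an FFT); the function name and the test signal are assumptions made for this example.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a list of samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Hypothetical test signal: one full cosine cycle sampled at 8 points.
SAMPLE_COUNT = 8
signal = [math.cos(2 * math.pi * t / SAMPLE_COUNT) for t in range(SAMPLE_COUNT)]

# The magnitude spectrum of a real cosine has two peaks: bin 1 and its mirror bin 7.
magnitudes = [abs(c) for c in dft(signal)]
```

For this signal the energy is concentrated in bins 1 and 7, each with magnitude n/2; all other bins are zero up to rounding error.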

Didactic method: lectures, computer demonstrations;

References: Reiner Standke. "Methoden der digitalen Sprachverarbeitung in der vokalen Kommunikationsforschung". Peter Lang Verlag, Frankfurt am Main, 1992.

Alan V. Oppenheim, Ronald W. Schafer. Digital Signal Processing. Prentice-Hall, Englewood Cliffs, 1975.

Athanasios Papoulis. Signal Processing. McGraw-Hill 1984

Athanasios Papoulis. Probability, random variables and stochastic processes. McGraw-Hill, 1984.

Course No.: 26a Speech Synthesis 2
Two major steps can be distinguished in the conversion of text to speech (TTS), viz. linguistic text analysis and acoustic speech synthesis. The course Speech Synthesis I focusses on synthesis; part II concentrates on the linguistic components of a TTS system.

The following problems of linguistic text analysis are discussed: Construction of the lexicon, including morphological paradigms; unknown word and compound analysis; disambiguation; phonological processes and pronunciation rules. The individual problems as well as methods and techniques to solve them, are illustrated and exemplified by means of practical exercises and tasks, which include implementing different types of linguistic descriptions in the form of, e.g., finite-state automata. Given sufficient interest, these exercises may be developed into projects and qualifying theses.

Previous courses: Speech Synthesis I (recommended)

Qualification for Credit: Exercises/tasks, final written test

References:
Richard Sproat (Ed.) (1998): Multilingual Text-to-Speech Synthesis. Kluwer, Dordrecht

Thierry Dutoit (1997): An Introduction to Text-to-Speech Synthesis. Kluwer, Dordrecht

Course No.: 26b Speech Synthesis 3
This course gives an introduction to models of intonation and segmental duration. Current theory and research in prosody, esp. intonation, can be characterized as being quite diverse and controversial. This statement holds for both the phonological and phonetic levels of description. In this course the most important and recent intonation models are discussed, along with their phonological foundations and assumptions as well as implementation and application aspects, e.g. in the framework of speech synthesis.

Furthermore, segmental durations and their modeling in practical applications are discussed. Current approaches differ mainly in terms of their underlying theoretical assumptions (keywords: speech timing, synchrony assumption).

At the beginning of the course prosodic terminology, a major source of confusion in the literature, is introduced and discussed.

Previous courses: Introduction to Phonetics and Phonology

Qualification for Credit: Oral presentation and term paper.

References:
Papers by Beckman, Gronnum, Ladd and Möbius in: Proc. 13th Internat. Congr. of Phonetic Sciences, Vol. 1, Stockholm, 1995

D. Robert Ladd (1996): Intonational phonology. Cambridge University Press

Statistical Methods

Course No.: 3 Logic and Formal Bases I
Mathematical concepts important to computational linguists, mostly set theory, propositional and first-order logic, and algebra are dealt with.

Course No.: 4 Logic and Formal Bases II
Continues Logic and Formal Bases I and includes proof theory (in the Hilbert calculus) as well as the completeness theorem and its implications. Both courses are based on the textbook Mathematical Methods in Linguistics by Partee/ter Meulen/Wall.

Course No.: 21 Statistical Methods 1
Contents: probability theory; Bayes' theorem; descriptive statistics; inferential statistics: comparison of means, t-distribution, F-distribution.
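
As a small worked example of Bayes' theorem, the sketch below computes the posterior probability of a hypothesis after a positive observation. The function name, parameter names and the numbers are hypothetical and serve only to illustrate the formula P(H|E) = P(E|H) P(H) / P(E).

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H and a positive observation E.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of the evidence, summed over H and not-H.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare property (1%) and a fairly reliable detector.
rare_case = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With a low prior, even a reliable detector yields a modest posterior, because false positives from the much larger not-H population dominate the evidence term.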

Didactic method: lecture, homework, computer exercises

References:
Ulrich Krengel: Einführung in die Wahrscheinlichkeitstheorie und Statistik, Vieweg, 1991.

T. Rietveld, R. van Hout, Statistical Techniques for the Study of Language and Language Behaviour, Mouton de Gruyter, 1993.

Bamberg, Baur, Statistik, Oldenbourg.

Urban D., Becker-Richter, Bruns Th., Systematische Statistik für die computergestützte Datenanalyse, Gustav Fischer Verlag, 1992

Course No.: 34 Statistical Methods 2
This course takes place as a reading group on new developments in statistical methods in computational linguistics. Possible topics are, e.g., maximum-entropy methods, Bayesian networks, other graphical models.

Prerequisites: Probability theory and statistics.

Natural Language Processing (NLP)

Course No.: 28 Computational Lexicology and Lexicography
Basic Part: Introduction to the main techniques of corpus annotation (tokenizing, part of speech tagging, tagset design, lemmatization, morphological analysis), covering equally linguistic issues, technical approaches and existing practical systems.

Discussion of corpus query systems based on regular grammars and their use for the acquisition of raw material for a morphosyntactic, distributional, valency and collocational description of lexical items. Query-based terminology extraction, alignment procedures and their use.

Specialized Parts: There exist optional (periodic) courses on bilingual dictionary construction, alignment, cooccurrence phenomena, computational derivational and compounding morphology etc. Often one such specific topic is combined with the basic part into one course.

Course No.: 19 Algorithmic Syntax
The course will address the question of how to transfer results from theoretical linguistics into usable and efficient implementations. We will discuss commonalities and differences between several grammar formalisms, such as dependency grammars, HPSG, categorial grammars, and LFG.

Methods of 'grammar engineering' will be illustrated with selected examples of German syntax: Organizing the lexicon using templates, modularization, distributed grammar development, systematic testing and debugging, integration of additional modules like morphology and semantics.

For actual grammar coding in the lab sessions, we will use the LFG notation.

Course No.: 12 Parsing 1
Several algorithms for automated syntactic analysis of natural language are introduced. Only algorithms which are based on grammars with a declarative semantics are considered, in order to avoid ad hoc solutions. The simplest grammar type considered is the context-free grammar. More expressive unification-based formalisms include DCG, LFG and HPSG.

The following parsing methods will be presented in this course:

  • parsing methods for context-free grammars: top-down parsing and bottom-up parsing with backtracking, chart parsing (Earley parser, left-corner parser), Tomita parser and deterministic parsing with LL(k) and LR(k) methods
  • parsing methods for unification-based grammars
  • restrictions on the grammar formalism to ensure the decidability and efficiency of parsing

Course No.: 35 Parsing 2: Statistical Parsing
The course introduces algorithms, formalisms and representations, and experimental techniques for lexicalized statistical parsing of natural language. Topics include probabilistic context-free grammar, practical development of statistical grammars, inside-outside and flow algorithms, the EM algorithm, practical estimation of statistical grammars, probabilistic language modeling, part-of-speech tagging, latent class models of lexical selection, and practical evaluation of statistical grammars.
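
To illustrate the basic idea behind a probabilistic context-free grammar, the following sketch computes the probability of a parse tree as the product of the probabilities of the rules used in it. The rules, their probabilities and the tree encoding are hypothetical and unrelated to the grammars or tools used in the course.

```python
# Hypothetical rule probabilities: (left-hand side, right-hand side) -> probability.
# In a proper PCFG the probabilities of all rules with the same left-hand side sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Kim",)): 0.4,
    ("NP", ("Det", "N")): 0.6,
    ("VP", ("V",)): 0.7,
    ("VP", ("V", "NP")): 0.3,
    ("V", ("sleeps",)): 0.5,
    ("V", ("sees",)): 0.5,
}


def tree_probability(tree):
    """Probability of a tree encoded as (label, [children]); leaves are plain strings."""
    if isinstance(tree, str):
        return 1.0  # words contribute no factor of their own
    label, children = tree
    # The rule used at this node: label -> labels (or words) of its children.
    rhs = tuple(child if isinstance(child, str) else child[0] for child in children)
    prob = RULE_PROB[(label, rhs)]
    for child in children:
        prob *= tree_probability(child)
    return prob


tree = ("S", [("NP", ["Kim"]), ("VP", [("V", ["sleeps"])])])
kim_sleeps_prob = tree_probability(tree)  # 1.0 * 0.4 * 0.7 * 0.5
```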

Labs are organized around two statistical parsing environments: Glenn Carroll's Galacsy and Helmut Schmid's Lopar. The experimental part of the course provides the knowledge and experience to develop and train broad-coverage lexicalized statistical grammars.

Course No.: 36 Parsing 3: Statistical Grammar Development
This is an advanced seminar for linguistics and computational linguistics students engaged in research projects using broad-coverage statistical parsing as an experimental method. Topics include techniques for writing large-scale bilexical grammars, training regimes, and corpus-based lexicon acquisition. The design of existing broad-coverage lexicalized statistical grammars (for English, German, and Portuguese) is reviewed.

Course No.: 37 Natural Language Generation
The course gives an overview of the different architectures of natural language generation systems and then concentrates on generation with reversible (unification) grammars (PATR, LFG, HPSG), i.e. grammars which can be used for both parsing and generation. We discuss approaches to and problems of generation from feature structures, from semantic representations and from so-called 'underspecified' representations, which make it possible, e.g. in ambiguity-preserving MT systems, to produce output with specific readings.

Language Engineering Applications

Course No.: 27 Algorithms and Data Structures for Managing Large Texts
Although modern programming languages include libraries for efficient text processing and storage (e.g. hash tables), it is still necessary to understand the basic principles of the algorithms used. In corpus linguistics and information retrieval, programmers additionally have to reimplement or adapt these standard algorithms in order to guarantee efficient processing.

Course outline:

  • Sorting
    • Sorting on external media
  • Searching
    • self-organizing linear lists
    • inverted lists
    • hash tables
    • tree structures
    • B-trees
  • Searching on texts
    • Boyer-Moore algorithm
    • PAT trees, PAT arrays
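
One of the search structures in the outline above, the inverted list, can be sketched in a few lines: each term is mapped to the sorted list of documents that contain it. The function names, the whitespace tokenization and the sample documents are assumptions made for this illustration only.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lower-cased token to the sorted list of ids of documents containing it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        # set() ensures that each document appears at most once per term.
        for token in set(text.lower().split()):
            index[token].append(doc_id)
    return index


def search_all(index, terms):
    """Return the ids of documents that contain every query term (boolean AND)."""
    postings = [set(index.get(term.lower(), [])) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical three-document collection.
DOCUMENTS = [
    "the parser reads text",
    "the tagger reads tokens",
    "a parser builds trees",
]
index = build_inverted_index(DOCUMENTS)
parser_docs = search_all(index, ["parser"])
```

Because documents are processed in increasing id order, every posting list is already sorted, which is what allows efficient merging of lists on larger collections.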

References:
Ottmann/Widmayer: Algorithmen und Datenstrukturen

Frakes/Baeza-Yates: Information Retrieval - Data Structures & Algorithms

Knuth: The Art of Computer Programming

Course No.: 30 Dialogue Systems
This optional course gives an architectural overview of Spoken Language Dialogue Systems, as well as an overview of development and evaluation problems, approaches and current best practice; to this end, it relies on results of the European DISC and DISC-2 projects.

It includes a panorama of the major types of SLDSs in use and under development, as well as of their underlying technologies (keyword spotting - template filling - parsing - semantic/dialogue representations).

In the second part, the course focusses more on specific problems of dialogue processing, in particular the relationship between current semantic theories of discourse and dialogue management in SLDSs.

Reference:
Niels Ole Bernsen, Laila Dybkjaer, Hans Dybkjaer: Designing Interactive Speech Systems: From First Ideas to User Testing (Berlin/New York: Springer), 1998.

Course No.: 29 Machine Translation 1
The course is an introduction to MT on the basis of the projection approach of Lexical Functional Grammar (LFG). After a short reminder on c-structures, f-structures, f-descriptions, constraints, set-valued attributes, etc., it discusses transfer projections on f-structures.

Examples come from DE/EN transfer, including lexical and structural transfer, divergences between source and target language, etc. The material used comes from technical sublanguages. The course includes practical work with DE and EN grammars and demonstrations.

References:
Ronald M. Kaplan: "The Formal Architecture of Lexical-Functional Grammar", in: Dalrymple et al. 1995

Ronald M. Kaplan, Klaus Netter, Juergen Wedekind, Annie Zaenen: "Translation by Structural Correspondences", in: Dalrymple et al. 1995

Mary Dalrymple, Ronald M. Kaplan, John T. Maxwell III, Annie Zaenen: Formal Issues in Lexical-Functional Grammar, 1995

Douglas Arnold, Lorna Balkan, R. Lee Humphreys, Siety Meijer, Louisa Sadler: Machine Translation: An Introductory Guide, (Oxford: NCC Blackwell), 1994.

Course No.: 38 Machine Translation 2
The MT Level II course is wider in scope than Level I, giving a broad panorama of MT issues, approaches and current research. The main topics include:
  • Transfer vs. Interlingua approaches;
  • Disambiguation for MT vs. linking of substructures;
  • Ongoing research systems: the example of Verbmobil;
  • Commercial systems and their technology: LMT- and KBMT-based systems, METAL/T1/L&H, etc.;
  • MT evaluation;
  • MT in practical application contexts.

Course No.: 31 Information Retrieval
The course gives an overview of the following:
  • basic methods and concepts of classical information retrieval;
  • linguistically based methods (content-based IR, extraction of technical terms);
  • IR applications, IR related R&D projects.
The course introduces the following concepts and methods: recall/precision, thesauri, vector space model, probabilistic models, IR system architectures.
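
Two of the concepts named above can be illustrated with a short sketch: recall/precision as set-based evaluation measures, and the vector space model as cosine similarity between term-frequency vectors. The function names, the whitespace tokenization and the raw term-frequency weighting are assumptions made for this example; practical systems typically use more refined weighting schemes.

```python
import math
from collections import Counter


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the raw term-frequency vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical query result: 4 documents retrieved, 3 relevant overall, 2 in common.
precision, recall = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

Here precision is 2/4 and recall is 2/3: half of what was retrieved is relevant, and two thirds of what is relevant was retrieved.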

References:
Frakes, W.; Baeza-Yates, R.: Information Retrieval. Data Structures and Algorithms, Prentice-Hall, N.J., 1992

Norbert Fuhr: Information Retrieval. Manuscript. Universität Dortmund, 1996.

C. J. van Rijsbergen: Information Retrieval. 1979. (out of print)

Salton, G.; McGill,M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983.

Prerequisites: basic computer science knowledge.

Qualification for Credit: presentation.