Overview of projects

Title CRETA - Center for Reflected Text Analytics
Term January 2016 - December 2018
PI Jonas Kuhn, Sebastian Padó (Institut für Maschinelle Sprachverarbeitung), Manuel Braun (Institut für Literaturwissenschaft / Germanistische Mediävistik), Thomas Ertl (Institut für Visualisierung und Interaktive Systeme), Sabine Holtz (Historisches Institut / Landesgeschichte), Cathleen Kantner (Institut für Sozialwissenschaften / Internationale Beziehungen und Europäische Integration), Catrin Misselhorn (Institut für Philosophie / Wissenschaftstheorie und Technikphilosophie), Sandra Richter (Institut für Literaturwissenschaft / Neuere Deutsche Literatur I), Achim Stein (Institut für Linguistik / Romanistik), Claus Zittel (Stuttgart Research Centre for Text Studies)

Short description

The "Center for Reflected Text Analytics" (CRETA) focuses on the development of technical tools and a general workflow methodology for text analysis within Digital Humanities. Of particular importance are the transparency of tools and the traceability of results, so that they can be employed in a critically reflected way. CRETA is a collaborative project with partners from literary studies, linguistics, history, political science, philosophy, computational linguistics, and data visualization.

Sponsor German Federal Ministry for Education and Research (BMBF)
 
Title KABI: Confidence Estimation for Biomedical Information Extraction
Term January 2016 - December 2017
PI Roman Klinger

Short description

In the Life Sciences, most information is only available as free text in scientific publications. Automatic methods that extract such knowledge and provide it in structured databases face a dilemma: especially when potentially new information is detected in text, it is unclear whether this information is actually correct or whether it was wrongly extracted, for instance because the text is formulated in an uncommon way. In this project, methods will be developed that help to estimate the reliability of information extracted from biomedical publications.

Sponsor Ministry of Science, Research and the Arts Baden-Württemberg and University of Stuttgart (Program: RiSC – Research Seed Capital)
 
Title Debate Explorer
Term January 2016 - August 2016
PI Jonas Kuhn

Short description

Interactive Journalistic Investigation of Large Text Collections

Sponsor Volkswagen Stiftung
 
Title ePoetics
Term March 2013 - February 2016
PI Jonas Kuhn

Short description

Corpus building and visualization of German poetics (1770-1960) for "Algorithmic Criticism"

Sponsor German Federal Ministry for Education and Research (BMBF)
 
Title eIdentity
Term May 2012 - April 2015
PI Jonas Kuhn

Short description

Multiple collective identities in international debates regarding war and peace since the end of the Cold War. Language technology tools and methods for the analysis of multilingual text in the social sciences.

Sponsor German Federal Ministry for Education and Research (BMBF)
 
Title Distributional Approaches to Semantic Relatedness
Term November 2011 - January 2017
PI Sabine Schulte im Walde

Short description

The project applies an interdisciplinary approach to explore the potential and the limits of distributional approaches to semantic relatedness.

Sponsor German Research Foundation (DFG)
 
Title Kobalt-DaF
Term September 2011 - December 2014
PI Heike Zinsmeister

Short description

A scientific network for corpus-based analysis of texts written by learners of German as a foreign language.

Project homepage (in German): http://www.kobalt-daf.de

Sponsor German Research Foundation (DFG)
 
Title CLARIN-D
Term May 2011 - April 2014
PI Jonas Kuhn

Short description

A web- and centre-based research infrastructure for the social sciences and humanities.

Sponsor German Federal Ministry for Education and Research (BMBF)
 
Title Tree Transducers in Machine Translation
Term February 2011 - January 2017
PI Andreas Maletti

Short description

Firstly, we would like to develop an adequate translation model for syntax-based machine translation together with the basic algorithms that operate on it. This research effort should culminate in a competitive and publicly available toolkit that implements this translation model. Secondly, we would like to generalize the existing machine translation technology to fully support a syntax-based approach. This includes the development of tree-to-tree alignments, tree-based metrics, and syntax-based features. To illustrate the applicability of our results, we will develop a syntax-based translation system based on our toolkit.

Sponsor German Research Foundation (DFG)
 
Title Prosodic Phrasing in Auditory and Visual Sentence Processing
Term April 2010 - October 2014
PI Dr. Petra Augurzky, University of Tübingen

Short description

The project investigates the status of prosodic phrasing in auditory and visual sentence processing. Prosodic phrase boundaries can influence the syntactic structure as well as the processing of argument status. The aim of the project is to identify the exact timing and functional make-up of prosody processing by measuring event-related potentials (ERPs). As a starting point for these experiments, behavioral production and perception studies will determine the typical prosodic realization for various structures.

DFG Grant to G. Dogil, A. Alexiadou & B. Kotchubey

Sponsor German Research Foundation (DFG)
 
Title Models of Morphosyntax for Statistical Machine Translation
Term October 2009 - September 2012 (first phase)
PI Alexander Fraser, Hinrich Schütze

Short description

The proposed project would use advances in automatic linguistic analysis of syntax and morphology to advance statistical MT. The dependencies between morphology, syntax and translation should be directly modeled. This will lead to the creation of translation models and search algorithms that will dramatically improve translation quality for morphologically rich languages.

Sponsor German Research Foundation (DFG)
 
Title Collaborative Research Center 732 "Incremental Specification in Context"
Term 2006-2018

Short description

The scientific goal of the SFB 732 is to achieve a better understanding of the mechanisms that lead to ambiguity control/disambiguation as well as the enrichment of missing/incomplete information and to develop methods that are able to fully describe these mechanisms. The basic hypothesis in the SFB is that such processes generally involve specification of an underspecified input. Our research involves statistical, rule-based, comparative and corpus-based methods, to which we will add experimental methods in the second phase of funding.

Sponsor German Research Foundation (DFG)
 
Title Phonetic Perceptual Reference Space for Prosodic Phonological Categories
Term July 2006 - June 2009
PI Bernd Möbius, Grzegorz Dogil

Short description

The research program of the project is situated at the interface between phonology and phonetics. Its principal goal is to define the perceptual reference space for prosodic categories. The methodology applied towards this goal is both experimental and computational. In the experimental work classical paradigms such as Categorical Perception and the Perceptual Magnet Effect are applied to determine which prosodic categories posited by phonological theory have a distinctive representation in the perceptual phonetic reference space. The computational model that we are developing serves to formulate hypotheses and make predictions of test results.

Sponsor German Research Foundation (DFG)
 
Title The TIGER Project
Term 1999-2004
PI Peter Eisenberg (Potsdam), Christian Rohrer (Stuttgart), Hans Uszkoreit (Saarbrücken)
Sponsor German Research Foundation (DFG)
 
Title Verbmobil
Term February 1993 - September 2000

Short description

Verbmobil was a long-term project for the development of a mobile translation system for the translation of spontaneous speech in face-to-face situations.

Sponsor German Federal Ministry for Education and Research (BMBF)
 
Title Development of a prosodic module for Discourse Representation Theory (DRT)
Term March 1995 - December 1997
PI Prof. Dr. phil. habil. Grzegorz Dogil

Short description

The goal of subproject C4 is to develop a prosodic module for Discourse Representation Theory (DRT). Extending DRT, a dynamic theory of meaning, with a prosodic component is particularly sensible and natural, since prosody is the main means of expressing the dynamics of language. Many approaches intuitively ascribe to prosody an important role in the interpretation of discourse. This project investigates that question empirically.

Further information on the project can be found here.

Sponsor German Research Foundation (DFG)
 
Title Relator
Term December 1993 - July 1995
PI University of Pisa (Coordinator), DFKI, Universität des Saarlandes, Saarbrücken, LIMSI-CNRS (Orsay/Paris), University of Edinburgh (among others)

Short description

The project aims at defining a broad organizational framework for the creation, storage, dissemination and maintenance of language resources for both spoken and written language; such resources are necessary for the development of NLP and speech processing products and services but also for research.

Sponsor Fully funded by the Commission of the European Community, DG XIII E5, Luxembourg (under the LRE programme, Linguistic Research and Engineering)
 
Title DELIS - Descriptive Lexical Specifications
Term 1993 - 1995

Short description

In a cooperation between computational and theoretical linguists, lexicographers and software builders, tools for the corpus-based construction of lexicons are developed. These tools support the acquisition of linguistic evidence from textual corpora, as well as the construction, maintenance and prototyping-like stepwise enhancement of lexical descriptions in the format of typed feature structures. Parallel dictionary fragments for the major lexical semantic classes of English, French, Italian, Danish and Dutch will be described at the levels of syntax and semantics, including in particular the interaction between the two levels. The representation of lexical descriptions and the tools for population of dictionary models and for model evolution will be based on the typed feature structure system TFS, an implementation of typed feature logics developed since 1988 in a previous project of the institute.

Sponsor Funded partly by DG XIII E 4 of the Commission of the European Community, Luxembourg (under the LRE programme, Linguistic Research and Engineering)
 
Title Textual corpora and tools for their exploration
Term 1993-1994, 1995-1996

Short description

In 1993/1994 the project collected textual material for German, French and Italian and developed a representation for texts and markup, along with a query language and a corpus access system for linguistic exploration of the text material. Texts and analysis results are kept separate from each other for reasons of flexibility and extensibility of the system; this is possible because of a particular approach to storage and representation. Tool components under development, language-specific and general, range from morphosyntactic analysis to partial parsing, and from mutual information, t-score, collocation extraction and clustering to HMM-based tagging and n-gram tagging. Research on statistical models for noun phrases, verb-object collocations, etc. is ongoing.
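
As a rough illustration of two of the association measures named above (mutual information and t-score), the following Python sketch scores a single word pair from raw corpus counts. The function names and the example counts are hypothetical and are not taken from the project's tools; the formulas are the pointwise mutual information and t-score as they are commonly defined for collocation extraction.

```python
import math

def expected_cooccurrence(freq_x, freq_y, corpus_size):
    """Co-occurrence count expected if the two words were independent."""
    return freq_x * freq_y / corpus_size

def pointwise_mutual_information(freq_xy, freq_x, freq_y, corpus_size):
    """log2 of observed over expected co-occurrence frequency."""
    expected = expected_cooccurrence(freq_x, freq_y, corpus_size)
    return math.log2(freq_xy / expected)

def t_score(freq_xy, freq_x, freq_y, corpus_size):
    """(observed - expected) / sqrt(observed), a common collocation measure."""
    expected = expected_cooccurrence(freq_x, freq_y, corpus_size)
    return (freq_xy - expected) / math.sqrt(freq_xy)

# Hypothetical counts: the pair occurs 8 times, the single words 16 and
# 32 times, in a corpus of 1024 tokens.
pmi = pointwise_mutual_information(8, 16, 32, 1024)
t = t_score(8, 16, 32, 1024)
print(f"PMI = {pmi:.2f}, t-score = {t:.2f}")
```

Mutual information tends to favour rare pairs, while the t-score favours frequent ones, which is one reason the two measures are often reported side by side.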

Sponsor The Ministry of Science and Research of the Land Baden-Württemberg (MWF, Stuttgart), in 1993/1994 and 1995/1996, in the framework of the Forschungsschwerpunktprogramm Baden-Württemberg
 
Title IMS Textcorpora and Lexicon Group

Short description

The Textcorpora and Lexicon Group was a research group at IMS that brought together the researchers from different projects that were developing lexicons, corpora, and tools to work with them.

The major focus of the Textcorpora and Lexicon Group at the IMS is the creation of large-scale, high-quality lexicons for natural language applications. 'Large scale' and 'high quality' can only be obtained simultaneously if appropriate engineering methods are applied. Therefore, we use text retrieval tools and information extraction methods specialized for the field of lexicography. This approach is usually called 'corpus-based lexicography'.

 
Title WordGraph

Short description

The goal of the research project WordGraph is to develop new approaches for the acquisition of lexical information from text corpora. These approaches are based on graph theory. In particular, we are investigating node similarity algorithms such as SimRank for the induction and extension of bilingual lexicons.
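
To give an idea of what a node similarity algorithm such as SimRank computes, here is a minimal Python sketch of the naive iterative formulation: two nodes are similar if their in-neighbours are similar. It is not the project's implementation; the function name, the decay factor of 0.8, the iteration count and the toy graph are all assumptions made for illustration, and the sketch uses a single monolingual graph rather than the bilingual setting described above.

```python
from itertools import product

def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank.

    in_neighbors maps every node to the list of its in-neighbours;
    every neighbour must itself be a key of the mapping.
    """
    nodes = list(in_neighbors)
    # Start with similarity 1 for identical nodes and 0 otherwise.
    sim = {(a, b): 1.0 if a == b else 0.0
           for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # Nodes without in-neighbours are similar to nothing else.
                new_sim[(a, b)] = 0.0
                continue
            total = sum(sim[(i, j)] for i in in_a for j in in_b)
            new_sim[(a, b)] = decay * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim

# Hypothetical toy graph: two words that are linked from the same node.
toy_graph = {
    "context": [],
    "house": ["context"],
    "home": ["context"],
}
scores = simrank(toy_graph)
print(scores[("house", "home")])
```

The naive version compares all node pairs in every iteration, so it is only practical for small graphs.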

Sponsor German Research Foundation (DFG)
 
Title The ParGram Project in Stuttgart

Short description

The major goals of the project are the analysis and encoding of important and most generally occurring syntactic structures in German, and the development of parallel analyses for crosslinguistic phenomena. The parallel nature of the analyses is ensured through the concurrent development of German, English, Norwegian, and Japanese LFG grammars.

 
Title QuaDramA - Quantitative Drama Analytics
Term April 2017 - March 2020
PI Nils Reiter (IMS), Marcus Willand (Institute for Literary Studies)

Short description

QuaDramA aims to extend the possibilities for large-scale analysis of dramas, focusing on the dramatic figure. More than 600 German dramas, mainly from 1740 to 1920, will be examined by integrating structure analysis with content analysis of the figure speech. This integration is enabled by the use of tools for natural language processing (NLP), which are domain-adapted to this specific text type. The planned integration is not a trivial task, because speech content and dramatic structure are very different sources of information that need to be analysed jointly and in a systematic way in order to provide added value for the literary interpretation. The project will empirically analyse different aspects of figures in dramas and relate empirical findings to existing drama-historic theories.

QuaDramA web page