Textcorpora und Erschliessungswerkzeuge
('textual corpora and tools for their exploration')

Partners:
University of Stuttgart, Institute for Romance Linguistics
and the Institute for Computer Science (department of artificial intelligence)

Framework:
Fully funded by the Ministry of Science and Research of the Land Baden-Württemberg (MWF, Stuttgart) in 1993/1994 and 1995/1996, within the Forschungsschwerpunktprogramm Baden-Württemberg.

Description of the project:
In 1993/1994 the project collected textual material for German, French and Italian, and developed a representation for texts and markup together with a query language and a corpus access system for linguistic exploration of the text material. Texts and analysis results are kept separate from each other to keep the system flexible and extensible; a particular approach to storage and representation makes this possible. Tool components under development, both language-specific and general, range from morphosyntactic analysis to partial parsing, and from mutual information, t-score, collocation extraction and clustering to HMM-based and n-gram tagging. Research on statistical models for noun phrases, verb-object collocations, etc. is ongoing.
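
As an illustration of two of the association measures named above, the following minimal Python sketch computes mutual information and t-score for adjacent word pairs. It is not the project's implementation: the function name association_scores, the textbook formulas (MI = log2(O/E), t = (O - E) / sqrt(O)), and the use of the token count as the sample size for both unigrams and bigrams are assumptions made for this example.

```python
import math
from collections import Counter


def association_scores(tokens):
    """Return {(w1, w2): (mi, t_score)} for every adjacent word pair.

    Assumption: observed (O) is the bigram frequency and expected (E) is
    the frequency predicted if w1 and w2 occurred independently,
    E = f(w1) * f(w2) / N, with N the number of tokens.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), observed in bigrams.items():
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)               # mutual information
        t = (observed - expected) / math.sqrt(observed)   # t-score
        scores[(w1, w2)] = (mi, t)
    return scores


if __name__ == "__main__":
    # Hypothetical toy corpus, for demonstration only.
    sample = "strong tea strong tea weak tea strong coffee".split()
    ranked = sorted(association_scores(sample).items(),
                    key=lambda item: item[1][1], reverse=True)
    for pair, (mi, t) in ranked:
        print(pair, round(mi, 3), round(t, 3))
```

Mutual information tends to favour rare pairs, while the t-score favours frequent ones, which is one reason collocation extraction often reports both.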

Duration:
April 1993 through December 1994 (extended until the end of 1996)

IMS Contact:
Uli Heid (uli@ims.uni-stuttgart.de)

More detailed descriptions are available for the following work packages of IMS:


IMS Stuttgart, Thu Oct 7 10:11:55 1999 (www@ims.uni-stuttgart.de)