- Partners:
- University of Stuttgart, Institute for Romance Linguistics and
Institute for Computer Science (Department of Artificial Intelligence)
- Framework:
- Fully funded (100%) by the Ministry of Science and
Research of the Land Baden-Württemberg (MWF, Stuttgart), in
1993/1994 and 1995/1996, within the
Forschungsschwerpunktprogramm Baden-Württemberg.
- Description of the project:
-
In 1993/1994 the project collected textual material for German, French
and Italian, and developed a representation for texts and markup, along
with a query language and a corpus access system for linguistic
exploration of the text material. Texts and analysis results are kept
separate from each other for the sake of flexibility and extensibility
of the system; this is made possible by the project's approach to
storage and representation. Tool components under development,
both language-specific and general, range from morphosyntactic analysis to
partial parsing, and from mutual information, t-score, collocation
extraction and clustering to HMM-based tagging and n-gram tagging.
Research on statistical models for noun phrases, verb-object
collocations, etc. is ongoing.
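The mutual information and t-score measures mentioned above can be illustrated with a minimal sketch. It is not the project's own implementation, which is not described here; it assumes the commonly used definitions over adjacent word pairs, with MI = log2(f(x,y) / E) and t = (f(x,y) - E) / sqrt(f(x,y)), where E = f(x) * f(y) / N is the expected pair frequency under independence. The function name `bigram_association` and the toy token list are hypothetical.

```python
import math
from collections import Counter


def bigram_association(tokens):
    """Return {(w1, w2): (mi, t_score)} for every adjacent word pair.

    Assumed definitions (not taken from the project description):
      expected = f(w1) * f(w2) / N
      mi       = log2(f(w1, w2) / expected)
      t_score  = (f(w1, w2) - expected) / sqrt(f(w1, w2))
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), f_xy in bigrams.items():
        # Pair frequency expected if w1 and w2 occurred independently.
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(f_xy / expected)
        t_score = (f_xy - expected) / math.sqrt(f_xy)
        scores[(w1, w2)] = (mi, t_score)
    return scores


if __name__ == "__main__":
    toy = "a b a b c d".split()
    # List pairs ordered by t-score, highest first.
    ranked = sorted(bigram_association(toy).items(),
                    key=lambda item: item[1][1], reverse=True)
    for pair, (mi, t_score) in ranked:
        print(pair, round(mi, 3), round(t_score, 3))
```

Mutual information tends to favour rare pairs, while the t-score favours frequent ones, which is one reason the two measures are often reported side by side in collocation extraction.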
- Duration:
- 04-1993 through 12-1994 (extended until the end of 1996)
- IMS Contact:
-
Uli Heid
(uli@ims.uni-stuttgart.de)
More detailed descriptions are available for the following work
packages of IMS:
-
The part-of-speech tagset for German, mainly developed by Anne
Schiller and Christine Thielen (University of Tübingen)
- The various taggers built by André Kempe and Helmut Schmid
- The IMS Corpus Workbench, developed by Oliver Christ and Bruno Schulze
IMS Stuttgart, Thu Oct 7 10:11:55 1999 (www@ims.uni-stuttgart.de)