The IMSLex dictionary database is our central lexicon repository. It
covers inflection, word formation, and valence information for
several tens of thousands of German base forms (see details below). From the
IMSLex database, we derive specialized lexicon data for various
applications in natural language processing, information retrieval,
and information extraction. Where necessary, semantic information can
be added from the Tübingen GermaNet lexical-semantic dictionary.
Technology
To keep the dictionary data as flexible as possible, it is encoded in
XML. For efficiency reasons, however, the data is stored in a
relational database with the help of XML-DBMS.
The lexical data itself has been built up semi-automatically using
specialized text mining methods from corpus linguistics.
For further information, please refer to the dissertation of Arne
Fitschen, Ein Computerlinguistisches Lexikon als komplexes System
(ps) (pdf), the IMSLex related publications, and the list of IMSLex
related projects.
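As a rough illustration of keeping entries in XML while querying them relationally, the following sketch loads a small XML lexicon into an in-memory SQLite database. Everything in it is an assumption made for illustration: the element and attribute names (`entry`, `stem`, `lemma`, `pos`, `type`), the two-table layout, and the use of SQLite are hypothetical and do not reproduce the actual IMSLex schema or the XML-DBMS mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical entry format; the real IMSLex XML schema is not shown here.
SAMPLE_XML = """
<lexicon>
  <entry lemma="Haus" pos="noun">
    <stem type="inflection">Haus</stem>
    <stem type="composition">Haus</stem>
  </entry>
  <entry lemma="kaufen" pos="verb">
    <stem type="inflection">kauf</stem>
  </entry>
</lexicon>
"""


def load_entries(xml_text, conn):
    """Map each <entry> to one row in `entry` and each <stem> to one row in `stem`."""
    conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, lemma TEXT, pos TEXT)")
    conn.execute("CREATE TABLE stem (entry_id INTEGER, type TEXT, form TEXT)")
    root = ET.fromstring(xml_text)
    for entry in root.iter("entry"):
        cur = conn.execute(
            "INSERT INTO entry (lemma, pos) VALUES (?, ?)",
            (entry.get("lemma"), entry.get("pos")),
        )
        for stem in entry.iter("stem"):
            conn.execute(
                "INSERT INTO stem (entry_id, type, form) VALUES (?, ?, ?)",
                (cur.lastrowid, stem.get("type"), stem.text),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_entries(SAMPLE_XML, conn)

# A relational query over data that was authored as XML.
rows = conn.execute(
    "SELECT e.lemma, s.type, s.form "
    "FROM entry e JOIN stem s ON s.entry_id = e.id "
    "ORDER BY e.id, s.type"
).fetchall()
print(rows)
```

The point of the split is that the XML stays the flexible authoring format, while lookups such as "all composition stems of nouns" become ordinary joins.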
Applications
The IMSLex data can be used in many ways:
- creation of full form lexicons
  With the help of the AMOR generator of inflected forms, one can create
  full form lexicons such as the one incorporated in the statistical
  lexicon for the TreeTagger part-of-speech tagger.
  A sample full form lexicon (adjectives, nouns, and verbs starting
  with 'p') can be downloaded here.
- on-line morphological and syntactic analysis
  For example, the German ParGram grammar uses IMSLex as its lexical
  database for deep syntactic analysis.
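To make the idea of a full form lexicon concrete, here is a minimal sketch that expands inflection stems into inflected forms and indexes them by surface form. It is not the AMOR generator: the paradigm table, the tag names, and the function names are hypothetical, and only the present tense of one regular verb class is covered.

```python
# Hypothetical inflection class: (ending, morphological tag) pairs.
# Tag names are invented for this sketch and are not AMOR's tag set.
PARADIGMS = {
    "verb_weak_present": [
        ("e", "1.Sg.Pres.Ind"),
        ("st", "2.Sg.Pres.Ind"),
        ("t", "3.Sg.Pres.Ind"),
        ("en", "1.Pl.Pres.Ind"),
        ("t", "2.Pl.Pres.Ind"),
        ("en", "3.Pl.Pres.Ind"),
    ],
}


def expand(lemma, stem, paradigm):
    """Return (inflected form, lemma, tag) triples for one lexicon entry."""
    return [(stem + ending, lemma, tag) for ending, tag in PARADIGMS[paradigm]]


def build_full_form_lexicon(entries):
    """Index all generated forms by surface form; ambiguous forms keep every reading."""
    lexicon = {}
    for lemma, stem, paradigm in entries:
        for form, lem, tag in expand(lemma, stem, paradigm):
            lexicon.setdefault(form, []).append((lem, tag))
    return lexicon


lexicon = build_full_form_lexicon([("planen", "plan", "verb_weak_present")])
for form in sorted(lexicon):
    print(form, lexicon[form])
```

A tagger can then look up a token such as "plant" directly and receive all of its candidate readings, without running morphological analysis at tagging time.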
Current lexicon size (May 2003)

|              | inflection stem | derivation stem | composition stem | valence info |
|--------------|-----------------|-----------------|------------------|--------------|
| adjectives   | 11,000          | 23              | 80               | 2,000        |
| adverbs      | 1,000           | n/a             | n/a              | n/a          |
| nouns        | 22,500          | 1,000           | 12,500           | 10,000       |
| particles    | 300             | n/a             | n/a              | n/a          |
| proper nouns | 10,000          |                 |                  |              |
| verbs        | 6,000           | 350             | 160              | 6,000        |

167 derivation suffixes

For the size and contents of the GermaNet lexical-semantic dictionary,
please refer to the GermaNet homepage.