The chunker for German was developed by Helmut Schmid and Sabine
Schulte im Walde. It is based on the German Head-Lexicalised
Probabilistic Context-Free Grammar (HL-PCFG).
The manually developed grammar was semi-automatically extended with
robustness rules to allow parsing of unrestricted text. The
model parameters were learned from unlabelled training data with the
probabilistic parser LoPar.
Chunking is performed for nouns, verbs, adjectives and adverbs.
The chunker is freely available via FTP
for education, research and other non-commercial purposes. The
gzip-compressed file contains the trained chunker, example chunks,
chunked newspaper example sentences, and documentation on the chunker.
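
As a rough illustration of the probabilistic grammar idea behind the
chunker, the following minimal sketch scores a small noun chunk under a
plain PCFG: the probability of a tree is the product of the
probabilities of the rules it uses. The rules, probability values and
the function name tree_probability are hypothetical and are not taken
from the German grammar or from LoPar. A head-lexicalised model would
additionally condition these probabilities on lexical heads, which this
sketch omits.

```python
# Toy rule probabilities (hypothetical values, for illustration only).
# Keys are (left-hand side, right-hand side) pairs.
RULE_PROBS = {
    ("NP", ("ART", "NN")): 0.6,
    ("NP", ("NN",)): 0.4,
    ("ART", ("die",)): 1.0,
    ("NN", ("Katze",)): 0.5,
    ("NN", ("Maus",)): 0.5,
}


def tree_probability(tree):
    """Return the product of the probabilities of all rules used in the tree.

    A tree is a pair (label, children); a child is either another tree
    or a plain string (a word).
    """
    label, children = tree
    # The right-hand side consists of child labels, or the words themselves.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROBS[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_probability(child)
    return p


# A noun chunk for "die Katze": NP -> ART NN, ART -> die, NN -> Katze.
noun_chunk = ("NP", [("ART", ["die"]), ("NN", ["Katze"])])
print(tree_probability(noun_chunk))
```

In a trained model the rule probabilities would be estimated from data
rather than written by hand, and the parser would search for the most
probable analysis instead of scoring a single given tree.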