University of Stuttgart
Institute for Natural Language Processing

Lexical Information for English

 
 

The statistical grammar model, a trained version of the English Head-Lexicalised Context-Free Grammar, is a source of lexical information.

Verb Subcategorisation

The statistical grammar model provides lexical information, with an emphasis on verb entries. For 16,946 verbs, we induced frequency and probability distributions over subcategorisation frames and argument selection.
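
To illustrate how a frequency distribution over frames relates to a probability distribution, here is a minimal sketch. It assumes plain relative-frequency normalisation, which is a simplification and not necessarily how the grammar model derives its probabilities. The function name, the frame labels and the counts are all hypothetical.

```python
from collections import Counter


def frame_distribution(frame_counts):
    """Normalise one verb's raw frame frequencies into probabilities."""
    total = sum(frame_counts.values())
    if total == 0:
        return {}
    return {frame: count / total for frame, count in frame_counts.items()}


# Hypothetical frame counts for a single verb (not taken from the model).
toy_counts = Counter({"np": 6, "np-pp": 3, "intrans": 1})
toy_probs = frame_distribution(toy_counts)
```

The resulting probabilities sum to 1 for each verb, so frames can be compared across verbs of very different overall frequency.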

Viterbi Parses

On the basis of the statistical grammar model, we parsed the whole British National Corpus (BNC, 117 million words) and determined the most probable (Viterbi) parse tree for each sentence.
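
The idea of a most probable parse can be sketched with a Viterbi variant of the CKY algorithm. The example below is only an illustration under strong assumptions: it uses a tiny, unlexicalised probabilistic context-free grammar in Chomsky normal form, whereas the model described here is head-lexicalised. All rules, probabilities and function names are hypothetical.

```python
import math
from collections import defaultdict

# Toy PCFG in Chomsky normal form; rules and probabilities are hypothetical.
LEXICAL = {  # word -> list of (category, probability)
    "she": [("NP", 0.5)],
    "fish": [("NP", 0.5), ("V", 0.4)],
    "eats": [("V", 0.6)],
}
BINARY = [  # (parent, left child, right child, probability)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 1.0),
]


def build_tree(best, i, j, cat):
    """Follow backpointers to rebuild the best tree for cat over words[i:j]."""
    _, back = best[(i, j)][cat]
    if isinstance(back, str):  # lexical entry: the backpointer is the word
        return (cat, back)
    k, left, right = back
    return (cat, build_tree(best, i, k, left), build_tree(best, k, j, right))


def viterbi_parse(words, start="S"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(words)
    # best[(i, j)][cat] = (log probability, backpointer) for words[i:j]
    best = defaultdict(dict)
    for i, word in enumerate(words):
        for cat, p in LEXICAL.get(word, []):
            best[(i, i + 1)][cat] = (math.log(p), word)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right, p in BINARY:
                    if left in best[(i, k)] and right in best[(k, j)]:
                        score = (math.log(p)
                                 + best[(i, k)][left][0]
                                 + best[(k, j)][right][0])
                        # Keep only the highest-scoring analysis per category.
                        if (parent not in best[(i, j)]
                                or score > best[(i, j)][parent][0]):
                            best[(i, j)][parent] = (score, (k, left, right))
    if start not in best[(0, n)]:
        return None
    return math.exp(best[(0, n)][start][0]), build_tree(best, 0, n, start)
```

Calling `viterbi_parse(["she", "eats", "fish"])` returns the probability of the best analysis together with a nested tuple representing its tree. A word sequence that the toy grammar cannot derive from `S` yields `None`.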



The data is freely available for education, research and other non-commercial purposes. Please contact Sabine Schulte im Walde for more information.