SFB 340 - project B12
"Methods for extending, maintaining and
optimizing a large grammar of German"
The two main focuses of this project are
- the development and extension of a broad-coverage LFG grammar of German, and
- methodological issues of grammar development.
The project cooperates closely with the Pargram project and likewise uses the Xerox Linguistic Environment as its development platform.
In the past, time pressure has often kept broad-coverage grammar development projects from assessing alternative analyses of a particular phenomenon, and from refining and publishing methods for extending, maintaining, testing and optimizing grammar code.
The B12 project addresses these issues explicitly. For selected phenomena, the effect of adopting different analyses is investigated: which linguistic generalizations are captured, how maintenance is affected, what interactions occur, etc.
Closely related to such experiments is the work on tools and procedures that technically ensure controlled comparisons and support the evaluation of the results. This includes, among other things:
- a specialized revision control scheme covering grammar code modules, morphological analyzers, reference corpora, etc.,
- testsuite organization in a database scheme (cf. work in the TSNLP project),
- lexicon organization and issues of lexicon acquisition,
- annotation of test data with target expressions for regression testing,
- diagnostics for quantitative evaluation of interaction effects (i.e., targeted comparison of parsing times for particular phenomena).
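As a rough illustration of how three of the items above could fit together (database-organized test suite, target annotations for regression testing, and per-phenomenon timing diagnostics), here is a minimal Python sketch. It is not the project's actual tooling. The table layout, the choice of "expected number of analyses" as the target annotation, the example sentences, and all result figures are assumptions made up for the example, and the parser is a stub that looks results up in a table rather than calling any real grammar system.

```python
import sqlite3
from collections import defaultdict

# Hypothetical test items: (phenomenon, sentence, expected number of analyses).
TEST_ITEMS = [
    ("verb-second", "Maria liest das Buch.", 1),
    ("verb-second", "Das Buch liest Maria.", 1),
    ("pp-attachment", "Er sieht den Mann mit dem Fernglas.", 2),
]


def build_testsuite(items):
    """Store annotated test items in an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, phenomenon TEXT, "
        "sentence TEXT, expected_analyses INTEGER)"
    )
    db.executemany(
        "INSERT INTO item (phenomenon, sentence, expected_analyses) "
        "VALUES (?, ?, ?)",
        items,
    )
    return db


def run_regression(db, parse):
    """Run `parse` on every item; return (failures, seconds per phenomenon).

    `parse` is any callable mapping a sentence to (n_analyses, seconds).
    """
    failures = []
    seconds = defaultdict(float)
    rows = db.execute(
        "SELECT phenomenon, sentence, expected_analyses FROM item ORDER BY id"
    )
    for phenomenon, sentence, expected in rows:
        found, elapsed = parse(sentence)
        seconds[phenomenon] += elapsed
        if found != expected:
            failures.append((sentence, expected, found))
    return failures, dict(seconds)


def make_stub_parser(results):
    """Stand-in for a real parser: looks results up in a fixed table."""
    return lambda sentence: results[sentence]


# Invented results for two grammar versions: (n_analyses, seconds).
OLD_GRAMMAR = {
    "Maria liest das Buch.": (1, 0.25),
    "Das Buch liest Maria.": (1, 0.25),
    "Er sieht den Mann mit dem Fernglas.": (2, 1.0),
}
NEW_GRAMMAR = {
    "Maria liest das Buch.": (1, 0.25),
    "Das Buch liest Maria.": (1, 0.5),
    "Er sieht den Mann mit dem Fernglas.": (3, 0.5),
}

db = build_testsuite(TEST_ITEMS)
old_failures, old_seconds = run_regression(db, make_stub_parser(OLD_GRAMMAR))
new_failures, new_seconds = run_regression(db, make_stub_parser(NEW_GRAMMAR))

# Compare parsing time per phenomenon across the two grammar versions.
for phenomenon in old_seconds:
    print(phenomenon, old_seconds[phenomenon], new_seconds[phenomenon])
print("regressions in new grammar:", new_failures)
```

In this invented data set, the new grammar version parses the PP-attachment sentence faster but yields one analysis more than the annotated target, so the item is reported as a regression. Grouping the timings by phenomenon shows where a change in one analysis affects the cost of another.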
Publications:
- Jonas Kuhn and Christian Rohrer. 1997. Approaching
ambiguity in real-life sentences -- the application of an Optimality
Theory-inspired constraint ranking in a large-scale LFG grammar.
(.ps.gz) DGfS-CL 1997, Heidelberg.
- Jonas Kuhn. 1998.
Some recent extensions of the LFG formalism and
their application in broad-coverage grammars.
Slides of a presentation given at the Blaubeuren Workshop
"Applications of Constraint-Based Programming
to Computational Linguistics", 2-6 May 1998.
Slides (.ps.gz) -- 4 slides on a page (.ps.gz)
- Jonas Kuhn, Judith Eckle-Kohler & Christian Rohrer. 1998.
Acquisition with and for Symbolic NLP-Systems
-- a Bootstrapping Approach. To appear in: Proceedings of the First
International Conference on Language Resources and Evaluation
(LREC98). Granada, Spain.
.ps.gz -- .ps
- Jonas Kuhn. 1998.
Data-intensive testing of a broad-coverage LFG grammar. To appear
in: Proceedings of KONVENS 98, Bonn, October 1998. .ps.gz
(reduced 2-page version: .ps.gz -- .ps)