1. Introduction

In this chapter, the TIGER language will be introduced, the query language for the TIGERSearch query engine. Actually, the TIGER language is not only a query language, but a general decription language for syntax graphs, i.e. restricted directed acyclic graphs. Syntax graphs are close relatives to (syntax) trees, to feature structures and to dependency graphs. Syntax graphs are syntax trees with two additions: Edges may have labels, and crossing edges are permitted. On the other hand, syntax graphs differ from feature structures due to the following two properties: First, the leaf nodes are ordered (like in syntax trees). Second, two edges must not join in a common node. This means that the structure sharing mechanism of feature structures is ruled out. However, structure sharing can be expressed on the additional level of so-called secondary edges.

We call the TIGER language a 'description language' since we want to emphasize that, in principle, the language serves to represent corpus annotations as well as corpus queries. In the TIGER system, for corpus annotation only a proper sublanguage of the TIGER description language may be used which does not include any possibility for underspecification. Note that, for reasons of technical convenience, the corpus annotation has to be encoded in the XML-counterpart of the corpus annotation sublanguage (TIGER-XML, cf. chapter V). The TIGER language accomodates a wide range of common treebank formats.

Acknowledgements and references

Syntax graphs are a formalization of the so-called Negra data structure, which has been used to encode the first major treebank for German, the Negra corpus (cf. [SkutEtAl1997]).

The design and the formal definition of the TIGER language has been influenced by other work on formal languages:

unification-based formalisms

(cf. [DoerreDorna1993], [Doerre1996], [DoerreEtAl1996], [EmeleZajac1990], [Emele1997], [HoehfeldSmolka1988], [Schmid1999])

tree description languages

(cf. [BlackburnEtAl1993], [DuchierNiehren1999], [MarcusEtAl1993], [RogersEtAl1992])

corpus query languages

(cf. [Christ1994], [ChristEtAl1999])

We want to thank our colleagues in the TIGER and DEREKO projects (cf. TIGER project homepage and DEREKO project homepage) for their creative input and their patience in discussing the details of earlier versions of this chapter.