The supported data model is based on so-called syntax graphs, i.e. directed acyclic graphs with a single root node. Thus, corpus graphs cannot be encoded by embedding XML elements. As a solution, all terminal and nonterminal nodes are listed and edges are explicitly encoded as elements. The following example illustrates the corpus graph encoding.
Figure: Example sentence and its annotation
<body>
<s id="s5">
<graph root="s5_504">
<terminals>
<t id="s5_1" word="Die" pos="ART" morph="Def.Fem.Nom.Sg"/>
<t id="s5_2" word="Tagung" pos="NN" morph="Fem.Nom.Sg.*"/>
<t id="s5_3" word="hat" pos="VVFIN" morph="3.Sg.Pres.Ind"/>
<t id="s5_4" word="mehr" pos="PIAT" morph="--"/>
<t id="s5_5" word="Teilnehmer" pos="NN" morph="Masc.Akk.Pl.*"/>
<t id="s5_6" word="als" pos="KOKOM" morph="--"/>
<t id="s5_7" word="je" pos="ADV" morph="--"/>
<t id="s5_8" word="zuvor" pos="ADV" morph="--"/>
</terminals>
<nonterminals>
<nt id="s5_500" cat="NP">
<edge label="NK" idref="s5_1"/>
<edge label="NK" idref="s5_2"/>
</nt>
<nt id="s5_501" cat="AVP">
<edge label="CM" idref="s5_6"/>
<edge label="MO" idref="s5_7"/>
<edge label="HD" idref="s5_8"/>
</nt>
<nt id="s5_502" cat="AP">
<edge label="HD" idref="s5_4"/>
<edge label="CC" idref="s5_501"/>
</nt>
<nt id="s5_503" cat="NP">
<edge label="NK" idref="s5_502"/>
<edge label="NK" idref="s5_5"/>
</nt>
<nt id="s5_504" cat="S">
<edge label="SB" idref="s5_500"/>
<edge label="HD" idref="s5_3"/>
<edge label="OA" idref="s5_503"/>
</nt>
</nonterminals>
</graph>
</s>
</body>
Please note: Feature values, represented as attribute-value pairs, cannot be omitted.
If a feature value or edge label does not make sense for a
token or inner node (e.g. in the example sentence the feature morph is
sometimes unspecified), please use a meaningful symbol instead.
We recommend you to use the symbol -- which is also used in our
implemented import filters. When viewing a matching corpus graph using the
TIGERGraphViewer, the display of a feature value or edge label such as
-- can be suppressed
(cf. subsection 7.5, chapter IV).