2.5 Corpus query matches

To be as flexible as possible, the TIGER-XML format has also been designed to represent corpus query matches. The following example illustrates the encoding of the match information for the query #v:[cat="NP"] > #w:[pos="NN"], which binds #v to an NP node and #w to a token tagged NN that #v directly dominates, together with the matching corpus graph.

    <match subgraph="s5_500">
      <variable name="#v" idref="s5_500"/>
      <variable name="#w" idref="s5_2"/>
    </match>

Figure: Example sentence and its match visualization (red-colored)

Matches are represented by <match> elements. The <variable> elements refer to the graph nodes bound to the query variables #v and #w. The IDs of the <t> and <nt> elements are therefore essential to both mechanisms: edge linking and match references. The subgraph attribute of a <match> element refers to the root node of the matching subgraph.
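The reference mechanism can be illustrated with a short Python sketch. It is not part of TIGER-XML or of any TIGER tool: the function name resolve_matches and the trimmed-down sample sentence are assumptions made for illustration, and the code only relies on the element and attribute names described above. It indexes all <t> and <nt> elements by ID and then looks up each idref and subgraph reference in that index.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample: only the nodes needed by the single match.
SENTENCE = """
<s id="s5">
  <graph root="s5_500">
    <terminals>
      <t id="s5_1" word="Die" pos="ART"/>
      <t id="s5_2" word="Tagung" pos="NN"/>
    </terminals>
    <nonterminals>
      <nt id="s5_500" cat="NP">
        <edge label="NK" idref="s5_1"/>
        <edge label="NK" idref="s5_2"/>
      </nt>
    </nonterminals>
  </graph>
  <matches>
    <match subgraph="s5_500">
      <variable name="#v" idref="s5_500"/>
      <variable name="#w" idref="s5_2"/>
    </match>
  </matches>
</s>
"""


def resolve_matches(sentence_xml):
    """Return one dict per <match> with its subgraph root and variable bindings."""
    sentence = ET.fromstring(sentence_xml)
    # Index every <t> and <nt> by ID; edges and matches both point at these IDs.
    nodes = {n.get("id"): n for n in sentence.iter() if n.tag in ("t", "nt")}
    resolved = []
    for match in sentence.iter("match"):
        bindings = {
            var.get("name"): nodes[var.get("idref")]
            for var in match.findall("variable")
        }
        resolved.append({"subgraph": nodes[match.get("subgraph")], "bindings": bindings})
    return resolved


for result in resolve_matches(SENTENCE):
    for name, node in result["bindings"].items():
        # Nonterminals carry a category, terminals carry a word form.
        print(name, "->", node.tag, node.get("id"), node.get("cat") or node.get("word"))
```

A dangling idref raises a KeyError here, which is a deliberate choice for a sketch: a match that points at a missing node indicates a broken file rather than something to skip silently.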

Taken together, the corpus graph and the query result are encoded as follows:

<s id="s5">
  <graph root="s5_504">
    <terminals>
      <t id="s5_1" word="Die" pos="ART" morph="Def.Fem.Nom.Sg"/>
      <t id="s5_2" word="Tagung" pos="NN" morph="Fem.Nom.Sg.*"/>
      <t id="s5_3" word="hat" pos="VVFIN" morph="3.Sg.Pres.Ind"/>
      <t id="s5_4" word="mehr" pos="PIAT" morph="--"/>
      <t id="s5_5" word="Teilnehmer" pos="NN" morph="Masc.Akk.Pl.*"/>
      <t id="s5_6" word="als" pos="KOKOM" morph="--"/>
      <t id="s5_7" word="je" pos="ADV" morph="--"/>
      <t id="s5_8" word="zuvor" pos="ADV" morph="--"/>
    </terminals>
    <nonterminals>
      <nt id="s5_500" cat="NP">
        <edge label="NK" idref="s5_1"/>
        <edge label="NK" idref="s5_2"/>
      </nt>
      <nt id="s5_501" cat="AVP">
        <edge label="CM" idref="s5_6"/>
        <edge label="MO" idref="s5_7"/>
        <edge label="HD" idref="s5_8"/>
      </nt>
      <nt id="s5_502" cat="AP">
        <edge label="HD" idref="s5_4"/>
        <edge label="CC" idref="s5_501"/>
      </nt>
      <nt id="s5_503" cat="NP">
        <edge label="NK" idref="s5_502"/>
        <edge label="NK" idref="s5_5"/>
      </nt>
      <nt id="s5_504" cat="S">
        <edge label="SB" idref="s5_500"/>
        <edge label="HD" idref="s5_3"/>
        <edge label="OA" idref="s5_503"/>
      </nt>
    </nonterminals>
  </graph>
  <matches>
    <match subgraph="s5_500">
      <variable name="#w" idref="s5_2"/>
      <variable name="#v" idref="s5_500"/>
    </match>
    <match subgraph="s5_503">
      <variable name="#w" idref="s5_5"/>
      <variable name="#v" idref="s5_503"/>
    </match>
  </matches>
</s>
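To see where the two matches come from, the query can be re-evaluated by hand over the graph. The following Python sketch is only an illustration and not the way a TIGER query processor works internally: the helper names find_np_over_nn and to_matches_xml are hypothetical, the morph attributes are omitted for brevity, and the query is hard-coded as "an NP node with an edge to a token tagged NN". Because #v dominates #w in this query, the sketch assumes that the subgraph root coincides with the node bound to #v.

```python
import xml.etree.ElementTree as ET

# The example graph from above, without the morph attributes.
GRAPH = """
<graph root="s5_504">
  <terminals>
    <t id="s5_1" word="Die" pos="ART"/>
    <t id="s5_2" word="Tagung" pos="NN"/>
    <t id="s5_3" word="hat" pos="VVFIN"/>
    <t id="s5_4" word="mehr" pos="PIAT"/>
    <t id="s5_5" word="Teilnehmer" pos="NN"/>
    <t id="s5_6" word="als" pos="KOKOM"/>
    <t id="s5_7" word="je" pos="ADV"/>
    <t id="s5_8" word="zuvor" pos="ADV"/>
  </terminals>
  <nonterminals>
    <nt id="s5_500" cat="NP"><edge label="NK" idref="s5_1"/><edge label="NK" idref="s5_2"/></nt>
    <nt id="s5_501" cat="AVP"><edge label="CM" idref="s5_6"/><edge label="MO" idref="s5_7"/><edge label="HD" idref="s5_8"/></nt>
    <nt id="s5_502" cat="AP"><edge label="HD" idref="s5_4"/><edge label="CC" idref="s5_501"/></nt>
    <nt id="s5_503" cat="NP"><edge label="NK" idref="s5_502"/><edge label="NK" idref="s5_5"/></nt>
    <nt id="s5_504" cat="S"><edge label="SB" idref="s5_500"/><edge label="HD" idref="s5_3"/><edge label="OA" idref="s5_503"/></nt>
  </nonterminals>
</graph>
"""


def find_np_over_nn(graph_xml):
    """Return (v_id, w_id) pairs: an NP node and an NN token it has an edge to."""
    graph = ET.fromstring(graph_xml)
    pos_by_id = {t.get("id"): t.get("pos") for t in graph.iter("t")}
    pairs = []
    for nt in graph.iter("nt"):
        if nt.get("cat") != "NP":
            continue
        for edge in nt.findall("edge"):
            target = edge.get("idref")
            # Edges to other nonterminals are absent from pos_by_id and are skipped.
            if pos_by_id.get(target) == "NN":
                pairs.append((nt.get("id"), target))
    return pairs


def to_matches_xml(pairs):
    """Serialize the pairs as <match> elements inside a <matches> wrapper."""
    matches = ET.Element("matches")
    for v_id, w_id in pairs:
        # Assumption: #v dominates #w, so #v is also the subgraph root.
        match = ET.SubElement(matches, "match", {"subgraph": v_id})
        ET.SubElement(match, "variable", {"name": "#w", "idref": w_id})
        ET.SubElement(match, "variable", {"name": "#v", "idref": v_id})
    return ET.tostring(matches, encoding="unicode")


pairs = find_np_over_nn(GRAPH)
print(pairs)
print(to_matches_xml(pairs))
```

The sketch finds the pairs (s5_500, s5_2) and (s5_503, s5_5), which correspond to the two <match> elements in the encoding above.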