The TIGERSearch software suite offers the following features:
 

Flexible data structures

  -  Directed acyclic graphs

TIGERSearch can handle texts whose sentences (or other linguistically meaningful segments) have been annotated with directed acyclic graphs. This means that a wide range of linguistic descriptions, such as syntax trees, functional structures, dependency-style structures, and predicate-argument structures, can be accommodated. Currently, the number of nodes in a graph is restricted by an upper bound.
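To illustrate what "directed acyclic graph" means for sentence annotation, here is a minimal Python sketch. It is not TIGERSearch's internal representation: the class name `SentenceGraph`, the node ids, and the limit `MAX_NODES` are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

MAX_NODES = 1000  # hypothetical limit; the actual TIGERSearch bound is not stated here


class SentenceGraph:
    """Toy annotation graph: labelled nodes plus directed edges parent -> child."""

    def __init__(self):
        self.labels = {}                   # node id -> label (e.g. "NP" or a word)
        self.children = defaultdict(list)  # node id -> list of child ids

    def add_node(self, node_id, label):
        if node_id not in self.labels and len(self.labels) >= MAX_NODES:
            raise ValueError("node limit exceeded")
        self.labels[node_id] = label

    def add_edge(self, parent, child):
        # Reject an edge that would close a cycle: the child must not
        # already reach the parent.
        if parent == child or self._reaches(child, parent):
            raise ValueError(f"edge {parent} -> {child} would create a cycle")
        self.children[parent].append(child)

    def _reaches(self, start, target):
        """Depth-first search: is target reachable from start?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children.get(node, []))
        return False


g = SentenceGraph()
for node_id, label in [("s", "S"), ("np", "NP"), ("vp", "VP"), ("t1", "Peter")]:
    g.add_node(node_id, label)
g.add_edge("s", "np")
g.add_edge("s", "vp")
g.add_edge("np", "t1")
g.add_edge("vp", "t1")  # a second parent is allowed in a DAG, unlike in a tree
```

The point of the sketch is the difference from a plain tree: a node may have more than one parent, but no path may lead from a node back to itself.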

 

  -  Feature records

A node in a TIGERSearch graph structure may be labelled with a feature record. A feature record is a restricted form of the feature-value (or attribute-value) structures known from unification-based grammar formalisms.
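As a rough illustration, a feature record can be pictured as a flat mapping from feature names to values. The sketch below assumes that the restriction consists in values being atomic (no nested structures); the helper `make_feature_record` is hypothetical and not part of TIGERSearch.

```python
def make_feature_record(**features):
    """Build a flat feature record; reject nested (non-atomic) values.

    Assumption for this sketch: every feature value is a plain string.
    """
    for name, value in features.items():
        if not isinstance(value, str):
            raise TypeError(f"feature {name!r} must have an atomic string value")
    return dict(features)


terminal = make_feature_record(word="Abend", pos="NN")  # a word-level node
nonterminal = make_feature_record(cat="NP")             # a phrase-level node
```

A nested value such as `agr={"num": "sg"}`, which a full attribute-value structure would permit, is rejected by this helper.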

 

Linguistically motivated query language

The query language of TIGERSearch is a generalization of the specification language for TIGERSearch data structures. Nodes are described by Boolean expressions over feature-value pairs. Feature values can themselves be either POSIX regular expressions or arbitrary Boolean expressions over constants.

[word="Abend" & pos="NN"]
[word=/Ma.*/ & pos=("NN"|"NE")]
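The semantics of these two node descriptions can be mimicked in a few lines of Python. This is only a sketch of how such Boolean descriptions evaluate against a feature record, not the TIGERSearch implementation. The helper names (`eq`, `rx`, `all_of`, `any_of`) and the sample tokens are invented, Python's `re` module stands in for POSIX regular expressions, and the pattern is assumed to match the whole feature value.

```python
import re


def eq(feature, constant):
    """Feature equals a constant, e.g. pos="NN"."""
    return lambda node: node.get(feature) == constant


def rx(feature, pattern):
    """Feature matches a regular expression, e.g. word=/Ma.*/.

    Assumption: the pattern must cover the whole value.
    """
    compiled = re.compile(pattern)
    return lambda node: (
        feature in node and compiled.fullmatch(node[feature]) is not None
    )


def all_of(*preds):
    """Logical conjunction (&)."""
    return lambda node: all(p(node) for p in preds)


def any_of(*preds):
    """Logical disjunction (|)."""
    return lambda node: any(p(node) for p in preds)


# [word="Abend" & pos="NN"]
q1 = all_of(eq("word", "Abend"), eq("pos", "NN"))
# [word=/Ma.*/ & pos=("NN"|"NE")]
q2 = all_of(rx("word", r"Ma.*"), any_of(eq("pos", "NN"), eq("pos", "NE")))

tokens = [
    {"word": "Abend", "pos": "NN"},
    {"word": "Mainz", "pos": "NE"},
    {"word": "Mann", "pos": "NN"},
    {"word": "Manche", "pos": "PIS"},
    {"word": "machen", "pos": "VVINF"},
]

q1_hits = [t["word"] for t in tokens if q1(t)]
q2_hits = [t["word"] for t in tokens if q2(t)]
print(q1_hits)  # ['Abend']
print(q2_hits)  # ['Mainz', 'Mann']
```

"Manche" satisfies the regular expression but fails the part-of-speech disjunction, while "machen" fails the (case-sensitive) regular expression, so neither is returned by the second description.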

Node descriptions can be combined into node relations using the basic relations of direct precedence (.) and direct dominance (>), as well as a set of derived relations such as underspecified dominance (>*) and siblings ($).

[cat="NP"] > [pos="ART"]

Node relations may be combined by logical conjunction and disjunction. In order to be able to state that a node takes part in several node relations, logical variables can be used:

  (#n1:[cat="NP"] >* [pos="ART"]) & (#n1 > [pos="NN"])
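The role of the variable #n1 in the query above can be sketched in Python: the same node must satisfy both relations. The toy graph and all names below are invented for illustration, and >* is read here as "dominates via a path of one or more edges", which is an assumption about its exact definition.

```python
# Toy graph: node id -> (feature record, list of child ids). All data is invented.
nodes = {
    "np1": ({"cat": "NP"}, ["t1", "t2"]),
    "t1": ({"word": "der", "pos": "ART"}, []),
    "t2": ({"word": "Abend", "pos": "NN"}, []),
    "pp1": ({"cat": "PP"}, ["t3", "np2"]),
    "t3": ({"word": "in", "pos": "APPR"}, []),
    "np2": ({"cat": "NP"}, ["t4"]),
    "t4": ({"word": "Mainz", "pos": "NE"}, []),
}


def features(node_id):
    return nodes[node_id][0]


def children(node_id):
    """Direct dominance (>): the immediate children of a node."""
    return nodes[node_id][1]


def descendants(node_id):
    """Assumed reading of >*: all nodes reachable by one or more edges."""
    result, stack = [], list(children(node_id))
    while stack:
        current = stack.pop()
        if current not in result:  # shared nodes in a DAG are visited once
            result.append(current)
            stack.extend(children(current))
    return result


def query():
    """(#n1:[cat="NP"] >* [pos="ART"]) & (#n1 > [pos="NN"])"""
    hits = []
    for n1 in nodes:  # n1 plays the role of the variable #n1
        if features(n1).get("cat") != "NP":
            continue
        dominates_art = any(
            features(d).get("pos") == "ART" for d in descendants(n1)
        )
        has_nn_child = any(
            features(c).get("pos") == "NN" for c in children(n1)
        )
        if dominates_art and has_nn_child:
            hits.append(n1)
    return hits


print(query())  # ['np1']
```

Without the shared variable, the two conjuncts could be satisfied by two different NP nodes; binding both to `n1` is what ties the conditions to one node.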

Furthermore, hierarchical type definitions can be used to give structure to the corpus nomenclature, and template definitions help to break down complex queries into reusable modules.

 

Graphical user interface

  -  Query formulation

To use all the features of the software, new TIGERSearch users first have to learn the TIGERSearch query language. However, learning a formal language takes time and patience. We have therefore also developed a graphical query input which lets you draw queries in an intuitive way.

Queries can thus either be drawn in the graphical query mode or written in text form. Both modes offer menu-based support, for example by listing the available feature names and feature values.

 

  -  Convenient browsing and export of results

The results of a query can be explored with the help of the TIGERSearch GraphViewer. You can navigate through the matching sentences and matching subgraphs and export your favorite matches to various graphics formats (SVG, TIF, JPG, etc.). The export of animated SVG images makes query results available for off-line browsing in a web browser, without access to TIGERSearch.

In addition, the TIGERSearch tool itself allows you to export the query results in the TIGER-XML format. If you prefer a different format, you can use XSLT stylesheets for transformation. Basic tools to aggregate query results, e.g. into frequency tables, are available as well.
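Exported results can also be post-processed outside TIGERSearch. The following Python sketch builds a small frequency table from XML using only the standard library. The sample document is a simplified, hand-written fragment modelled on TIGER-XML (terminals as `t` elements, nonterminals as `nt` elements); a real export contains more information, so element and attribute names should be checked against the actual file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, invented fragment in the style of TIGER-XML (an assumption,
# not a complete or validated export).
SAMPLE = """\
<corpus>
  <body>
    <s id="s1">
      <graph root="s1_500">
        <terminals>
          <t id="s1_1" word="der" pos="ART"/>
          <t id="s1_2" word="Abend" pos="NN"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="NP">
            <edge label="NK" idref="s1_1"/>
            <edge label="NK" idref="s1_2"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
    <s id="s2">
      <graph root="s2_501">
        <terminals>
          <t id="s2_1" word="ein" pos="ART"/>
          <t id="s2_2" word="Abend" pos="NN"/>
          <t id="s2_3" word="in" pos="APPR"/>
          <t id="s2_4" word="Mainz" pos="NE"/>
        </terminals>
        <nonterminals>
          <nt id="s2_500" cat="PP">
            <edge label="AC" idref="s2_3"/>
            <edge label="NK" idref="s2_4"/>
          </nt>
          <nt id="s2_501" cat="NP">
            <edge label="NK" idref="s2_1"/>
            <edge label="NK" idref="s2_2"/>
            <edge label="MNR" idref="s2_500"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>
"""

root = ET.fromstring(SAMPLE)

# Count part-of-speech tags over all terminal nodes.
pos_counts = Counter(t.get("pos") for t in root.iter("t"))
# Count phrase categories over all nonterminal nodes.
cat_counts = Counter(nt.get("cat") for nt in root.iter("nt"))

for tag, count in pos_counts.most_common():
    print(f"{tag}\t{count}")
```

For an exported file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`; the counting logic stays the same.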

 

Easy software installation

The standard installation of TIGERSearch requires just a few mouse clicks.

 

Corpus administration GUI, XML-based data import

The TIGERSearch software suite includes TIGERRegistry, a graphical user interface for corpus administration. With the help of TIGERRegistry, you can prepare new corpora as TIGERSearch corpora, delete existing corpora, or reorganize your corpus folders. TIGERSearch includes filters which convert various corpus formats to the TIGER-XML format. If your specific corpus format is not yet covered by the existing input filters, you will have to write your own corpus conversion script.

 

Availability for all major platforms

TIGERSearch is available for the following operating systems:

  -  Microsoft Windows
  -  Linux
  -  Solaris Sparc
  -  Mac OS X

Please check the system requirements page for the individual platform requirements.