8. Viewing the matches - Statistics

8.1 Introduction

The statistical viewer has been developed as a specialized view on the match results. Users can export their favourite matches as tables. We support a text-based output format, a proprietary XML-based format, and the Excel format.

This section illustrates the statistical viewer with the results of the following TIGERSearch query:

#np:[cat="NP"] &
#art:[pos="ART"] &
#adj:[pos="ADJA"] &
#nn:[pos="NN"] &
#np > #art &
#np > #adj &
#np > #nn &
arity(#np,3)

Please note: To identify nodes within the statistical viewer in a unique way, every node must be labelled, i.e. it must be identified by a node variable (e.g. #art).

To show the statistical viewer, press the Statistics button in the button toolbar of the TIGERSearch main window, or select the Statistics item in the Query menu.

Please note: You can easily switch between the TIGERSearch main window, the GraphViewer window, and the statistical viewer using the shortcut buttons in the lower left corner of the three windows. If you press a shortcut button, the corresponding window will be moved in front of all other windows on your desktop.

8.2 Specifying nodes and features

First of all, you have to specify the nodes and node features you are interested in. These are selected in the first two rows of the statistics table. Select the node in the first row and the node feature in the second row (cf. screenshot). If you would like to display more than one node feature, you can add more node feature columns by clicking the Add button or selecting the Add column item in the context menu of the feature columns. The Remove and Clear buttons (and their corresponding items in the context menus) can be used to delete a single column or to delete all currently used columns, respectively.


Figure: Specifying nodes and features

You can also choose the default arrangement of the nodes and features by pressing the Default button in the button toolbar. All terminal nodes specified in the query will be presented as columns, and the default feature (usually the word feature) will be used.

Next, you have to build the table. Press one of the two Build buttons in the upper left or lower right corner, respectively. The statistics table will be filled with feature value information.


Figure: Building the table

The two leftmost columns of the table show the corpus graph ID and the number of the current submatch (remember that a query can be matched by a corpus graph more than once). By default, the rows follow the ordering of the corpus graphs. If you would like to change the sort sequence of the rows, consult subsection 8.4.

To adapt the table layout, you can also change the width of a column or change the column ordering with the mouse.

8.3 Corpus view and Frequency view

The default view of the statistical viewer is the Corpus view, i.e. the rows are ordered with respect to the corpus graph ordering. However, to analyze the data it is sometimes helpful to group identical rows and display their frequency. To switch to the Frequency view, click on the Frequency button.


Figure: The frequency view

Now the left column shows the number of occurrences of each row displayed in the table. The rows are ordered by frequency.
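The grouping performed by the Frequency view can be pictured with a minimal Python sketch. It is not TIGERSearch code: the rows, their values, and the function name frequency_view are hypothetical, and the sketch only assumes that identical feature-value rows are counted once the graph ID and submatch columns are set aside.

```python
from collections import Counter

# Hypothetical Corpus view rows:
# (graph ID, submatch number, #art word, #adj word, #nn word)
corpus_rows = [
    ("s1", 1, "der", "alte", "Mann"),
    ("s2", 1, "die", "neue", "Regierung"),
    ("s5", 1, "der", "alte", "Mann"),
    ("s5", 2, "die", "neue", "Regierung"),
    ("s9", 1, "der", "alte", "Mann"),
]


def frequency_view(rows):
    """Group identical feature-value rows; return (count, values), most frequent first."""
    # Drop the graph ID and submatch number so that only feature values are compared.
    counts = Counter(row[2:] for row in rows)
    return [(count, values) for values, count in counts.most_common()]


freq_rows = frequency_view(corpus_rows)
for count, values in freq_rows:
    print(count, *values, sep="\t")
```

Each output line starts with the frequency, followed by the feature values of the grouped row.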

8.4 Changing the sort sequence

The ordering of the table rows in both Corpus view and Frequency view can be changed easily. Double-click on the upper border of the column you would like to use as the sort sequence column, or select the Sort by column item in the context menu of the column headline (activated by a right mouse button click on the column headline). In the following example, the second column has been selected. If you double-click the column again, the rows are sorted in reverse order.


Figure: Changing the sort sequence

If you select the Sort by column item in the context menu of the GraphID or Submatch column within the Frequency view, the default ordering of the table will be restored.
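As a rough illustration of sorting by a column and reversing the order, consider the following Python sketch. The rows and the helper sort_by_column are hypothetical, and the comparison is Python's default string ordering, which may differ from the collation the statistical viewer actually applies.

```python
# Hypothetical table rows:
# (graph ID, submatch number, #art word, #adj word, #nn word)
rows = [
    ("s1", 1, "der", "alte", "Mann"),
    ("s2", 1, "die", "neue", "Regierung"),
    ("s3", 1, "das", "kleine", "Haus"),
]


def sort_by_column(rows, column, reverse=False):
    """Return the rows sorted by the values in the given column index."""
    return sorted(rows, key=lambda row: row[column], reverse=reverse)


# First selection of the column: ascending order.
ascending = sort_by_column(rows, 3)
# Selecting the same column again: reverse order.
descending = sort_by_column(rows, 3, reverse=True)
```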

8.5 Communicating with the GraphViewer

If you are simultaneously working with the GraphViewer and the Statistical Viewer, you can easily switch between these two views using the shortcut icons in the lower left corner of both windows.

However, the two windows have also been designed to communicate with each other. To display and highlight a match from the statistical viewer in the GraphViewer, double-click the corresponding row (the mouse pointer must be on the graph ID or submatch column) or select the Open match in GraphViewer item in the context menu of the row (activated by a right mouse button click on the graph ID or submatch column).

8.6 Exporting the results

To export the statistics, first mark the rows you want to export. Use the mouse to mark the rows or select the Mark all item in the context menu. To deselect the marked rows, select the Clear selection item in the context menu. Then click the Export button to display the export dialogue window.


Figure: Exporting the results

We have implemented three export formats:

Text format

Columns are separated by tabs; rows are separated by carriage returns.

XML format

The data is exported in an XML-based format that can be used for further processing.

Excel format

The data is exported as a Microsoft Excel table.

As an alternative, you can also copy the text format output into the clipboard. Just select the Copy button in the button toolbar.
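For further processing outside TIGERSearch, the text format can be read back with a few lines of Python. The sketch below is only an assumption-laden example: the sample string, its column layout, and the function name parse_text_export are hypothetical, and whether an exported file contains a header row or the graph ID columns is not specified here. It relies solely on the stated convention of tab-separated columns and one row per line.

```python
def parse_text_export(text):
    """Split tab-separated export text into a list of rows (lists of strings)."""
    # str.splitlines() accepts "\r", "\n" and "\r\n" as row separators,
    # so the exact line-ending convention of the export does not matter.
    return [line.split("\t") for line in text.splitlines() if line]


# Hypothetical export content: graph ID, submatch number, three feature values.
sample = "s1\t1\tder\talte\tMann\r\ns5\t2\tdie\tneue\tRegierung\r\n"
table = parse_text_export(sample)

for row in table:
    print(row)
```

The same function can be applied to text pasted from the clipboard after using the Copy button, since that output uses the text format as well.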