2. Loading a corpus

2.1 Selecting a corpus

TIGERSearch corpora are organized in a hierarchical file system, i.e. related corpora are grouped in folders. To see the available corpora, select the Open tab in the information panel. You can then browse through the corpus tree (upper part of the tab), inspect the corpus properties (lower part), and finally load a corpus. The corpus properties are displayed when you select a corpus symbol:

Figure: The TIGERSearch corpus tree

To load a corpus, double-click the respective corpus symbol or select the Open selected corpus item in the context menu (opened by right-clicking the corpus symbol). The loading process can be aborted at any time by pressing the Cancel button in the corpus loading progress window:

Figure: Corpus loading progress window

Corpus hotkey

An alternative way to load a corpus is the so-called corpus hotkey. This is a short list of the corpora most recently opened by the user (up to 15 corpora). You can view the list by pressing the hotkey button in the upper-left corner of the main window (cf. the following screenshot). If you select a corpus in this list, the corpus loading process starts immediately.

Figure: Corpus hotkey
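
The behaviour of such a recently-opened list can be illustrated with a minimal sketch. This is not TIGERSearch's own code: the class name RecentCorpora, the most-recent-first ordering, and the handling of duplicates are assumptions made for illustration. Only the limit of 15 entries is taken from the description above.

```python
MAX_RECENT = 15  # the manual states that the list holds up to 15 corpora


class RecentCorpora:
    """Hypothetical most-recently-opened list (illustration only)."""

    def __init__(self, capacity=MAX_RECENT):
        self.capacity = capacity
        self._items = []  # most recently opened corpus first (assumed order)

    def opened(self, corpus_id):
        # Assumption: a corpus already in the list moves to the front
        # instead of being listed twice.
        if corpus_id in self._items:
            self._items.remove(corpus_id)
        self._items.insert(0, corpus_id)
        # Drop the oldest entries once the capacity is exceeded.
        del self._items[self.capacity:]

    def entries(self):
        # Return a copy so callers cannot modify the internal list.
        return list(self._items)
```

Calling opened() each time a corpus is loaded keeps the list bounded, and entries() yields what a hotkey menu of this kind would display.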

(Un)Successful corpus loading

If the corpus has been successfully loaded, there will be up to two additional corpus information tabs in the information panel which are described in the following subsections (corpus bookmarks and corpus templates). The currently loaded corpus is also indicated by the corpus hotkey and by the title bar of the main window.

If there are any problems during the corpus loading process, all warning messages are stored for inspection. These messages can be displayed by clicking on the warning symbol in the corpus information tab. In the case of corpus loading problems, please check the corpus configuration in the TIGERRegistry tool.

Corpus autoload

Another useful feature is the so-called corpus autoload. If this feature is activated (cf. Preferences item in the Options menu), the corpus that was open when you left the tool is loaded automatically the next time the tool is started. By default, the autoload feature is activated.

2.2 Corpus documentation

After corpus loading, the corpus information tab is activated. It includes the following pieces of information which are presented as groups within an information tree (cf. screenshots):

General documentation

The first group comprises general corpus documentation. Pressing one of the group icons displays the corresponding document in the lower part of the tab. The so-called Summary view contains some meta information about the corpus which has been specified in the corpus process (cf. subsection 4.2, chapter VI). The Detailed view also lists all corpus features and their corresponding feature values used in the corpus. Both information pages can be printed by selecting the Print Current Documentation item in the context menu, which is opened by right-clicking in the lower left information panel.

If the loaded corpus comprises corpus bookmarks or corpus templates, corresponding indication icons are also placed in the documentation group (cf. screenshot above). Corpus bookmarks and corpus templates are described in subsection 2.3 and subsection 2.4, respectively.

Edge labels

The second group contains documentation about the optional edge labels and secondary edge labels. All (secondary) edge labels are listed, each with an optional short description. If you type a character, the cursor jumps to the first item which begins with this character. If you double-click a (secondary) edge label, it is copied into the corpus query editor:

Figure: Edge labels / Secondary edge labels

Features

The third and fourth groups comprise the nonterminal and terminal features of the corpus. All feature values are listed, each with an optional description. If you type a character, the cursor jumps to the first item which begins with this character. If you double-click a feature value, the corresponding feature-value pair is copied into the corpus query editor:

Figure: Corpus features
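
The two interactions described above, jumping to the first item that begins with a typed character and turning a selected value into a feature-value pair, can be sketched as follows. The function names and the sample tag list are hypothetical, and the case-sensitive matching is an assumption. The bracketed form [pos="NN"] follows the feature constraint notation of the TIGER query language; the exact text that TIGERSearch places in the query editor may differ.

```python
def first_match(items, char):
    """Return the index of the first item starting with char, or None."""
    # Assumption: matching is case-sensitive and follows list order.
    for index, item in enumerate(items):
        if item.startswith(char):
            return index
    return None


def feature_value_pair(feature, value):
    """Format a feature-value pair as a TIGER-style feature constraint."""
    return f'[{feature}="{value}"]'


# Hypothetical excerpt of the values documented for a feature "pos".
pos_values = ["ADJA", "ADV", "NE", "NN", "VVFIN"]

index = first_match(pos_values, "N")  # cursor position after typing "N"
if index is not None:
    pair = feature_value_pair("pos", pos_values[index])
```

Here typing "N" selects the first value beginning with that letter, and the formatted pair stands in for what a double-click would hand to the query editor.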

If a type system has been defined for a corpus feature (cf. section 8, chapter III), it is also documented. Just click on the type icon which is placed under the corpus feature icon (cf. screenshot below). The type system is presented as a type tree. If you click on a type symbol or on a feature value, the corresponding feature-value / feature-type pair is copied into the corpus query editor:

Figure: Feature types
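
To make the idea of a type tree concrete, the following sketch models one as a nested dictionary and collects all feature values that fall under a given type. The tree contents, the type names, and the function names are purely illustrative assumptions; they are not taken from any actual corpus definition or from the TIGERSearch implementation.

```python
# Hypothetical type tree for a feature "pos": inner nodes are types,
# leaves (marked with None) are feature values.
TYPE_TREE = {
    "noun": {"NN": None, "NE": None},
    "verb": {"finite": {"VVFIN": None, "VAFIN": None}, "VVINF": None},
}


def leaves(name, subtree):
    """Return all feature values at or below the given node."""
    if subtree is None:  # a leaf is itself a feature value
        return [name]
    result = []
    for child, grandchildren in subtree.items():
        result.extend(leaves(child, grandchildren))
    return result


def values_of(tree, type_name):
    """Return the feature values covered by type_name ([] if unknown)."""
    for name, subtree in tree.items():
        if subtree is None:  # feature values are not types
            continue
        if name == type_name:
            return leaves(name, subtree)
        found = values_of(subtree, type_name)
        if found:
            return found
    return []
```

Under these assumptions, values_of(TYPE_TREE, "verb") gathers the values of the nested type "finite" as well as the direct value "VVINF", which mirrors how a type in the tree stands for every feature value below it.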

2.3 Corpus bookmarks

The Bookmarks tab contains the user's favourite queries. Queries can be saved as bookmarks in the query editor (cf. subsection 3.4). In the TIGERRegistry tool, such bookmarks can be linked to a corpus as so-called corpus bookmarks (cf. subsection 4.4, chapter VI). When the user opens such a corpus, the predefined bookmarks are available in the Bookmarks tab (cf. screenshot below).

To distinguish corpus bookmarks from the user's own bookmarks, corpus bookmarks are displayed in green. To see the bookmark's name, the corpus used by the bookmark query, and the bookmark query itself, press the bookmark icon. To copy a bookmark query into the query editor, just double-click the bookmark icon:

Figure: Corpus bookmarks

2.4 Corpus templates

If templates have been declared for the corpus, they are presented in the Templates tab. The upper part of the tab lists all templates. If you press a template icon, the corresponding template name, template path, and template definition are displayed. To copy the template call into the query editor, just double-click the template icon:

Figure: Corpus templates

2.5 Exploring the corpus

After corpus loading, the corpus exploration button in the toolbar (leftmost button, right neighbour of the corpus hotkey) is activated. Pressing this button opens the TIGERGraphViewer, which visualizes the corpus graphs (cf. section 7). This lets you browse through the corpus without processing corpus queries, which is very helpful if you are not familiar with a corpus and its annotations.

Please note: You can easily switch between the TIGERSearch main window and the GraphViewer window using the shortcut buttons in the lower left corner of both windows. If you press a shortcut button, the corresponding window is brought to the front of all other windows on your desktop.

Figure: Exploration of the corpus