A common problem for applications such as TIGERSearch is the keyboard input of characters that are not included in the ISO-Latin-1 character set. If you are working with a corpus that makes use of such characters, you should consider the three alternatives described below.
Please note: Typing in Unicode characters implies that Unicode characters can be displayed
(rendered) by the software. Thus, one of the Unicode fonts supported by TIGERSearch
must be installed on your system. Please consult
section 3, chapter II for instructions.
Unicode encoding
The first alternative is to type in the hexadecimal Unicode encoding of the character. For example, the Greek capital letter Omega is represented by \u03a9. Once you have typed in the Unicode encoding, select the Expand Unicode Encodings option in the Input Help menu of the context menu to expand it:
Figure: Expanding Unicode encodings
The Unicode encoding will be replaced by its corresponding character (cf. screenshot below). Please remember that a Unicode font must be installed to render the character properly.
Figure: Expansion of Unicode encodings
If you frequently work with corpora that use characters outside the ISO-Latin-1 character set, you should activate the Expand automatically option in the Input Help menu of the context menu.
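As a rough illustration of what the expansion step does, the following Python sketch replaces every \uXXXX sequence in a string with the character it encodes. It is not the TIGERSearch implementation: the function name expand_unicode_encodings is hypothetical, and the sketch assumes that an encoding always consists of exactly four hexadecimal digits, as in the \u03a9 example above.

```python
import re

# A backslash, the letter u, and exactly four hexadecimal digits (assumption:
# encodings always have this shape, as in the Omega example).
UNICODE_ENCODING = re.compile(r"\\u([0-9a-fA-F]{4})")


def expand_unicode_encodings(text):
    # Hypothetical helper: convert the four hex digits of each match to a
    # code point and substitute the corresponding character.
    return UNICODE_ENCODING.sub(lambda m: chr(int(m.group(1), 16)), text)


# The raw string keeps the backslash literal, as if typed into the editor.
print(expand_unicode_encodings(r"\u03a9"))
```

Text that contains no encoding is returned unchanged, so the function can be applied to a complete input string.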
Input help (operating system)
On many platforms, specialized tools have been developed for typing in characters outside the ISO-Latin-1 character set. These tools are usually called input methods. Since Greek characters, for example, do not exist on a German keyboard, these characters are typed in as abbreviations. For example, the string Omega might be used as an abbreviation for the corresponding Greek character, and the abbreviation is expanded automatically as soon as it has been typed in. Please consult the manual of your operating system to find out which tools are available for your platform.
Input help (TIGERSearch)
In the TIGERSearch project we have implemented specialized input methods for 16 European languages, which can be used in the TIGERSearch query editor (cf. subsection 3.2, chapter II). To activate the TIGERSearch input methods, click the upper left corner of the TIGERSearch window (Windows: click the tiger icon) and select the last option in the menu that opens (usually called Choose input method).
The following screenshot shows how the input method is activated on a German Windows platform. The display will look similar on other platforms.
Figure: Activating the TIGERSearch input methods (1)
Now you are asked to choose one of the supported European languages. In the following screenshot, the Greek language (modern) is chosen:
Figure: Activating the TIGERSearch input methods (2)
The input method mode has been activated. A small status window is placed in the lower right corner of the screen. This window shows which language has been chosen and whether the input method is activated or deactivated:
Figure: Input method status window
To select a different language, you can either repeat the input method activation procedure described above or switch between the languages using the F7 key. To activate or deactivate the current input method, press the F8 key.
Please note: To deactivate the TIGERSearch input methods (especially to remove the input method
status window), start the input method selection procedure again, but choose
system input methods in the input method menu.
How is the input method used in the query editor? All characters that are not included in the ISO-Latin-1 character set are represented by special abbreviations. To allow Latin characters and special characters to be typed in side by side in one mode, we have adopted the encoding conventions of the LaTeX system. For example, the German character ä is represented as \"a, which is its LaTeX encoding. So if you have chosen the German keyboard mapper and you type in the character sequence \"a, it will be automatically expanded to ä by the TIGERSearch input help system.
Please note: Of course, all German characters are included in the ISO-Latin-1 character set.
However, German special characters (ä, ö, ü, ß) can only be typed in directly on keyboards
manufactured for the German market. Otherwise,
an input method for the German language is necessary in order to work with German
treebanks such as the TIGER treebank.
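The abbreviation scheme can be sketched in Python as a simple table lookup. This is only an illustration under stated assumptions, not the TIGERSearch input method itself, which works interactively on keystrokes. Only the entry for \"a is taken from the text above. The other three entries follow standard LaTeX conventions and are assumptions about the German mapping table, and the names GERMAN_ABBREVIATIONS and expand_abbreviations are hypothetical.

```python
# Only the \"a entry comes from the manual text. The remaining entries follow
# standard LaTeX conventions and are assumptions about the German table.
GERMAN_ABBREVIATIONS = {
    '\\"a': "ä",
    '\\"o': "ö",
    '\\"u': "ü",
    "\\ss": "ß",
}


def expand_abbreviations(text, table):
    # Hypothetical helper: replace longer abbreviations first so that no
    # abbreviation is shadowed by a shorter one sharing the same prefix.
    for abbreviation in sorted(table, key=len, reverse=True):
        text = text.replace(abbreviation, table[abbreviation])
    return text


# The Python literal 'M\\"adchen' stands for the typed sequence M\"adchen.
print(expand_abbreviations('M\\"adchen', GERMAN_ABBREVIATIONS))  # Mädchen
```

Because every abbreviation starts with a backslash, ordinary Latin characters pass through untouched, which is what makes side-by-side input possible.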
For languages such as Greek, which comprise many special characters, a side-by-side usage of Latin and Greek characters is not possible. In this case, most Greek characters are represented by Latin characters. For example, the capital letter Omega is represented by the Latin character V. So if you type in V in the query editor, this input is automatically expanded to the capital letter Omega. The following screenshot illustrates how Greek characters are typed in:
Figure: Typing in Greek characters
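In contrast to the abbreviation scheme, the Greek mode amounts to a direct key-by-key replacement. A minimal Python sketch, assuming a plain character-to-character table, is given here. Only the pair V and capital Omega is taken from the text, the names GREEK_KEY_MAP and map_keys are hypothetical, and a complete table would need one entry for every remapped key.

```python
# Only V -> capital Omega (U+03A9) comes from the manual text. A complete
# table would contain one entry per remapped Latin key.
GREEK_KEY_MAP = str.maketrans({"V": "\u03a9"})


def map_keys(text, key_map=GREEK_KEY_MAP):
    # Hypothetical helper: characters without an entry in the table are
    # passed through unchanged.
    return text.translate(key_map)


print(map_keys("V"))
```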
The mapping tables for the 16 supported European languages can be found in the
file europe.pdf, which is located in the
doc/pdf/ subdirectory of your TIGERSearch installation.
It can also be downloaded from the TIGERSearch homepage
(cf. http://www.tigersearch.de).