3.1 Unicode fonts

Because the TIGERSearch software suite is implemented entirely in Java, it can process corpora encoded in Unicode. Corpus import uses an XML-based approach that can read any Unicode character (cf. chapter V). The query processor also handles Unicode characters (cf. subsection 3.3, chapter IV).

As platform-independent software, the TIGERSearch software suite cannot analyze the font configuration of the user's platform to detect an appropriate Unicode font automatically. Therefore, only the following two popular Unicode fonts are supported by the software:

Arial Unicode MS (Arialuni.ttf, 52,000 characters; 23 MB installed)

This font package is not freely available. However, it is included in the following commercial software packages: Microsoft Windows XP, Microsoft Office 2000, and Microsoft Publisher. You can check in the Windows Control Panel whether the font package is already installed on your computer. If it is not, you will find the package on the CD-ROM of your commercial software.

Bitstream Cyberbit (Cyberbit.ttf, 30,000 characters; 12.5 MB installed)

This font package is freely available on the following web page: ftp://ftp.netscape.com/pub/communicator/extras/fonts/windows/. Please pay attention to the license agreement of the font package.

If one of these two font packages is installed on your system, the TIGERSearch software will automatically detect and use it. Please consult the manual of your operating system for instructions on installing font packages on your computer.
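
To check for yourself whether Java can see one of the two fonts, a small program along the following lines may help. It is only a sketch and not the detection code used by TIGERSearch: the class name UnicodeFontCheck and the method firstSupported are hypothetical, and the family names "Arial Unicode MS" and "Bitstream Cyberbit" are assumed to be the names under which the fonts are reported on your system.

```java
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class UnicodeFontCheck {

    // Assumed family names of the two supported fonts; the names reported
    // by your platform may differ.
    static final List<String> PREFERRED =
            Arrays.asList("Arial Unicode MS", "Bitstream Cyberbit");

    // Returns the first installed family that matches a preferred name
    // (case-insensitive), in the order given by PREFERRED.
    static Optional<String> firstSupported(List<String> installedFamilies) {
        for (String name : PREFERRED) {
            for (String installed : installedFamilies) {
                if (installed.equalsIgnoreCase(name)) {
                    return Optional.of(installed);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Ask the Java runtime for all font families it can currently use.
        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        Optional<String> found = firstSupported(Arrays.asList(families));
        System.out.println(found
                .map(f -> "Found supported Unicode font: " + f)
                .orElse("Neither supported Unicode font is visible to Java."));
    }
}
```

Run the program with the same Java runtime that TIGERSearch uses; a different runtime may have a different set of fonts available.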

Please note: If you do not have the user rights to install a font package on your system, but the TIGERSearch system is already installed on your computer, there is a workaround that makes the font package available to the TIGERSearch software only: copy the font file (suffix .ttf) to the following subdirectory of the TIGERSearch installation directory: jre/lib/fonts/
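
The copy step of this workaround can also be scripted. The following is a minimal sketch under stated assumptions: the class name LocalFontInstall and the method copyFont are hypothetical, both paths are passed on the command line, and the jre/lib/fonts subdirectory is expected to exist already in the installation directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

public class LocalFontInstall {

    // Copies a .ttf file into <installDir>/jre/lib/fonts and returns the target path.
    static Path copyFont(Path fontFile, Path installDir) throws IOException {
        String fileName = fontFile.getFileName().toString();
        if (!fileName.toLowerCase(Locale.ROOT).endsWith(".ttf")) {
            throw new IllegalArgumentException("Not a .ttf font file: " + fileName);
        }
        Path fontsDir = installDir.resolve("jre").resolve("lib").resolve("fonts");
        if (!Files.isDirectory(fontsDir)) {
            // The sketch does not create the directory; it should already exist.
            throw new IOException("Font directory not found: " + fontsDir);
        }
        // Overwrite an older copy of the same font file if one is present.
        return Files.copy(fontFile, fontsDir.resolve(fileName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java LocalFontInstall <font.ttf> <installation directory>");
            return;
        }
        Path target = copyFont(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied font to " + target);
    }
}
```

Copying the file by hand with your file manager has the same effect; the program only adds checks on the file suffix and the target directory.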