RFTagger

A tool for the annotation of text with fine-grained part-of-speech tags


Type: Tool
Author: Helmut Schmid and Florian Laws
Description

The RFTagger is a tool for the annotation of text with fine-grained part-of-speech tags. It has been trained on German, Czech, Slovene, Slovak, and Hungarian data.

The tagger is described in the following paper:

Helmut Schmid and Florian Laws: "Estimation of Conditional Probabilities with Decision Trees and an Application to Fine-Grained POS Tagging" (pdf)

Here is some sample output:

word      part of speech
Das       PRO.Dem.Subst.-3.Nom.Sg.Neut
ist       VFIN.Sein.3.Sg.Pres.Ind
ein       ART.Indef.Nom.Sg.Masc
Testsatz  N.Reg.Nom.Sg.Masc
.         SYM.Pun.Sent
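
The sample output above can be broken into its parts with a few lines of Python. This is only a minimal sketch, not part of the RFTagger distribution: it assumes that each output line holds a word and a tag separated by whitespace, and that the dot-separated tag starts with the main category followed by its morphological attributes. The function name `parse_tagged_line` is hypothetical.

```python
def parse_tagged_line(line):
    """Split one output line into (word, main category, attribute list).

    Assumes the layout shown in the sample output: the word, then
    whitespace, then a dot-separated tag without internal spaces.
    """
    # Split on the last run of whitespace so the tag is always the final field.
    word, tag = line.rsplit(maxsplit=1)
    # The first dot-separated field is taken as the main category (assumption).
    category, *attributes = tag.split(".")
    return word, category, attributes


sample = """\
Das  PRO.Dem.Subst.-3.Nom.Sg.Neut
ist  VFIN.Sein.3.Sg.Pres.Ind
ein  ART.Indef.Nom.Sg.Masc
Testsatz  N.Reg.Nom.Sg.Masc"""

for line in sample.splitlines():
    word, category, attributes = parse_tagged_line(line)
    print(word, category, attributes)
```

Separating the main category from the attributes makes it easy to map the fine-grained tags back to a coarse tagset or to filter tokens by a single attribute such as case or number.
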
Reference

Please cite the following publication if you want to refer to the RFTagger:

Helmut Schmid and Florian Laws: Estimation of Conditional Probabilities with Decision Trees and an Application to Fine-Grained POS Tagging, COLING 2008, Manchester, Great Britain. (pdf)

Download

The source code of the RFTagger can be downloaded here. The package includes Linux binaries and parameter files for German, Czech, Slovene, Slovak, and Hungarian (Linux PCs only). It is freely available for education, research, and other non-commercial purposes.

 
