University of Stuttgart
Institute for Natural Language Processing

SFST - Stuttgart Finite State Transducer Tools

 
 

What is SFST?

SFST is a toolbox for the implementation of morphological analysers and other tools which are based on finite state transducer technology.
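
To make the idea of a transducer-based morphological analyser concrete, here is a minimal sketch in Python. It does not use SFST or its programming language. The class `ToyTransducer`, the function `build_cat_analyser`, and the tags `<N>`, `<sg>` and `<pl>` are all made up for this illustration. The sketch only shows how a transducer maps a surface form to an analysis string by following arcs labelled with surface:analysis symbol pairs.

```python
from collections import defaultdict

EPS = ""  # the empty symbol: an arc that consumes no input


class ToyTransducer:
    """A tiny hand-built transducer (illustrative only, not the SFST API)."""

    def __init__(self):
        self.arcs = defaultdict(list)  # state -> [(surface, analysis, next_state)]
        self.finals = set()

    def add_arc(self, src, surface, analysis, dst):
        self.arcs[src].append((surface, analysis, dst))

    def analyse(self, word, start=0):
        """Return all analysis strings for `word`, in sorted order.

        Assumes the transducer has no cycles of EPS-input arcs;
        otherwise this simple search would not terminate.
        """
        results = []
        stack = [(start, 0, "")]  # (state, input position, output so far)
        while stack:
            state, pos, out = stack.pop()
            if pos == len(word) and state in self.finals:
                results.append(out)
            for surface, analysis, dst in self.arcs[state]:
                if surface == EPS:
                    stack.append((dst, pos, out + analysis))
                elif word.startswith(surface, pos):
                    stack.append((dst, pos + len(surface), out + analysis))
        return sorted(results)


def build_cat_analyser():
    """Build a transducer that analyses only 'cat' and 'cats'."""
    t = ToyTransducer()
    for i, ch in enumerate("cat"):
        t.add_arc(i, ch, ch, i + 1)   # the stem is copied unchanged
    t.add_arc(3, EPS, "<N>", 4)       # part-of-speech tag, no surface symbol
    t.add_arc(4, EPS, "<sg>", 5)      # singular: nothing on the surface
    t.add_arc(4, "s", "<pl>", 5)      # plural: surface "s" maps to <pl>
    t.finals.add(5)
    return t


if __name__ == "__main__":
    analyser = build_cat_analyser()
    for word in ("cat", "cats", "dog"):
        print(word, analyser.analyse(word))
```

A real analyser would be compiled from a lexicon and rules rather than built arc by arc, and would be minimised afterwards. The sketch leaves all of that out.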

The SFST tools comprise

  • a compiler which translates transducer programs into minimised transducers
  • interactive and batch-mode analysis programs
  • tools for comparing and printing transducers
  • an efficient C++ transducer library

Features

  • freely available under the GNU General Public License
  • easy to learn for users who are familiar with grep, sed, or Perl
  • efficient implementation in C++
  • supports
    • a wide range of transducer operations
    • UTF-8 character coding
    • weighted transducers (basic functionality only)

Downloads

  • Source code of the SFST tools
    • version 1.4.6g (comments are now optionally allowed in the lexicon, faster fault-tolerant lookup)
    • version 1.4.6a (Improvement of the efficiency of the minimisation and composition operations. Many thanks to Anssi Yli-Jyrä for his valuable suggestions!)
    • version 1.4.4 (Bug related to multi-character symbols in the input was fixed.)
    • version 1.4.3 (Optional replace operations have changed)
    • version 1.4.2 (includes Hopcroft minimisation and other modifications which were jointly developed with the HFST team at Helsinki)
    • version 1.3 (fst-print now produces a different output format which might affect the graphical viewers listed below)
    • version 1.2

  • A short manual (included in the source code package)
  • A tutorial on the implementation of computational morphologies (included in the source code package)

  • A Debian package for SFST (created by Francis Tyers)
  • Software for finding potential errors in your SFST code (created by Eleonora Nagy)

Publications

Please cite the following publication if you want to refer to the SFST tools:

Helmut Schmid: A Programming Language for Finite State Transducers, Proceedings of the 5th International Workshop on Finite State Methods in Natural Language Processing (FSMNLP 2005), Helsinki, Finland. (ps, pdf)

This publication describes the German morphology SMOR:

Helmut Schmid, Arne Fitschen and Ulrich Heid: SMOR: A German Computational Morphology Covering Derivation, Composition, and Inflection, Proceedings of the IVth International Conference on Language Resources and Evaluation (LREC 2004), pp. 1263-1266, Lisbon, Portugal. (pdf)

Relations to other FST Toolkits

There are two projects which aim to extend the functionality of SFST in various ways:
  • Anssi Yli-Jyrä's AFST toolkit is based on SFST
  • The HFST toolkit developed by Krister Lindén, Kimmo Koskenniemi, and colleagues was implemented on top of the three alternative FST libraries SFST, OpenFST, and foma.
See also the contributions by other authors below.

Links

  • Alex Linke provided
    • an interface to the Graphviz tool for the graphical output of transducers.
  • Sebastian Nagel wrote
    • an Emacs mode for editing transducer files and
    • a Perl program which converts SFST transducers to the Graphviz format (similar to that of Alex Linke).
  • Stefan Evert also sent me a Graphviz converter.
  • Matthias Kistler provided a highlighting mode for the VIM editor.
  • Toni Arnold developed
    • a Python interface for the SFST library and
    • Emores, an Empirical MOrphological REaSoning engine for the automatic acquisition of lemmas from a word list.
  • Marius L. Jøhndal created a Ruby interface for the SFST library.


Please send comments, suggestions and bug reports to Helmut Schmid at FirstName.LastName@ims.uni-stuttgart.de. (Insert the name into the email address.)