Project TT in MT

Tree Transducers in Machine Translation

Term
February 2011 - January 2017
PI
Andreas Maletti
Short description

First, we would like to develop an adequate translation model for syntax-based machine translation, together with the basic algorithms that operate on it. This research effort should culminate in a competitive and publicly available toolkit that implements this translation model. Second, we would like to generalize existing machine translation technology to fully support a syntax-based approach. This includes the development of tree-to-tree alignments, tree-based metrics, and syntax-based features. To illustrate the applicability of our results, we will develop a syntax-based translation system based on our toolkit.

Sponsor
German Research Foundation (DFG)
Long description

The advent of syntax-based machine translation of natural languages sparked renewed interest in formal models for tree languages and tree transformations. Tree automata and tree transducers are finite-state models that compute such languages and transformations. Several unweighted and weighted types exist; however, no known model is adequate for the development of an efficient toolkit for syntax-based machine translation. The existing models are either not expressive enough or lack the essential properties (such as closure under composition or preservation of regularity) that are required for use in a toolkit. In this respect, the situation differs dramatically from that of word- or phrase-based translation systems, for which toolkits based on finite-state string transducers are the de facto implementation standard.

The project has two main goals. First, we would like to develop an adequate translation model for syntax-based machine translation, together with the basic algorithms that operate on it. This research effort should culminate in a competitive and publicly available toolkit that implements this translation model. Second, we would like to generalize existing machine translation technology to fully support a syntax-based approach. This includes the development of tree-to-tree alignments, tree-based metrics, and syntax-based features. To illustrate the applicability of our results, we will develop a syntax-based translation system based on our toolkit.

Team
  • Fabienne Braune
  • Aurélie Lagoutte (summer intern 2011)
  • Dr. Andreas Maletti
  • Daniel Quernheim
  • Nina Seemann
Publications
In journals
At conferences
 
