Helmut Schmid
University of Stuttgart
Institute of Natural Language Processing
Theoretical Computational Linguistics
Azenbergstr. 12
D-70174 Stuttgart, Germany
tel.: +49 711 6858 1387
fax: +49 711 6858 1366
email: FirstName.LastName@ims.uni-stuttgart.de
 
 
 
During the winter semester 2010/2011 and the summer semester 2011, I am replacing Prof. Erhard Hinrichs (who is on leave) at the Seminar für Sprachwissenschaft of the University of Tübingen.
 
Research Interests
  Probabilistic and Symbolic NLP, Parsing, POS Tagging, Tokenization, Finite-State Tools, Computational Morphology
 
Publications
  Hassan Sajjad, Nadir Durrani, Helmut Schmid, Alexander Fraser (2011). Comparing Two Techniques for Learning Transliteration Models Using a Parallel Corpus. In Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, to appear.

Nadir Durrani, Helmut Schmid, Alexander Fraser (2011): A Joint Sequence Translation Model with Integrated Reordering, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), Portland, Oregon.

Hassan Sajjad, Alexander Fraser, Helmut Schmid (2011): An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), Portland, Oregon.

Nadir Durrani, Hassan Sajjad, Alexander Fraser, Helmut Schmid (2010): Hindi-to-Urdu Machine Translation Through Transliteration. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pages 465-474, Uppsala, Sweden.

Fabienne Fritzinger, Max Kisselew, Ulrich Heid, Andreas Madsack, Helmut Schmid (2009): Werkzeuge zur Extraktion von signifikanten Wortpaaren als Web Service, in Wolfgang Hoeppner, editor, GSCL-Symposium Sprachtechnologie und eHumanities, Technischer Bericht Nr. 2009-01 Duisburg, Germany.

Hassan Sajjad, Helmut Schmid (2009): Tagging Urdu Text with Parts of Speech: A Tagger Comparison, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Athens, Greece.

Wiebke Wagner, Helmut Schmid, Sabine Schulte im Walde (2009): Verb Sense Disambiguation using a Predicate-Argument-Clustering Model, Proceedings of the CogSci Workshop on Distributional Semantics beyond Concrete Concepts. Amsterdam, The Netherlands, July 2009.

Helmut Schmid, Florian Laws (2008): Estimation of Conditional Probabilities with Decision Trees and an Application to Fine-Grained POS Tagging, COLING 2008, Manchester, Great Britain.

Sabine Schulte im Walde, Christian Hying, Christian Scheible, Helmut Schmid (2008): Combining EM Training and the MDL Principle for an Automatic Verb Classification Incorporating Selectional Preferences, ACL-HLT 2008, Columbus, Ohio.

Helmut Schmid, Bernd Möbius, Julia Weidenkaff (2007): Tagging Syllable Boundaries With Joint N-Gram Models, Interspeech 2007, Antwerp, Belgium.

Vera Demberg, Helmut Schmid, Gregor Möhler (2007): Phonological Constraints and Morphological Preprocessing for Grapheme-to-Phoneme Conversion, Proceedings of ACL 2007, Prague, Czech Republic.

Helmut Schmid (2006): Trace Prediction and Recovery With Unlexicalized PCFGs and Slash Features, Proceedings of COLING-ACL 2006, Sydney, Australia.

Helmut Schmid (2005): Disambiguation of Morphological Structure Using a PCFG, Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), Vancouver, Canada.

Helmut Schmid (2005): A Programming Language for Finite State Transducers, Proceedings of the 5th International Workshop on Finite State Methods in Natural Language Processing (FSMNLP 2005), Helsinki, Finland.

Helmut Schmid, Michaela Atterer (2004): New Statistical Methods for Phrase Break Prediction, Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland.

Helmut Schmid (2004): Efficient Parsing of Highly Ambiguous Context-Free Grammars with Bit Vectors, Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland.

Helmut Schmid, Arne Fitschen, Ulrich Heid (2004): SMOR: A German Computational Morphology Covering Derivation, Composition, and Inflection, Proceedings of the IVth International Conference on Language Resources and Evaluation (LREC 2004), p. 1263-1266, Lisbon, Portugal.

Helmut Schmid (2002): Lexicalization of Probabilistic Grammars. Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan.

Helmut Schmid (2002): A Generative Probability Model for Unification-Based Grammars. Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan.

Helmut Schmid, Mats Rooth (2001): Parse Forest Computation of Expected Governors. Proceedings of the 39th Annual Meeting of the ACL (ACL 2001), Toulouse, France.

Helmut Schmid, Sabine Schulte im Walde (2000): Robust German Noun Chunking With a Probabilistic Context-Free Grammar. Proceedings of the 18th International Conference on Computational Linguistics (COLING 2000), August 2000.

Helmut Schmid (2000): LoPar: Design and Implementation. Arbeitspapiere des Sonderforschungsbereiches 340, No. 149, IMS Stuttgart, July 2000. (25 pages)

Helmut Schmid (2000): Unsupervised Learning of Period Disambiguation for Tokenisation. Internal Report, IMS, University of Stuttgart, May 2000. (16 pages)

Helmut Schmid (2000): YAP - Parsing and Disambiguation With Feature-Based Grammars. Ph.D. thesis, University of Stuttgart, January 2000, AIMS report 6(1). (197 pages)

Helmut Schmid (1997): Parsing by Successive Approximation. Proceedings of International Workshop on Parsing Technologies (IWPT '97). Boston, USA.

Helmut Schmid (1995): Improvements in Part-of-Speech Tagging with an Application to German. Proceedings of the ACL SIGDAT-Workshop. Dublin, Ireland.

Helmut Schmid (1994): Probabilistic Part-of-Speech Tagging Using Decision Trees. Proceedings of International Conference on New Methods in Language Processing, Manchester, UK.

Helmut Schmid (1994): Part-of-Speech Tagging with Neural Networks. Proceedings of the 15th International Conference on Computational Linguistics (COLING-94).

Software
  TreeTagger
The TreeTagger is a tool for automatic annotation of text corpora with part-of-speech and lemma information.
  RFTagger
The RFTagger is a POS tagger for fine-grained POS tagsets.
  SFST
SFST is a toolbox for the implementation of morphological analysers and other programs which are based on finite state transducers.
  LoPar
LoPar is a parser for head-lexicalized probabilistic context-free grammars.
  BitPar
BitPar is an efficient parser for Treebank grammars.
  Trace Parser
Get the trace parser described in my ACL 2006 paper.
  YAP
YAP is a fast parser for feature-based grammars.
  VPF
VPF is a parse forest browser for feature-structure based grammars.
  SMOR
SMOR is a German finite-state morphology implemented in the SFST programming language. An older version of SMOR with a few sample lexicon entries comes with the SFST tools (see above).
  LSC
LSC is a statistical clustering software for predicate-argument tuples with a fixed number of arguments.
  PAC
PAC is a statistical clustering software for predicate-argument tuples with a variable number of arguments. The selectional preferences are generalized by means of a WordNet hierarchy.
 
Links
  The IMS Corpus Workbench is a tool for full-text retrieval on large textual resources. 

Other linguistic resources and tools available at IMS.

Chris Manning's list of linguistic resources and tools

 
Teaching
  Parsing I

Parsing II: Statistical Methods in Natural Language Processing