Welcome to the Chair of Foundations of Computational Linguistics at IMS Stuttgart. The group has been led by Prof. Jonas Kuhn since January 2010.
Our group works on computational models for processing natural language using both rule-based and statistical techniques. The group members' research interests include the following focus areas:
See also staff list and project list below.
Statistical models for dependency parsing focusing on morphologically rich languages. Our research puts special emphasis on the interaction between morphology and syntax. Our group obtained the best results on the two recent shared tasks on parsing morphologically rich languages (SPMRL 2013 and SPMRL 2014). We are also involved in the adaptation of Universal Dependencies to German.
- Wolfgang Seeker and Jonas Kuhn, Morphological and Syntactic Case in Statistical Dependency Parsing. Computational Linguistics, 39(1), 2013.
- Wolfgang Seeker and Jonas Kuhn, The Effects of Syntactic Features in Automatic Prediction of Morphology. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013.
- Anders Björkelund, Özlem Çetinoğlu, Agnieszka Faleńska, Richárd Farkas, Thomas Müller, Wolfgang Seeker, and Zsolt Szántó. The IMS-Wrocław-Szeged-CIS Entry at the SPMRL 2014 Shared Task: Reranking and Morphosyntax Meet Unlabeled Data. In Fifth Workshop on Statistical Parsing of Morphologically-Rich Languages, 2014.
- Wolfgang Seeker and Özlem Çetinoğlu. A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis. Transactions of the Association for Computational Linguistics (3), 2015.
- Anders Björkelund and Joakim Nivre. Non-Deterministic Oracles for Unrestricted Non-Projective Dependency Parsing. Proceedings of the 14th International Conference on Parsing Technologies, 2015.
- Anders Björkelund, Agnieszka Faleńska, Wolfgang Seeker, and Jonas Kuhn. How to Train Dependency Parsers with Inexact Search for Joint Sentence Boundary Detection and Parsing of Entire Documents. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016.
Co-reference and Information Structure
Algorithmic and theoretical challenges in information structure and coreference. We develop novel algorithms for solving these tasks automatically, devise annotation schemes, and carry out annotation efforts. Of particular interest is the interaction between information structure and prosody, on both the modelling and the computational side.
- Stefan Baumann and Arndt Riester. Referential and Lexical Givenness: semantic, prosodic and cognitive aspects. Prosody and Meaning (Interface Explorations 25), eds. Gorka Elordieta and Pilar Prieto, 2012.
- Anders Björkelund, Kerstin Eckart, Arndt Riester, Nadja Schauffler, and Katrin Schweitzer. The Extended DIRNDL Corpus as a Resource for Automatic Coreference and Bridging Resolution. Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014.
- Anders Björkelund and Jonas Kuhn. Learning Structured Perceptrons for Coreference Resolution with Latent Antecedents and Non-local Features. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014.
- Ina Rösiger and Arndt Riester. Using prosodic annotations to improve coreference resolution of spoken text. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2015.
Natural Language Generation and Semantic Parsing
Algorithmic and architectural aspects of data-driven generation. In particular, the interaction between referring expression generation and surface realization for end-to-end generation systems and morphologically rich languages (e.g. German). More recently, weakly supervised learning of representations for data-to-text generation and semantic parsing.
- Sina Zarrieß, Aoife Cahill, and Jonas Kuhn. Underspecifying and Predicting Voice for Surface Realisation Ranking. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011.
- Bernd Bohnet, Anders Björkelund, Jonas Kuhn, Wolfgang Seeker, and Sina Zarrieß. Generating Non-Projective Word Order in Statistical Linearization. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012.
- Sina Zarrieß, Aoife Cahill, and Jonas Kuhn. To what extent does sentence-internal realisation reflect discourse context? A study on word order. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, 2012.
- Kyle Richardson and Jonas Kuhn. Learning to Make Inferences in a Semantic Parsing Task. Transactions of the Association for Computational Linguistics (TACL) 2016: 155-168.
- Sina Zarrieß and Kyle Richardson. An Automatic Method for Building a Data-to-Text Generator. Proceedings of the 14th European Workshop on Natural Language Generation, 2013.
Continuing a long tradition of work on corpora and language-technological tools at IMS, our group is involved in projects that focus on infrastructural aspects and the sustainability of language resources. This involves the creation of ready-to-use tool chains and web services, easy access to data by means of visualisations and query tools, the creation of metadata and the documentation of workflows for all kinds of language resources, and the support of collaborative annotation and curation efforts. The group also runs Stuttgart's CLARIN-D center.
- Andre Blessing, Jens Stegmann, and Jonas Kuhn. SOA meets Relation Extraction: Less may be more in Interaction. Proceedings of the Workshop on Service-oriented Architectures (SOAs) for the Humanities: Solutions and Impacts, Digital Humanities, 2012.
- Markus Gärtner, Gregor Thiele, Wolfgang Seeker, Anders Björkelund, and Jonas Kuhn. ICARUS -- An Extensible Graphical Search Tool for Dependency Treebanks. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2013.
- Andre Blessing and Jonas Kuhn. Textual Emigration Analysis (TEA). Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014.
- Markus Gärtner, Anders Björkelund, Gregor Thiele, Wolfgang Seeker, and Jonas Kuhn. Visualization, Search, and Error Analysis for Coreference Annotations. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014.
- Cerstin Mahlow, Kerstin Eckart, Jens Stegmann, Andre Blessing, Gregor Thiele, Markus Gärtner, and Jonas Kuhn. Resources, Tools, and Applications at the CLARIN Center Stuttgart. Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014), 2014.
We collaborate with various disciplines in the humanities, developing a joint methodological framework for analyzing large text corpora against specific content-analytical questions. One of our goals is to make the aggregated results of such analyses transparent to scholars from the humanities, which requires adaptable models, interactive tools, and user interfaces. We have contributed to several DH-related projects (CLARIN-D, e-Identity, and ePoetics), and we are involved in a new Stuttgart center for reflected text analytics (CRETA) in the Digital Humanities.
- Andre Blessing, Andrea Glaser, and Jonas Kuhn. Biographical Data Exploration as a Test-bed for a Multi-view, Multi-method Approach in the Digital Humanities. Proceedings of the first Conference on Biographical Data in a Digital World, 2015.
- Andre Blessing, Fritz Kliche, Ulrich Heid, Cathleen Kantner, and Jonas Kuhn. Computerlinguistische Werkzeuge zur Erschließung und Exploration großer Textsammlungen aus der Perspektive fachspezifischer Theorie. Grenzen und Möglichkeiten der Digital Humanities Sonderband 1, 2015.
- Jonas Kuhn and Nils Reiter. A Plea for a Method-driven Agenda in the Digital Humanities. Global Digital Humanities Conference, Sydney, Australia. 2015.
- Markus John, Steffen Koch, Florian Heimerl, Andreas Müller, Thomas Ertl, and Jonas Kuhn. Interactive Visual Analysis Of German Poetics. Global Digital Humanities Conference, 2015.
- Nils Reiter. Towards Annotating Narrative Segments. Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), 2015.
- Schulz, S. & Keller, M. (2016). Code-Switching Ubique Est - Language Identification and Part-of-Speech Tagging for Historical Mixed Text. LaTeCH@ACL, August, Berlin: Association for Computational Linguistics.
Theoretical and Computational Semantics
Continuing a long tradition at the IMS, one focus in our group is on Discourse Representation Theory, the syntax-semantics interface, and the role of lexical information in word formation and sentence/discourse semantics.
- Antje Rossdeutscher and Hans Kamp. Syntactic and Semantic Constraints in the Formation and Interpretation of ung-Nouns, in A. Alexiadou and M. Rathert (eds), The Semantics of Nominalisations across Languages and Frameworks, Interface Explorations 22, Mouton de Gruyter, Berlin, pp. 169-214. 2011.
- Boris Haselbach, Kerstin Eckart, Wolfgang Seeker, Kurt Eberle & Ulrich Heid. Approximating Theoretical Linguistics Classification in Real Data: the Case of German nach Particle Verbs. In Proceedings of the 24th International Conference on Computational Linguistics (COLING-2012), IIT Bombay, Mumbai, December 8-15, 2012.
- Tillmann Pross. Mono-eventive verbs of emission and their bi-eventive nominalizations. 45th Conference of the North East Linguistics Society (NELS). 2014.
- Antje Rossdeutscher. When roots license and when they respect semantico-syntactic structure, in A. Alexiadou, H. Borer and F. Schäfer (eds), The Syntax of Roots and the Roots of Syntax, Oxford University Press, 2014.
Head of chair:
- Prof. Dr. h.c. Hans Kamp, PhD
- Prof. Dr. Christian Rohrer
- Sybille Laderer
- Barbara Schäfer (project secretary)
- Apl. Prof. Dr. Rainer Bäuerle (retired)
- Dr. Bernd Bohnet (now at Google)
- Dr. Fabienne Cap (now at Uppsala University)
- PD Dr. Kurt Eberle
- Wiltrud Kessler (now at MINT-Kolleg Baden Württemberg)
- Fritz Kliche (now at University of Hildesheim)
- Dr. Masood Ghayoomi (now at FU Berlin)
- Andreas Madsack (now at aexea)
- Dr. Cerstin Mahlow (now at IDS Mannheim)
- Andreas Müller
- Wolfgang Seeker (now at Retresco)
- Jens Stegmann
- Gregor Thiele (now at Porsche)
- Dr. Sina Zarrieß (now at the University of Bielefeld)
- Prof. Dr. Heike Zinsmeister (now at the University of Hamburg)
- Zoltán Czesznak
- Alisa Noha
- Ilnar Salimzianov
- Sarah Schopper
- Olga Stroh
- Moritz Völkel
- Anastasia Vostrova
SFB 732 Incremental Specification in Context:
A collaboration between Political Science and Natural Language Processing
PI: Prof. Dr. Cathleen Kantner, Co-PI at IMS: Prof. Dr. Jonas Kuhn
further collaborators: University of Hildesheim and University of Potsdam
A collaboration between Literary Studies, Computer Science and Natural Language Processing
PI: Prof. Dr. Sandra Richter, Prof. Dr. Thomas Ertl, Prof. Dr. Jonas Kuhn, Prof. Dr. Andrea Rapp
PI: Jonas Kuhn