Welcome to the Chair of Foundations of Computational Linguistics at IMS Stuttgart. The group has been led by Prof. Jonas Kuhn since January 2010.
Our group works on computational models for processing natural language using both rule-based and statistical techniques.
Research interests of group members
Statistical models for dependency parsing focusing on morphologically rich languages. Our research puts special emphasis on the interaction between morphology and syntax. Our group obtained the best results on the two recent shared tasks on parsing morphologically rich languages (SPMRL 2013 and SPMRL 2014). We are also involved in the adaptation of Universal Dependencies to German.
Selected publications:
- Wolfgang Seeker and Jonas Kuhn, Morphological and Syntactic Case in Statistical Dependency Parsing. Computational Linguistics, 39(1), 2013.
- Wolfgang Seeker and Jonas Kuhn, The Effects of Syntactic Features in Automatic Prediction of Morphology. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013.
- Anders Björkelund, Özlem Çetinoğlu, Agnieszka Faleńska, Richárd Farkas, Thomas Müller, Wolfgang Seeker, and Zsolt Szántó. The IMS-Wrocław-Szeged-CIS Entry at the SPMRL 2014 Shared Task: Reranking and Morphosyntax Meet Unlabeled Data. In Fifth Workshop on Statistical Parsing of Morphologically-Rich Languages, 2014.
- Wolfgang Seeker and Özlem Çetinoğlu. A Graph-based Lattice Dependency Parser for Joint Morphological Segmentation and Syntactic Analysis. Transactions of the Association for Computational Linguistics (3), 2015.
- Anders Björkelund and Joakim Nivre. Non-Deterministic Oracles for Unrestricted Non-Projective Dependency Parsing. Proceedings of the 14th International Conference on Parsing Technologies, 2015.
- Anders Björkelund, Agnieszka Faleńska, Wolfgang Seeker, and Jonas Kuhn. How to Train Dependency Parsers with Inexact Search for Joint Sentence Boundary Detection and Parsing of Entire Documents. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016.
Algorithmic and theoretical challenges in information structure and coreference. We develop novel algorithms for solving these tasks automatically, devise annotation schemes, and carry out annotation efforts. Of particular interest is the interaction between information structure and prosody, on both the modelling and the computational side.
Selected publications:
- Stefan Baumann and Arndt Riester. Referential and Lexical Givenness: semantic, prosodic and cognitive aspects. Prosody and Meaning (Interface Explorations 25), eds. Gorka Elordieta and Pilar Prieto, 2012.
- Anders Björkelund, Kerstin Eckart, Arndt Riester, Nadja Schauffler, and Katrin Schweitzer. The Extended DIRNDL Corpus as a Resource for Automatic Coreference and Bridging Resolution. Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014.
- Anders Björkelund and Jonas Kuhn. Learning Structured Perceptrons for Coreference Resolution with Latent Antecedents and Non-local Features. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014.
- Ina Rösiger and Arndt Riester. Using prosodic annotations to improve coreference resolution of spoken text. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2015.
Algorithmic and architectural aspects of data-driven generation. In particular, we study the interaction between referring expression generation and surface realization for end-to-end generation systems and morphologically rich languages (e.g. German). More recently, we have also worked on weakly supervised learning of representations for data-to-text generation and semantic parsing.
Selected publications:
- Sina Zarrieß, Aoife Cahill, and Jonas Kuhn. Underspecifying and Predicting Voice for Surface Realisation Ranking. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011.
- Bernd Bohnet, Anders Björkelund, Jonas Kuhn, Wolfgang Seeker, and Sina Zarrieß. Generating Non-Projective Word Order in Statistical Linearization. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012.
- Sina Zarrieß, Aoife Cahill, and Jonas Kuhn. To what extent does sentence-internal realisation reflect discourse context? A study on word order. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, 2012.
- Kyle Richardson and Jonas Kuhn. Learning to Make Inferences in a Semantic Parsing Task. Transactions of the Association for Computational Linguistics (TACL) 2016: 155-168.
- Sina Zarrieß and Kyle Richardson. An Automatic Method for Building a Data-to-Text Generator. Proceedings of the 14th European Workshop on Natural Language Generation, 2013.
Continuing a long tradition of work on corpora and language-technological tools at IMS, our group is involved in projects that focus on infrastructural aspects and the sustainability of language resources. This involves the creation of ready-to-use tool chains and web services, easy access to data by means of visualisations and query tools, the creation of meta data and the documentation of workflows for all kinds of language resources, and the support of collaborative annotation and curation efforts. The group also runs Stuttgart's CLARIN-D center.
Selected publications:
- Andre Blessing, Jens Stegmann, and Jonas Kuhn. SOA meets Relation Extraction: Less may be more in Interaction. Proceedings of the Workshop on Service-oriented Architectures (SOAs) for the Humanities: Solutions and Impacts, Digital Humanities, 2012.
- Markus Gärtner, Gregor Thiele, Wolfgang Seeker, Anders Björkelund, and Jonas Kuhn. ICARUS -- An Extensible Graphical Search Tool for Dependency Treebanks. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2013.
- Andre Blessing and Jonas Kuhn. Textual Emigration Analysis (TEA). Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014.
- Markus Gärtner, Anders Björkelund, Gregor Thiele, Wolfgang Seeker, and Jonas Kuhn. Visualization, Search, and Error Analysis for Coreference Annotations. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014.
- Cerstin Mahlow, Kerstin Eckart, Jens Stegmann, Andre Blessing, Gregor Thiele, Markus Gärtner, and Jonas Kuhn. Resources, Tools, and Applications at the CLARIN Center Stuttgart. Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014), 2014.
We collaborate with various disciplines in the humanities, developing a joint methodological framework for analyzing large text corpora against specific content-analytical questions. One of our goals is to make the aggregated results of such analyses transparent to scholars from the humanities, which requires adaptable models and interactive tools and user interfaces. We have been contributing to several DH-related projects: CLARIN-D, e-Identity, ePoetics, and we are involved in a new Stuttgart center for reflected text analytics (CRETA) in the Digital Humanities.
Selected publications:
- Andre Blessing, Andrea Glaser, and Jonas Kuhn. Biographical Data Exploration as a Test-bed for a Multi-view, Multi-method Approach in the Digital Humanities. Proceedings of the first Conference on Biographical Data in a Digital World, 2015.
- Andre Blessing, Fritz Kliche, Ulrich Heid, Cathleen Kantner, and Jonas Kuhn. Computerlinguistische Werkzeuge zur Erschließung und Exploration großer Textsammlungen aus der Perspektive fachspezifischer Theorie. Grenzen und Möglichkeiten der Digital Humanities Sonderband 1, 2015.
- Jonas Kuhn and Nils Reiter. A Plea for a Method-driven Agenda in the Digital Humanities. Global Digital Humanities Conference, Sydney, Australia. 2015.
- Markus John, Steffen Koch, Florian Heimerl, Andreas Müller, Thomas Ertl, and Jonas Kuhn. Interactive Visual Analysis Of German Poetics. Global Digital Humanities Conference, 2015.
- Nils Reiter. Towards Annotating Narrative Segments. Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), 2015.
- Schulz, S. & Keller, M. (2016). Code-Switching Ubique Est - Language Identification and Part-of-Speech Tagging for Historical Mixed Text. LaTeCH@ACL, August, Berlin: Association for Computational Linguistics.
Continuing a long tradition at the IMS, one focus in our group is on Discourse Representation Theory, the syntax-semantics interface, and the role of lexical information in word formation and sentence/discourse semantics.
Selected publications:
- Antje Rossdeutscher and Hans Kamp. Syntactic and Semantic Constraints in the Formation and Interpretation of ung-Nouns, in A. Alexiadou and M. Rathert (eds), The Semantics of Nominalisations across Languages and Frameworks, Interface Explorations 22, Mouton de Gruyter, Berlin, pp. 169-214. 2011.
- Boris Haselbach, Kerstin Eckart, Wolfgang Seeker, Kurt Eberle & Ulrich Heid. Approximating Theoretical Linguistics Classification in Real Data: the Case of German nach Particle Verbs. In Proceedings of the 24th International Conference on Computational Linguistics (COLING-2012), IIT Bombay, Mumbai, December 8-15, 2012.
- Tillmann Pross. Mono-eventive verbs of emission and their bi-eventive nominalizations. 45th Conference of the North East Linguistics Society (NELS). 2014.
- Antje Rossdeutscher. When roots license and when they respect semantico-syntactic structure, in A. Alexiadou, H. Borer and F. Schäfer (eds), The Syntax of Roots and the Roots of Syntax, Oxford University Press, 2014.
Current Projects
Digital Humanities:
- CRETA - Center for Reflected Text Analytics
PIs: Jonas Kuhn, inter alia
- RePlay-DH
PIs: Jonas Kuhn (IMS), Helge Steenweg (UB Stuttgart), Stefan Wesner (KIZ Ulm)
- CLARIN-D: Common Language Resources and Technology Infrastructure
Director of the Stuttgart CLARIN-D Center: Prof. Dr. Jonas Kuhn
Past Projects
Digital Humanities:
- e-Identity
A collaboration between Political Science and Natural Language Processing
PI: Prof. Dr. Cathleen Kantner, Co-PI at IMS: Prof. Dr. Jonas Kuhn
further collaborators: University of Hildesheim and University of Potsdam
- ePoetics
A collaboration between Literary Studies, Computer Science and Natural Language Processing
PIs: Prof. Dr. Sandra Richter, Prof. Dr. Thomas Ertl, Prof. Dr. Jonas Kuhn, Prof. Dr. Andrea Rapp
- DebateExplorer
PI: Jonas Kuhn
SFB 732 Incremental Specification in Context
- A6: Encoding of Information Structure in German and French
PI: Prof. Dr. Uwe Reyle
- B4: The role of lexical information in word-formation, and the semantics of sentence and discourse
PIs: PD Dr. Antje Roßdeutscher and Prof. Dr. Hans Kamp
- D2: Combining Contextual Information Sources for Disambiguation in Parsing and Choice in Generation
PI: Prof. Dr. Jonas Kuhn
- D8: Data-driven Dependency Parsing
PI: Prof. Dr. Jonas Kuhn
- INF: Information Infrastructure Project
PIs: Prof. Dr. Grzegorz Dogil, Prof. Dr. Jonas Kuhn, Prof. Dr. Sebastian Padó
Associated Groups
- Distributional Approaches to Semantic Relatedness
PI: PD Dr. Sabine Schulte im Walde
- Tree Transducers in Machine Translation
PI: Dr. Andreas Maletti
