Heisenberg Group "Distributional Approaches to Semantic Relatedness"

The Heisenberg Group SemRel is an independent research group, headed by PD Dr. Sabine Schulte im Walde. It is funded by the German Research Foundation (Deutsche Forschungsgemeinschaft) through a Heisenberg Fellowship plus Research Grant, and further supported by the Integrated Research Training Group of the SFB 732. The group started in November 2011. The picture shows the SemRel team together with staff members from the SFB projects D11 and D12.

The project explores the potential and the limits of distributional approaches to lexical semantics. While it is clear that distributional knowledge does not cover all the cognitive knowledge humans possess with respect to word meaning, distributional models are very attractive, as the underlying parameters are accessible from even low-level annotated corpus data. We are thus interested in maximising the benefit of distributional information for lexical semantics.


Staff

Principal Investigator:


Staff:


Student Researchers:

  • Sai Abishek Bhaskar (Manipal Institute of Technology, India)
  • Moritz Wittmann


Former Team Members:


Student Theses:

  • Benjamin David (Studienarbeit, 2013; Diplomarbeit, 2014)
  • Lukas Fromme (Bachelor thesis, 2016)
  • Ronny Jauch (Diplomarbeit, 2012)
  • Anna Hätty (Master thesis, 2016)
  • Vanessa Hanschke (Internship, 2013)
  • Nana Khvtisavrishvili (Diplomarbeit, 2014)
  • Maximilian Köper (Bachelor thesis, 2012; Master thesis, 2014)
  • Manuel Müller (Bachelor thesis, 2016)
  • Stefan Müller (Studienarbeit, 2012; Diplomarbeit, 2013)
  • Daniela Naumann (Master thesis, current)
  • Neha Nayak (Bachelor thesis, 2013)
  • Manju Nirmal (Master thesis, 2015)
  • Cornelius Putzler (Studienarbeit, 2013)
  • Martin Rettig (Bachelor thesis, 2016)
  • Stefan Rüd (Diplomarbeit, 2012)
  • Enrico Santus (Master thesis, 2013)
  • Jonas Sturm (Bachelor thesis, 2016)
  • Alina Wimmer (Studienarbeit, 2014; Diplomarbeit, current)
  • Moritz Wittmann (Bachelor thesis, 2014; Master thesis, current)


Research

The overall goal of the SemRel group is to explore the potential and the limits of distributional approaches to lexical semantics. In phase 1 of the project (11/2011-01/2015), we distinguished three types of semantic relatedness to shed light on distributional modelling from different perspectives. The work was performed within an interdisciplinary framework between theoretical, cognitive and computational linguistics. The following figure illustrates the interaction of our topics. Each type of relatedness (paradigmatic relations, preposition senses, and compound compositionality) received input and feedback from human judgements, and was applied to statistical machine translation.

While SemRel phase 1 distinguished three types of semantic relatedness, phase 2 (02/2015-01/2017) will bring together the tasks, approaches and results from these test cases, and study semantic relatedness from a meta-level perspective. We (i) investigate distributional approaches across types of semantic relatedness, across word classes, and across languages; (ii) explore the linguistic performance of soft clustering approaches, parameters, and evaluations to model ambiguity; and (iii) abstract over semantic subcategorisation information and semantic evaluation for statistical machine translation (SMT) as an extrinsic application. Overall, the work will continue within an interdisciplinary framework between theoretical, cognitive and computational linguistics, which allows us to explore distributional approaches through complementary evidence.

Interdisciplinarity: Theoretical linguistics provides the formal definitions of the semantic relatedness phenomena we are interested in, and cognitive linguistics tells us how humans perceive and express semantic relatedness. From both the linguistic and the cognitive perspective, we expect guidance towards selecting and implementing theoretically and cognitively adequate distributional attributes to model word meaning, as well as gold standards that serve as seeds for computational algorithms and for intrinsic evaluations of the distributional models. As regards the cognitive perspective, we expect not only cognitive evidence for the potential of distributional knowledge, but also clear evidence for its limits, as human judgements naturally comprise both distributional and world knowledge. Altogether, linguistic and cognitive feedback should help us to define simple, straightforward computational methods to assess information about distributional meaning. Furthermore, the computational perspective explores the applicability of our distributional semantic knowledge to statistical machine translation as an extrinsic evaluation.

Challenges: Within our interdisciplinary approach, we address two major challenges. Firstly, we are interested in a theoretically and cognitively adequate selection of features to model word meaning and word relatedness. In this respect, our project differs from approaches that are not interested in the actual meaning of their features but only in optimising a complex computational machinery that makes use of them. In contrast, our goal is to explore the meaning and the potential of comparatively simple distributional models. Secondly, our work aims to model word meaning with respect to word senses, thus addressing ambiguity. Even though ambiguity is a frequent target of computational models in general, it has largely been ignored in distributional modelling.

Talks

Invited Talks

  • 53. Jahrestagung des Instituts für Deutsche Sprache:
    Wortschätze: Dynamik, Muster, Komplexität
    Kongresszentrum Rosengarten, Mannheim, March 15, 2017
    Experiential Data and Distributional Models of German Particle Verbs

  • Universität Düsseldorf, Computational Linguistics Research Colloquium
    January 26, 2017
    Distributional Models of Compositionality: German Noun Compounds and Particle Verbs

  • Ruhr-Universität Bochum, Sprachwissenschaftliches Institut
    December 6, 2016
    Distributionelle Modellierung von semantischen Beziehungen

  • Universität Erlangen-Nürnberg, Korpuslinguistik
    Sabine Schulte im Walde
    December 17, 2015
    Potential and Limits of Distributional Approaches to Semantic Relatedness [slides]

  • Universität Zürich, Institut für Computerlinguistik
    Sabine Schulte im Walde
    May 12, 2015
    Experiential Data and Distributional Models of German Particle Verbs [slides]

  • Joint Symposium on Semantic Processing: Textual Inference and Structures in Corpora
    Sabine Schulte im Walde
    Trento, Italy, November 20-22, 2013
    Potential and Limits of Distributional Approaches to Semantic Relatedness [slides]

  • Universität Düsseldorf, Institut für Sprache und Information
    Sabine Schulte im Walde
    Düsseldorf, Germany, July 4, 2013
    Compositionality of German Noun-Noun Compounds and German Particle Verbs: Experiential Data and Distributional Models [slides]

  • Small World of Words. Workshop on Word Associations and Semantic Graphs
    Sabine Schulte im Walde
    Leuven, Belgium, October 11, 2012
    Using Associations to identify Salient Features for Data-intensive Lexical Semantic Tasks [slides]

  • Workshop Webkorpora in Computerlinguistik und Sprachforschung
    Sabine Schulte im Walde
    Workshop organised by the GSCL special interest groups on Hypermedia and Corpus Linguistics
    Institut für Deutsche Sprache, Mannheim, September 28, 2012
    Webkorpora für die automatische Akquisition lexikalisch-semantischen Wissens [slides]

Talks/Posters at Conferences/Workshops without Proceedings

  • DGfS-CL Poster Session 2016
    Anna Hätty, Sabine Schulte im Walde, Stefan Bott, Nana Khvtisavrishvili
    Annual Meeting of the DGfS, Universität Konstanz
    February 24-26, 2016
    Features of Compositionality in English and German Noun-Noun-Compounds [abstract/poster]

  • International Conference Linguistic Evidence 2016: Empirical, Theoretical and Computational Perspectives
    Sabine Schulte im Walde
    Biannual Meeting hosted by the SFB 833 The Construction of Meaning, Universität Tübingen
    February 18-20, 2016
    Distinguishing Paradigmatic Semantic Relations: Human Ratings and Distributional Similarity [abstract/poster]

  • DGfS-CL Poster Session 2015
    Nana Khvtisavrishvili, Stefan Bott, Sabine Schulte im Walde
    Annual Meeting of the DGfS, Universität Leipzig
    March 4-6, 2015
    A Corpus-based Study on the Syntactic Behaviour of German Particle Verbs [abstract/poster]

  • 20th Architectures and Mechanisms for Language Processing Conference (AMLaP)
    Gabriella Lapesa, Sabine Schulte im Walde, Stefan Evert
    Edinburgh, Scotland, UK
    September 3-6, 2014
    Judging Paradigmatic Relations: A New Collection of English Ratings [abstract/poster]

  • "The Stuff Words are Made of": An International Conference on the Cross-linguistic Comparison of Indo-Germanic and Semitic Languages (CoGS)
    Sylvia Springorum
    Konstanz, Germany
    July 21-23, 2014
    (Re-)constructing German verb particle meanings for familiar and novel verbs [abstract]

  • 5th Conference on Quantitative Investigations in Theoretical Linguistics (QITL)
    Sylvia Springorum, Sabine Schulte im Walde, Antje Roßdeutscher
    Leuven, Belgium
    September 12-14, 2013
    Sentence Generation and Compositionality of Systematic Neologisms of German Particle Verbs [abstract]

  • International Conference Linguistic Evidence 2012: Empirical, Theoretical and Computational Perspectives
    Sylvia Springorum, Sabine Schulte im Walde, Antje Roßdeutscher
    Biannual Meeting hosted by the SFB 833 Bedeutungskonstitution, Universität Tübingen
    February 9-11, 2012
    Automatic Classification of German 'an' Particle Verbs [abstract]

Publications

2016

Maximilian Köper, Sabine Schulte im Walde
Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features [pdf/bib]
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany, August 2016.

Maximilian Köper, Melanie Zaiß, Qi Han, Steffen Koch, Sabine Schulte im Walde
Visualisation and Exploration of High-Dimensional Distributional Features in Lexical Semantic Classification [pdf/poster/bib]
In: Proceedings of the 10th Conference on Language Resources and Evaluation (LREC). Portoroz, Slovenia, May 2016.

Kim-Anh Nguyen, Sabine Schulte im Walde, Thang Vu
Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction -- Outstanding Paper [pdf/bib]
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany, August 2016.

Kim-Anh Nguyen, Sabine Schulte im Walde, Thang Vu
Neural-based Noise Filtering from Word Embeddings
In: Proceedings of the 26th International Conference on Computational Linguistics (COLING). Osaka, Japan, December 2016. To appear.

Sabine Schulte im Walde, Anna Hätty, Stefan Bott
The Role of Modifier and Head Properties in Predicting the Compositionality of English and German Noun-Noun Compounds: A Vector-Space Perspective [pdf/bib]
In: Proceedings of the 5th Joint Conference on Lexical and Computational Semantics (*SEM). Berlin, Germany, August 2016.

Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili
Ghost-NN: A Representative Gold Standard of German Noun-Noun Compounds [pdf/poster/bib]
In: Proceedings of the 10th Conference on Language Resources and Evaluation (LREC). Portoroz, Slovenia, May 2016.

Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Modeling Complement Types in Phrase-Based SMT [pdf/bib]
In: Proceedings of the 1st Conference on Machine Translation (WMT). Berlin, Germany, August 2016.

Moritz Wittmann, Marion Weller-Di Marco, Sabine Schulte im Walde
Graph-based Clustering of Synonym Senses for German Particle Verbs [pdf/bib]
In: Proceedings of the 12th Workshop on Multiword Expressions. Berlin, Germany, August 2016.

2015

Stefan Bott, Sabine Schulte im Walde
Exploiting Fine-grained Syntactic Transfer Features to Predict the Compositionality of German Particle Verbs [pdf/poster/bib]
In: Proceedings of the 11th International Conference on Computational Semantics (IWCS). London, UK, April 2015.

Fabienne Cap, Manju Nirmal, Marion Weller, Sabine Schulte im Walde
How to Account for Idiomatic German Support Verb Constructions in Statistical Machine Translation [pdf/bib]
In: Proceedings of the 11th Workshop on Multiword Expressions. Denver, CO, June 2015.

Nana Khvtisavrishvili, Stefan Bott, Sabine Schulte im Walde
Wie oft schreibt man das zusammen? The Puzzle of Why some Separable Verbs in German are More Separable than Others [pdf/bib]
In: Proceedings of the 26th International Conference of the German Society for Computational Linguistics and Language Technology (GSCL). Duisburg-Essen, Germany, September 2015.

Maximilian Köper, Christian Scheible, Sabine Schulte im Walde
Multilingual Reliability and "Semantic" Structure of Continuous Word Spaces [pdf/poster/bib]
In: Proceedings of the 11th International Conference on Computational Semantics (IWCS). London, UK, April 2015.

Sabine Schulte im Walde, Susanne Borgwaldt
Association Norms for German Noun Compounds and their Constituents [doi/preprint pdf/bib]
Behavior Research Methods.

Marion Weller, Alexander Fraser, Sabine Schulte im Walde
Target-Side Generation of Prepositions for SMT [pdf/bib]
In: Proceedings of the 18th Annual Conference of the European Association for Machine Translation (EAMT). Antalya, Turkey, May 2015.

Marion Weller, Alexander Fraser, Sabine Schulte im Walde
Predicting Prepositions for SMT (short version of EAMT-2015 paper) [pdf/poster/bib]
In: Proceedings of the 9th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST). Denver, CO, June 2015.

2014

Stefan Bott, Sabine Schulte im Walde
Optimizing a Distributional Semantic Model for the Prediction of German Particle Verb Compositionality [pdf/bib]
In: Proceedings of the 9th Conference on Language Resources and Evaluation (LREC). Reykjavik, Iceland, May 2014.

Stefan Bott, Sabine Schulte im Walde
Syntactic Transfer Patterns of German Particle Verbs and Their Impact on Lexical Semantics [pdf/bib]
In: Proceedings of the 3rd Joint Conference on Lexical and Computational Semantics (*SEM). Dublin, Ireland, August 2014.

Stefan Bott, Sabine Schulte im Walde
Modelling Regular Subcategorization Changes in German Particle Verbs [pdf/bib]
In: Proceedings of the 1st Workshop on Computational Approaches to Compound Analysis. Dublin, Ireland, August 2014.

Maximilian Köper, Sabine Schulte im Walde
A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions [pdf/bib]
In: Proceedings of the 9th Conference on Language Resources and Evaluation (LREC). Reykjavik, Iceland, May 2014.

Gabriella Lapesa, Stefan Evert
NaDiR: Naive Distributional Response Generation [pdf]
In: Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex). Dublin, Ireland, August 2014.

Gabriella Lapesa, Stefan Evert, Sabine Schulte im Walde
Contrasting Syntagmatic and Paradigmatic Relations: Insights from Distributional Semantic Models [pdf/bib]
In: Proceedings of the 3rd Joint Conference on Lexical and Computational Semantics (*SEM). Dublin, Ireland, August 2014.

Stephen Roller, Sabine Schulte im Walde
Feature Norms of German Noun Compounds [pdf/bib]
In: Proceedings of the 10th Workshop on Multiword Expressions. Gothenburg, Sweden, April 2014.

Michael Roth, Sabine Schulte im Walde
Combining Word Patterns and Discourse Markers for Paradigmatic Relation Classification [pdf/poster/bib]
In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL). Baltimore, MD, June 2014.

Enrico Santus, Alessandro Lenci, Qin Lu, Sabine Schulte im Walde
Chasing Hypernyms in Vector Spaces with Entropy [pdf/bib]
In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Gothenburg, Sweden, April 2014.

Silke Scheible, Sabine Schulte im Walde
A Database of Paradigmatic Semantic Relation Pairs for German Nouns, Verbs, and Adjectives [pdf/bib]
In: Proceedings of the COLING Workshop on Lexical and Grammatical Resources for Language Processing. Dublin, Ireland, August 2014.

Jason Utt, Sylvia Springorum, Maximilian Köper, Sabine Schulte im Walde
Fuzzy V-Measure - An Evaluation Method for Cluster Analyses of Ambiguous Data [pdf/poster/bib]
In: Proceedings of the 9th Conference on Language Resources and Evaluation (LREC). Reykjavik, Iceland, May 2014.

Marion Weller, Fabienne Cap, Stefan Müller, Sabine Schulte im Walde, Alexander Fraser
Distinguishing Degrees of Compositionality in Compound Splitting for Statistical Machine Translation [pdf/bib]
In: Proceedings of the 1st Workshop on Computational Approaches to Compound Analysis. Dublin, Ireland, August 2014.

Marion Weller, Alexander Fraser, Ulrich Heid
Combining Bilingual Terminology Mining and Morphological Modeling for Domain Adaptation in SMT [pdf]
In: Proceedings of the Seventeenth Annual Conference of the European Association for Machine Translation (EAMT). Dubrovnik, Croatia, June 2014.

Marion Weller, Sabine Schulte im Walde, Alexander Fraser
Using Noun Class Information to model Selectional Preferences for Translating Prepositions in SMT [pdf/poster/bib]
In: Proceedings of the 11th Conference of the Association for Machine Translation in the Americas (AMTA). Vancouver, Canada, October 2014.

Moritz Wittmann, Marion Weller, Sabine Schulte im Walde
Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Reranking Feature [pdf/poster/bib]
In: Proceedings of the 9th Conference on Language Resources and Evaluation (LREC). Reykjavik, Iceland, May 2014.

2013

Stephen Roller, Sabine Schulte im Walde
A Multimodal LDA Model integrating Textual, Cognitive and Visual Modalities [pdf/poster/bib]
In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Seattle, WA, October 2013.

Stephen Roller, Sabine Schulte im Walde, Silke Scheible
The (Un)expected Effects of Applying Standard Cleansing Models to Human Ratings on Compositionality [pdf/bib]
In: Proceedings of the 9th Workshop on Multiword Expressions. Atlanta, GA, June 2013.

Silke Scheible, Sabine Schulte im Walde, Sylvia Springorum
Uncovering Distributional Differences between Synonyms and Antonyms in a Word Space Model [pdf/bib]
In: Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP). Nagoya, Japan, October 2013.

Silke Scheible, Sabine Schulte im Walde, Marion Weller, Max Kisselew
A Compact but Linguistically Detailed Database for German Verb Subcategorisation relying on Dependency Parses from a Web Corpus: Tool, Guidelines and Resource [pdf/bib]
In: Proceedings of the 8th Web as Corpus Workshop. Lancaster, UK, July 2013.

Sabine Schulte im Walde, Maximilian Köper
Pattern-based Distinction of Paradigmatic Relations for German Nouns, Verbs, Adjectives [doi/preprint pdf/bib]
In: Proceedings of the 25th International Conference of the German Society for Computational Linguistics and Language Technology (GSCL). Darmstadt, Germany, September 2013.

Sabine Schulte im Walde, Stefan Müller
Using Web Corpora for the Automatic Acquisition of Lexical-Semantic Knowledge [doi/pdf/bib]
Journal for Language Technology and Computational Linguistics 28(2):85-105, 2013. Special Issue on Web Corpora for Computational Linguistics and Linguistic Research, edited by Roman Schneider, Angelika Storrer and Alexander Mehler.

Sabine Schulte im Walde, Stefan Müller, Stephen Roller
Exploring Vector Space Models to Predict the Compositionality of German Noun-Noun Compounds [pdf/bib]
In: Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (*SEM). Atlanta, GA, June 2013.

Sylvia Springorum, Sabine Schulte im Walde, Jason Utt
Detecting Polysemy in Hard and Soft Cluster Analyses of German Preposition Vector Spaces [pdf/bib]
In: Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP). Nagoya, Japan, October 2013.

Sylvia Springorum, Jason Utt, Sabine Schulte im Walde
Regular Meaning Shifts in German Particle Verbs: A Case Study [pdf/bib]
In: Proceedings of the 10th International Conference on Computational Semantics (IWCS). Potsdam, Germany, March 2013.

Marion Weller, Alexander Fraser, Sabine Schulte im Walde
Using Subcategorization Knowledge to improve Case Prediction for Translation to German [pdf/poster/bib]
In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL). Sofia, Bulgaria, August 2013.

2012

Silke Scheible, Sabine Schulte im Walde
Designing a Database of GermaNet-based Semantic Relation Pairs involving Coherent Mini-Networks [pdf/bib]
In: Proceedings of the LREC Workshop Semantic Relations II: Enhancing Resources and Applications. Istanbul, Turkey, May 2012.

Sabine Schulte im Walde, Susanne Borgwaldt, Ronny Jauch
Association Norms of German Noun Compounds [pdf/poster/bib]
In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey, May 2012.

Sylvia Springorum, Sabine Schulte im Walde, Antje Roßdeutscher
Automatic Classification of German 'an' Particle Verbs [pdf/bib]
In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey, May 2012.

Resources

Association Norms of German Noun Compounds

References: Schulte im Walde, Borgwaldt and Jauch (2012); Schulte im Walde & Borgwaldt (2015)

BilderNetle - A Dataset of German Noun-to-ImageNet Mappings

Reference: Roller and Schulte im Walde (2013)

Compositionality Ratings

References: Hartmann (2008); von der Heide and Borgwaldt (2009); Schulte im Walde et al. (2013); Bott and Schulte im Walde (2015)

Database of Paradigmatic Semantic Relation Pairs

Reference: Scheible and Schulte im Walde (2014)

Deep Semantic Analogies

Reference: Köper, Scheible and Schulte im Walde (2015)

DUDEN Senses and Synonyms for German Particle Verbs

References: Wittmann et al. (2014; 2016)

Feature Norms of German Noun Compounds

Reference: Roller and Schulte im Walde (2014)

GermaNet-based Semantic Relation Pairs involving Coherent Mini-Networks

Reference: Scheible and Schulte im Walde (2012)

Ghost-NN: A Representative Gold Standard of German Noun-Noun Compounds

Reference: Schulte im Walde et al. (2016)

SubCat-Extractor - Induction of Verb Subcategorisation from Dependency Parses

Reference: Scheible et al. (2013)

Subcategorisation Database for German MATE Parses

Reference: Scheible et al. (2013)

Cooperations
  • Susanne Borgwaldt (Germanistisches Seminar, Universität Siegen): associations and compositionality of German compound nouns
  • Tibor Kiss and Antje Müller (Theoretische Linguistik/Computerlinguistik, Ruhr-Universität Bochum): preposition senses
  • Steffen Koch (Institut für Visualisierung und Interaktive Systeme, Universität Stuttgart): visualisation of ambiguous words
  • Alessandro Lenci (Dipartimento di Linguistica, Università di Pisa): association norms, feature norms, and semantic relations

 

Contact
Contact person
PD Dr. Sabine Schulte im Walde
Phone 0049 711 685-84584
E-Mail
Secretary
Sybille Laderer
Secretariat
Phone 0049 711 685-81363
Fax 0049 711 685-81366
E-Mail
Address
Universität Stuttgart
Institut für Maschinelle Sprachverarbeitung
Pfaffenwaldring 5b
70569 Stuttgart
Germany