Compositionality Ratings

Type

ExperimentData

Author

Sabine Schulte im Walde

Description

Compositionality ratings are human ratings on the degree of compositionality of compounds. There are two basic versions of compositionality ratings:

1. Humans are asked to rate the degree of compositionality of a compound as a whole, i.e., without explicitly referring to its constituents.
   Example: On a scale from 0 (definitely opaque) to 6 (definitely transparent), how compositional is the compound Löwenzahn 'dandelion' (literally 'lion's tooth')?
2. Humans are asked to rate the degree of compositionality of a compound with regard to one or more of its constituents.
   Example: On a scale from 0 (definitely opaque) to 6 (definitely transparent), how compositional is the compound Löwenzahn 'dandelion' (literally 'lion's tooth') with regard to the modifier noun Löwe 'lion'?

We have collected several datasets of compositionality ratings for German compounds.

Compositionality Ratings for German Noun Compounds

(collected by Susanne Borgwaldt and Sabine Schulte im Walde)

Von der Heide and Borgwaldt (2009) created a set of 450 concrete, depictable German noun compounds and collected human compositionality ratings for all of them. The compounds were distributed over 5 lists, and 270 participants judged the degree of compositionality of each compound with respect to its first as well as its second constituent, on a scale from 1 (definitely opaque) to 7 (definitely transparent). For each compound-constituent pair, they collected judgements from 30 participants and calculated the mean rating and the standard deviation.

We disregarded noun compounds with more than two constituents (in some cases, the modifier or the head was itself complex) as well as compounds whose modifiers were not nouns, thus arriving at a subset of 244 two-part noun-noun compounds out of the original 450. In a second experiment, we collected human compositionality ratings for this subset. This time, we asked the participants to provide a single score for each compound as a whole, again on a scale from 1 to 7. The collection was performed via Amazon Mechanical Turk (AMT) and resulted in 27-34 ratings per target compound. For each compound, we calculated the mean rating and the standard deviation.
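The per-compound aggregation described above can be sketched as follows. This is a minimal illustration, not the original processing script: the function name `summarize_ratings`, the input layout of (compound, rating) pairs, and the example ratings are all assumptions, and the text does not say whether the sample or the population standard deviation was reported (the sketch uses the sample version).

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_ratings(rows, scale=(1, 7)):
    """Group (compound, rating) pairs and return {compound: (mean, sd, n)}."""
    low, high = scale
    by_compound = defaultdict(list)
    for compound, rating in rows:
        # Reject ratings outside the scale used in the experiment.
        if not low <= rating <= high:
            raise ValueError(f"rating {rating} for {compound!r} is outside {low}-{high}")
        by_compound[compound].append(rating)

    summary = {}
    for compound, ratings in by_compound.items():
        # Assumption: sample standard deviation (n - 1), which needs >= 2 ratings.
        sd = stdev(ratings) if len(ratings) > 1 else 0.0
        summary[compound] = (mean(ratings), sd, len(ratings))
    return summary


# Invented ratings for illustration only; they are not taken from the dataset.
rows = [
    ("Löwenzahn", 1), ("Löwenzahn", 2), ("Löwenzahn", 3),
    ("Ahornblatt", 6), ("Ahornblatt", 7), ("Ahornblatt", 7),
]
summary = summarize_ratings(rows)
for compound, (avg, sd, n) in summary.items():
    print(f"{compound}: mean={avg:.2f} sd={sd:.2f} n={n}")
```

Keeping the number of ratings next to the mean and standard deviation is useful here because the AMT collection yielded a varying number of ratings (27-34) per compound.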

(collected by Sabine Schulte im Walde, Anna Hätty, Stefan Bott and Nana Khvtisavrishvili)

Ghost-NN is a gold standard of 868 German noun-noun compounds, annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs. Moreover, a subset of 180 compounds is balanced for the productivity of the modifiers (distinguishing low/mid/high productivity) and the ambiguity of the heads (distinguishing between heads with 1, 2 and >2 senses). See the Ghost-NN reference below for details on the dataset.

Compositionality Ratings for German Particle Verbs

Over the years, we developed two gold standards with compositionality ratings for German particle verbs (PVs). Each of them contains PVs across different particles and was annotated by humans for the degree of compositionality.

(collected by Silvana Hartmann and Sabine Schulte im Walde)

Hartmann (2008) describes the collection of compositionality judgements for 99 German particle verbs, sampled across 11 different preposition particles and 8 frequency bands, plus one manually chosen verb per particle (to make sure that interesting ambiguous verbs were included). Four independent judges rated the degree of compositionality of the selected particle verbs on a scale from 1 (definitely opaque) to 10 (definitely transparent).
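The total of 99 verbs is consistent with a design of one sampled verb per particle and frequency band, plus the manual additions. The following sketch only checks that arithmetic; the one-verb-per-cell reading is an assumption, and the particle and band labels are placeholders rather than the actual particles or band boundaries.

```python
from itertools import product

N_PARTICLES = 11
N_FREQUENCY_BANDS = 8

# Placeholder labels; the actual particles and band boundaries are not listed here.
particles = [f"particle_{i}" for i in range(1, N_PARTICLES + 1)]
bands = list(range(1, N_FREQUENCY_BANDS + 1))

# Assumption: one sampled verb per (particle, frequency band) cell.
sampled_cells = list(product(particles, bands))

# One additional manually chosen verb per particle.
manual_picks = [(particle, "manual") for particle in particles]

total = len(sampled_cells) + len(manual_picks)
print(len(sampled_cells), len(manual_picks), total)
```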

(collected by Stefan Bott, Nana Khvtisavrishvili, Max Kisselew and Sabine Schulte im Walde)

Ghost-PV is a gold standard of 400 randomly selected German particle verbs. It is balanced across several particle types and three frequency bands, and is accompanied by human ratings on the degree of semantic compositionality. See the Ghost-PV reference below for details on the dataset.

Bott and Schulte im Walde (2015) used two preliminary versions of Ghost-PV, containing 354 and 150 PVs.

References

Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, Sabine Schulte im Walde
Ghost-PV: A Representative Gold Standard of German Particle Verbs
In: Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex). Osaka, Japan, December 2016.

Stefan Bott, Sabine Schulte im Walde
Exploiting Fine-grained Syntactic Transfer Features to Predict the Compositionality of German Particle Verbs
In: Proceedings of the 11th Conference on Computational Semantics (IWCS). London, UK, 2015.

Silvana Hartmann
Einfluss syntaktischer und semantischer Subkategorisierung auf die Kompositionalität von Partikelverben (in German)
Studienarbeit, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, 2008.

Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili
Ghost-NN: A Representative Gold Standard of German Noun-Noun Compounds
In: Proceedings of the 10th Conference on Language Resources and Evaluation (LREC). Portorož, Slovenia, May 2016.

Sabine Schulte im Walde, Stefan Müller, Stephen Roller
Exploring Vector Space Models to Predict the Compositionality of German Noun-Noun Compounds
In: Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (*SEM). Atlanta, GA, June 2013.

Claudia von der Heide, Susanne Borgwaldt
Assoziationen zu Unter-, Basis- und Oberbegriffen. Eine explorative Studie (in German)
In: Proceedings of the 9th Norddeutsches Linguistisches Kolloquium. 2009.

Download

Please contact the SemRel group to obtain the datasets.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA).
