Dataset of Literal and Non-Literal Language Usage for German Particle Verbs

Type
ExperimentData
Authors
Maximilian Köper & Sabine Schulte im Walde

This resource contains 6436 sentences with human annotations of (non-)literalness. The sentences are part-of-speech-tagged and dependency-parsed, and they cover 4174 literal and 2262 non-literal uses of 159 particle verbs across 10 particles. Three native speakers of German with a linguistic background annotated the sentences for their degree of (non-)literalness.

We provide one file per particle verb. Each sentence is preceded by a header line that gives the binary gold-standard class (LITERAL vs. NON-LITERAL), followed by the average rating score. Rating scores range from 0 (literal) to 5 (non-literal). An 'X' in the column after the word form (the third field in the example sentences below, counting the token index) marks the position of the particle verb; when the particle is syntactically separated from the verb, it marks the position of the base verb.

Example sentences:

NON-LITERAL 5
1 Politisch _ politisch _ ADJD _ degree=pos -1 6 _ MO _ _
2 hat _ haben _ VAFIN _ number=sg|person=3|tense=pres|mood=ind -1 0 _ -- _ _
3 ihn _ ihn _ PPER _ case=acc|number=sg|gender=masc|person=3 -1 6 _ OA _ _
4 ja _ ja _ ADV _ _ -1 6 _ MO _ _
5 Jelzin _ Jelzin _ NE _ case=acc|number=sg|gender=* -1 2 _ SB _ _
6 abgesägt X gesägen _ VVPP _ _ -1 2 _ OC _ _

LITERAL 0
1 Ich _ ich _ PPER _ case=nom|number=sg|gender=*|person=1 -1 2 _ SB _ _
2 musste _ mussen _ VVFIN _ number=sg|person=1|tense=pres|mood=ind -1 0 _ -- _ _
3 für _ für _ APPR _ _ -1 2 _ MO _ _
4 Schlossherren _ Schlossherr _ NN _ case=acc|number=pl|gender=masc -1 3 _ NK _ _
5 Bäume _ Baum _ NN _ case=acc|number=pl|gender=masc -1 6 _ OA _ _
6 absägen X absägen _ VVINF _ _ -1 2 _ OC _ _
7 , _ -- _ $, _ _ -1 6 _ -- _ _
8 damit _ damit _ KOUS _ _ -1 13 _ CP _ _
9 die _ der _ ART _ case=nom|number=sg|gender=fem -1 10 _ NK _ _
10 Holz _ Holz _ NN _ case=nom|number=sg|gender=neut -1 13 _ SB _ _
11 zum _ zu _ APPRART _ case=dat|number=sg|gender=neut -1 10 _ MNR _ _
12 Heizen _ heizen _ NN _ case=dat|number=sg|gender=neut -1 11 _ NK _ _
13 hatten _ haben _ VAFIN _ number=pl|person=3|tense=past|mood=ind -1 6 _ MO _ _
14 . _ -- _ $. _ _ -1 13 _ -- _ _
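
The layout shown in the two example sentences can be read with a few lines of Python. The following is only a minimal sketch, not part of the resource: the names `Sentence`, `parse_sentences` and `SAMPLE` are hypothetical, and the code assumes that fields are separated by whitespace, that sentences are separated by blank lines, and that the 'X' marker sits in the third field, as in the examples above. The sample text is copied from those examples.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    label: str     # "LITERAL" or "NON-LITERAL" (binary gold-standard class)
    rating: float  # average rating, 0 (literal) to 5 (non-literal)
    tokens: list = field(default_factory=list)  # one list of fields per token

    def target(self):
        """Return the token line marked with 'X', or None if there is none."""
        for tok in self.tokens:
            # Assumption: the marker is the third whitespace-separated field.
            if len(tok) > 2 and tok[2] == "X":
                return tok
        return None


def parse_sentences(text):
    """Split the text of one particle-verb file into Sentence objects."""
    sentences = []
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # blank line: the current sentence ends
            current = None
            continue
        if fields[0] in ("LITERAL", "NON-LITERAL"):  # header line
            current = Sentence(label=fields[0], rating=float(fields[1]))
            sentences.append(current)
        elif current is not None:  # token line of the current sentence
            current.tokens.append(fields)
    return sentences


SAMPLE = """\
NON-LITERAL 5
1 Politisch _ politisch _ ADJD _ degree=pos -1 6 _ MO _ _
2 hat _ haben _ VAFIN _ number=sg|person=3|tense=pres|mood=ind -1 0 _ -- _ _
3 ihn _ ihn _ PPER _ case=acc|number=sg|gender=masc|person=3 -1 6 _ OA _ _
4 ja _ ja _ ADV _ _ -1 6 _ MO _ _
5 Jelzin _ Jelzin _ NE _ case=acc|number=sg|gender=* -1 2 _ SB _ _
6 abgesägt X gesägen _ VVPP _ _ -1 2 _ OC _ _

LITERAL 0
1 Ich _ ich _ PPER _ case=nom|number=sg|gender=*|person=1 -1 2 _ SB _ _
2 musste _ mussen _ VVFIN _ number=sg|person=1|tense=pres|mood=ind -1 0 _ -- _ _
3 für _ für _ APPR _ _ -1 2 _ MO _ _
4 Schlossherren _ Schlossherr _ NN _ case=acc|number=pl|gender=masc -1 3 _ NK _ _
5 Bäume _ Baum _ NN _ case=acc|number=pl|gender=masc -1 6 _ OA _ _
6 absägen X absägen _ VVINF _ _ -1 2 _ OC _ _
7 , _ -- _ $, _ _ -1 6 _ -- _ _
8 damit _ damit _ KOUS _ _ -1 13 _ CP _ _
9 die _ der _ ART _ case=nom|number=sg|gender=fem -1 10 _ NK _ _
10 Holz _ Holz _ NN _ case=nom|number=sg|gender=neut -1 13 _ SB _ _
11 zum _ zu _ APPRART _ case=dat|number=sg|gender=neut -1 10 _ MNR _ _
12 Heizen _ heizen _ NN _ case=dat|number=sg|gender=neut -1 11 _ NK _ _
13 hatten _ haben _ VAFIN _ number=pl|person=3|tense=past|mood=ind -1 6 _ MO _ _
14 . _ -- _ $. _ _ -1 13 _ -- _ _
"""

for sentence in parse_sentences(SAMPLE):
    # Print the class, the average rating and the marked word form.
    print(sentence.label, sentence.rating, sentence.target()[1])
```

Token lines are kept as plain lists of strings because the page does not name the remaining columns; if the actual files are tab-separated, `line.split()` still works as long as no field contains whitespace.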

Reference

Maximilian Köper, Sabine Schulte im Walde (2016)
Distinguishing Literal and Non-Literal Usage of German Particle Verbs
In: Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). San Diego, CA.
