Dataset of Literal and Non-Literal Language Usage for German Particle Verbs

Type: ExperimentData
Title: Dataset of Literal and Non-Literal Language Usage for German Particle Verbs
Author: Maximilian Köper & Sabine Schulte im Walde

Description

This resource contains 6436 sentences with human annotations of (non-)literalness: 4174 literal and 2262 non-literal uses of 159 particle verbs across 10 particles. All sentences are part-of-speech-tagged and dependency-parsed. Three native speakers of German with a linguistic background annotated the sentences for their degree of (non-)literalness.

We provide one file per particle verb. Each sentence is preceded by a header line that gives the binary gold-standard class (LITERAL vs. NON-LITERAL), followed by the average rating score. Rating scores range from 0 (literal) to 5 (non-literal). The sentence itself follows with one token per line. An 'X' in the column after the word form marks the position of the particle verb; when the particle is syntactically separated from its verb, the 'X' marks the position of the base verb.

Example sentences:

NON-LITERAL 5
1 Politisch _ politisch _ ADJD _ degree=pos -1 6 _ MO _ _
2 hat _ haben _ VAFIN _ number=sg|person=3|tense=pres|mood=ind -1 0 _ -- _ _
3 ihn _ ihn _ PPER _ case=acc|number=sg|gender=masc|person=3 -1 6 _ OA _ _
4 ja _ ja _ ADV _ _ -1 6 _ MO _ _
5 Jelzin _ Jelzin _ NE _ case=acc|number=sg|gender=* -1 2 _ SB _ _
6 abgesägt X gesägen _ VVPP _ _ -1 2 _ OC _ _

LITERAL 0
1 Ich _ ich _ PPER _ case=nom|number=sg|gender=*|person=1 -1 2 _ SB _ _
2 musste _ mussen _ VVFIN _ number=sg|person=1|tense=pres|mood=ind -1 0 _ -- _ _
3 für _ für _ APPR _ _ -1 2 _ MO _ _
4 Schlossherren _ Schlossherr _ NN _ case=acc|number=pl|gender=masc -1 3 _ NK _ _
5 Bäume _ Baum _ NN _ case=acc|number=pl|gender=masc -1 6 _ OA _ _
6 absägen X absägen _ VVINF _ _ -1 2 _ OC _ _
7 , _ -- _ $, _ _ -1 6 _ -- _ _
8 damit _ damit _ KOUS _ _ -1 13 _ CP _ _
9 die _ der _ ART _ case=nom|number=sg|gender=fem -1 10 _ NK _ _
10 Holz _ Holz _ NN _ case=nom|number=sg|gender=neut -1 13 _ SB _ _
11 zum _ zu _ APPRART _ case=dat|number=sg|gender=neut -1 10 _ MNR _ _
12 Heizen _ heizen _ NN _ case=dat|number=sg|gender=neut -1 11 _ NK _ _
13 hatten _ haben _ VAFIN _ number=pl|person=3|tense=past|mood=ind -1 6 _ MO _ _
14 . _ -- _ $. _ _ -1 13 _ -- _ _
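
The layout above can be read with a few lines of Python. The following is a minimal sketch, not an official reader for this resource. It assumes that columns are separated by whitespace, that no token contains whitespace, and that the fourth, sixth, eighth, tenth and twelfth columns hold the lemma, part-of-speech tag, morphological features, head index and dependency label (a reading inferred from the example sentences, not documented on this page). The names Token, Sentence and parse_sentences are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class Token:
    index: int       # token position within the sentence (column 1)
    form: str        # word form (column 2)
    is_target: bool  # True if the column after the word form holds 'X'
    lemma: str       # column 4 (assumed: lemma)
    pos: str         # column 6 (assumed: part-of-speech tag)
    morph: str       # column 8 (assumed: morphological features)
    head: int        # column 10 (assumed: index of the syntactic head)
    deprel: str      # column 12 (assumed: dependency label)


@dataclass
class Sentence:
    label: str       # "LITERAL" or "NON-LITERAL"
    rating: float    # average rating, 0 (literal) to 5 (non-literal)
    tokens: List[Token] = field(default_factory=list)

    def target(self) -> Optional[Token]:
        """Return the first token marked with 'X', or None if there is none."""
        return next((t for t in self.tokens if t.is_target), None)


def parse_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield one Sentence per header line and its following token lines."""
    current: Optional[Sentence] = None
    for raw in lines:
        cols = raw.split()
        if not cols:
            continue  # blank lines carry no information here
        if cols[0] in ("LITERAL", "NON-LITERAL"):
            if current is not None:
                yield current
            current = Sentence(label=cols[0], rating=float(cols[1]))
        elif current is not None and len(cols) >= 12:
            current.tokens.append(Token(
                index=int(cols[0]), form=cols[1], is_target=(cols[2] == "X"),
                lemma=cols[3], pos=cols[5], morph=cols[7],
                head=int(cols[9]), deprel=cols[11],
            ))
    if current is not None:
        yield current


# First example sentence from this page, copied verbatim.
SAMPLE = """\
NON-LITERAL 5
1 Politisch _ politisch _ ADJD _ degree=pos -1 6 _ MO _ _
2 hat _ haben _ VAFIN _ number=sg|person=3|tense=pres|mood=ind -1 0 _ -- _ _
3 ihn _ ihn _ PPER _ case=acc|number=sg|gender=masc|person=3 -1 6 _ OA _ _
4 ja _ ja _ ADV _ _ -1 6 _ MO _ _
5 Jelzin _ Jelzin _ NE _ case=acc|number=sg|gender=* -1 2 _ SB _ _
6 abgesägt X gesägen _ VVPP _ _ -1 2 _ OC _ _
"""

if __name__ == "__main__":
    for sent in parse_sentences(SAMPLE.splitlines()):
        verb = sent.target()
        print(sent.label, sent.rating, verb.form if verb else None)
```

To read a whole file, an open file handle can be passed instead of SAMPLE.splitlines(), since the function only iterates over lines. If the distributed files turn out to be tab-separated with empty fields, raw.split() should be replaced by a split on the tab character.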


Reference

Maximilian Köper, Sabine Schulte im Walde
Distinguishing Literal and Non-Literal Usage of German Particle Verbs
In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics:


Download

The resources are freely available for education, research and other non-commercial purposes. To obtain the data, follow the download link or contact Maximilian Köper or Sabine Schulte im Walde.