Grammaticalization of German Prepositions - Test Set

A data collection containing 206 German prepositions classified into 4 different degrees of grammaticalization.

Type: ExperimentData
Authors: Dominik Schlechtweg, Sabine Schulte im Walde
Description

This data collection, which supplements the paper referenced below, contains:

  • a test set containing 206 German prepositions with 4 different degrees of grammaticalization as identified by Di Meola (2014). It is provided as a tab-separated file in which each line corresponds to one preposition and has the form

word_forms POS_tags degree

Different word forms for a particular preposition are separated by '/', while different words in multi-word prepositions are separated by '_'. POS-tags are separated by ','.

  • the measure predictions, provided as a tab-separated file in which each line corresponds to one preposition and has the form

word_forms entropy frequency types degree

The value '-999' means that a word form was not found in the corpus after preprocessing.
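
The two line formats described above could be read along the following lines. This is only a minimal sketch, not code distributed with the data set: the function and class names are hypothetical, the sample rows are invented for illustration, and it assumes that the files have no header row, that the degree can be kept as a plain string, and that '-999' may appear in any of the three measure columns.

```python
import csv
import io
from typing import List, NamedTuple, Optional

MISSING = -999.0  # marker for a word form not found in the corpus


class TestSetEntry(NamedTuple):
    word_forms: List[List[str]]  # each word form as a list of its words
    pos_tags: List[str]
    degree: str


class Prediction(NamedTuple):
    word_forms: str
    entropy: Optional[float]
    frequency: Optional[float]
    types: Optional[float]
    degree: str


def parse_test_set_row(row: List[str]) -> TestSetEntry:
    """Split one test-set row: '/' separates forms, '_' words, ',' POS tags."""
    forms_field, pos_field, degree = row
    forms = [form.split("_") for form in forms_field.split("/")]
    return TestSetEntry(forms, pos_field.split(","), degree)


def parse_prediction_row(row: List[str]) -> Prediction:
    """Convert one prediction row, mapping the -999 marker to None."""
    forms_field, entropy, frequency, types, degree = row

    def measure(field: str) -> Optional[float]:
        value = float(field)
        return None if value == MISSING else value

    return Prediction(forms_field, measure(entropy), measure(frequency),
                      measure(types), degree)


def read_rows(text: str) -> List[List[str]]:
    """Read tab-separated text without treating quote characters specially."""
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


# Invented sample rows, NOT taken from the actual files.
sample_test_set = "an_Stelle/anstelle\tAPPR,APPO\t2\n"
sample_predictions = "anstelle\t3.2\t150\t40\t2\nfoo_bar\t-999\t-999\t-999\t1\n"

entries = [parse_test_set_row(r) for r in read_rows(sample_test_set)]
predictions = [parse_prediction_row(r) for r in read_rows(sample_predictions)]
```

Mapping '-999' to None keeps missing word forms out of any later averages or correlations, rather than letting the marker be mistaken for a real measurement.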

Find more information in the paper referenced below.

Reference

Dominik Schlechtweg and Sabine Schulte im Walde. 2018. Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions. In Cuskley, C., Flaherty, M., Little, H., McCrohon, L., Ravignani, A. & Verhoef, T. (Eds.): The Evolution of Language: Proceedings of the 12th International Conference (EVOLANGXII).


Sabine Schulte im Walde

Prof. Dr.

Akademische Rätin (Associate Professor)

Dominik Schlechtweg

Dr.

Employee
