Grammatikalisierung deutscher Präpositionen - Test Set

Type: ExperimentData
Title: Grammaticalization of German Prepositions - Test Set
Author: Dominik Schlechtweg, Sabine Schulte im Walde

Description

This data collection supplements the paper referenced below. It contains:

  • a test set of 206 German prepositions covering 4 different degrees of grammaticalization, as identified by Di Meola (2014). It is provided as a tab-separated CSV file in which each line corresponds to one preposition and has the form

word_forms POS_tags degree

Different word forms for a particular preposition are separated by '/', while different words in multi-word prepositions are separated by '_'. POS-tags are separated by ','.

  • the measure predictions, provided as a tab-separated CSV file in which each line corresponds to one preposition and has the form

word_forms entropy frequency types degree

The value '-999' means that a word form was not found in the corpus after preprocessing.
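
The two line formats described above can be read with a few lines of Python. The following is only a minimal sketch, not code shipped with the resource: the function names and the sample rows are hypothetical, the degree is kept as a string because its encoding is not specified here, and the sketch assumes that each of the entropy, frequency and types columns holds a single number per line.

```python
MISSING = -999.0  # sentinel documented above: word form not found in the corpus


def parse_test_set(lines):
    """Parse lines of the form: word_forms <TAB> POS_tags <TAB> degree."""
    entries = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        word_forms, pos_tags, degree = fields
        entries.append({
            # '/' separates alternative word forms, '_' separates the words
            # of a multi-word preposition
            "forms": [form.split("_") for form in word_forms.split("/")],
            "pos_tags": pos_tags.split(","),
            "degree": degree,  # kept as a string (encoding is an open assumption)
        })
    return entries


def parse_measure(value):
    """Convert a measure field to float, mapping the -999 sentinel to None."""
    number = float(value)
    return None if number == MISSING else number


def parse_predictions(lines):
    """Parse lines of the form: word_forms, entropy, frequency, types, degree."""
    entries = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        word_forms, entropy, frequency, types, degree = fields
        entries.append({
            "forms": [form.split("_") for form in word_forms.split("/")],
            "entropy": parse_measure(entropy),
            "frequency": parse_measure(frequency),
            "types": parse_measure(types),
            "degree": degree,
        })
    return entries


# Invented sample rows for illustration only; they are not taken from the data.
test_rows = parse_test_set(["auf_Grund/aufgrund\tAPPR,NN\t2\n"])
pred_rows = parse_predictions([
    "wegen\t3.5\t1200\t40\t1\n",
    "xyz\t-999\t-999\t-999\t4\n",
])
```

Mapping '-999' to None keeps missing word forms from being mistaken for real measurements when the values are averaged or correlated later.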

Find more information in the paper referenced below.


Reference

Dominik Schlechtweg and Sabine Schulte im Walde. 2018. Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions. In Cuskley, C., Flaherty, M., Little, H., McCrohon, L., Ravignani, A. & Verhoef, T. (Eds.): The Evolution of Language: Proceedings of the 12th International Conference (EVOLANGXII).


Download

The resources are freely available for education, research and other non-commercial purposes.