DWUG DE Sense: Historical word sense annotations in German

A data set of historical word sense annotations in German

Type

Dataset

Author

Dominik Schlechtweg

Description

This data collection contains a subset of the DWUG DE word usage data annotated with classical word sense definitions (DWUG DE Sense, see data/*/judgments_senses.csv). From these annotations, aggregated and cleaned sense labels were derived (labels/*/labels_senses.csv). From these labels, we derived two further resources: binary semantic proximity labels between use pairs ('0' for different sense, '1' for same sense; labels/*/labels_proximity.csv) and change labels reflecting sense changes between the two time periods from which the word usages were sampled (stats/*/stats_groupings.csv).
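The derivation of binary proximity labels from sense labels can be illustrated with a minimal Python sketch. It is not the code from the WUG repository; the function name, the use identifiers and the sense names are hypothetical, and the only thing taken from the description above is the encoding ('1' for same sense, '0' for different sense).

```python
from itertools import combinations


def proximity_labels(sense_labels):
    """Map each unordered pair of uses to '1' (same sense) or '0' (different sense).

    sense_labels: dict from a (hypothetical) use identifier to its sense label.
    """
    pairs = {}
    # Sort the items so that every pair is emitted in a stable order.
    for (use1, sense1), (use2, sense2) in combinations(sorted(sense_labels.items()), 2):
        pairs[(use1, use2)] = "1" if sense1 == sense2 else "0"
    return pairs


# Toy input: three uses of one word, two of them sharing a sense.
example = {"use_a": "sense_1", "use_b": "sense_1", "use_c": "sense_2"}
labels = proximity_labels(example)
```

In this toy input, the pair (use_a, use_b) receives '1' and the two pairs involving use_c receive '0'.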

The sense labels were derived from the sense annotation by removing instances on which fewer than 2/3 annotators agree on the label (maj_2/maj_3). Note that the binary proximity labels were derived from the sense annotation and were not judged directly by humans, in contrast to other WUG data sets. Consequently, the change scores EARLIER, LATER and COMPARE were also not calculated directly from human judgments, but from the inferred binary proximity labels. The code for aggregating and cleaning the data, deriving proximity labels and deriving change labels is available in the WUG repository.
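The agreement filter can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the WUG repository: it reads maj_2 and maj_3 as requiring at least 2 or at least 3 agreeing annotators, respectively, and it discards ties between labels. The function name, the input layout and the tie handling are all hypothetical.

```python
from collections import Counter


def aggregate_sense_labels(judgments, min_agreement):
    """Keep a use only if its most frequent label reaches min_agreement votes.

    judgments: dict from a (hypothetical) use identifier to the list of
    sense labels assigned by the individual annotators.
    """
    kept = {}
    for use_id, labels in judgments.items():
        if not labels:
            continue
        ranked = Counter(labels).most_common()
        top_label, top_count = ranked[0]
        # Assumption: a tie for first place counts as no agreement.
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if top_count >= min_agreement and not tied:
            kept[use_id] = top_label
    return kept


# Toy judgments from three annotators per use.
judgments = {
    "use_a": ["sense_1", "sense_1", "sense_2"],
    "use_b": ["sense_1", "sense_2", "sense_3"],
    "use_c": ["sense_2", "sense_2", "sense_2"],
}
maj_2 = aggregate_sense_labels(judgments, min_agreement=2)
maj_3 = aggregate_sense_labels(judgments, min_agreement=3)
```

With these toy judgments, the threshold of 2 keeps use_a and use_c, while the threshold of 3 keeps only use_c.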

More information on the provided data can be found in the thesis referenced below.

Reference

Dominik Schlechtweg. 2023. Human and Computational Measurement of Lexical Semantic Change. PhD thesis. University of Stuttgart.

Download

The resource is available for download.

Dr. Dominik Schlechtweg, Junior research group leader