Fine-grained Compound Termhood Annotation Dataset
- Type
- Author
Anna Hätty, Sabine Schulte im Walde
We consider term difficulty as part of a tier model for a term's strength of association with a domain. The model aligns naturally with the idea that a term's specificity to the domain increases gradually: the more difficult or specialised a term is, the more distinct it is from general language and the more strongly it is associated with a domain. Terms that are both general and easy to understand are sometimes hard to distinguish from general-language words. Thus, the more expert knowledge is needed to understand a term, the more strongly it should be associated with a domain. In this dataset, we distinguish four tiers, according to which five human judges annotated 396 German compounds from the cooking domain.
- Reference
Anna Hätty, Sabine Schulte im Walde (2018)
Fine-grained Termhood Prediction for German Compound Terms using Neural Networks
In: Proceedings of the COLING Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG). Santa Fe, NM, USA.
- Download
Please contact the SemRel group to obtain the data.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA).

Research Group SemRel

Sabine Schulte im Walde
Prof. Dr., Akademische Rätin (Associate Professor)