Vietnamese dataset for similarity and relatedness
- Type
- ExperimentData
- Authors
- Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
-
This resource comprises two datasets. The first, ViCon, contains pairs of synonyms and antonyms across the noun, verb, and adjective classes, offering data for distinguishing between similarity and dissimilarity. The second, ViSim-400, contains semantic relation pairs with degrees of similarity across five semantic relations, as rated by human judges.
- Reference
-
Kim Anh Nguyen, Sabine Schulte im Walde and Ngoc Thang Vu. Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). New Orleans, Louisiana, June 2018.
- Download
-
The resources are freely available for education, research and other non-commercial purposes. For download, click here.

Sabine Schulte im Walde
Academic Councillor (Akademische Rätin)
