Vietnamese dataset for similarity and relatedness
- Type: ExperimentData
- Authors: Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
This resource comprises two datasets. The first, ViCon, contains pairs of synonyms and antonyms across the noun, verb, and adjective classes, providing data for distinguishing between similarity and dissimilarity. The second, ViSim-400, contains semantic relation pairs covering five semantic relations, each rated by human judges for its degree of similarity.
- Reference:
Kim-Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu (2018)
Introducing Two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness
In: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). New Orleans, LA.
Sabine Schulte im Walde
Prof. Dr., Academic Councillor (Akademische Rätin)
Thang Vu
Prof. Dr., Chair of Digital Phonetics, endowed professorship of the Carl Zeiss Foundation (Carl-Zeiss-Stiftung)