CoInCo - Concepts in Context
- Sebastian Pado
CoInCo (Concepts in Context) is a relatively large English all-words lexical substitution corpus built on the newswire and fiction genres of the freely available MASC corpus. It covers some 35K tokens of running text in which all 15.5K content words were labeled with at least 6 synonyms using crowdsourcing. Annotators could see the whole sentence as well as two sentences of discourse context.
Gerhard Kremer, Katrin Erk, Sebastian Pado, Stefan Thater: What Substitutes Tell Us – Analysis of an “All-Words” Lexical Substitution Corpus. To appear in Proceedings of EACL 2014. Gothenburg, Sweden.