The IMS HOTCoref system is a data-driven coreference resolution system. It models coreference within a document as a directed rooted tree. For learning, it adopts the idea of latent antecedents and exploits the tree structure to extract non-local features, that is, features that look beyond a single pair of mentions.
The name HOTCoref stands for Higher Order Tree Coreference. "Higher-order features" is a term often used for non-local features in the context of dependency parsing.
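To make the tree representation concrete, here is a minimal sketch in Java. It is not taken from the HOTCoref sources: the class name, the method names, and the array encoding (one antecedent index per mention, with -1 standing for the artificial root) are assumptions made for illustration only. Each mention points to exactly one earlier mention or to the root, so the arcs form a directed rooted tree, and every subtree attached directly to the root corresponds to one entity. The grandAntecedent helper shows the kind of information that becomes available beyond a single mention pair; it is a hypothetical example, not one of the features used in the paper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and encoding are assumptions, not HOTCoref code.
class CorefTreeSketch {
    // Marker for the artificial root node of the document tree.
    static final int ROOT = -1;

    // antecedent[m] is the index of the earlier mention that m attaches to, or ROOT.
    // Returns the entity clusters: one cluster per subtree hanging directly off the root.
    static List<List<Integer>> clusters(int[] antecedent) {
        int[] top = new int[antecedent.length];
        Map<Integer, List<Integer>> byTop = new LinkedHashMap<>();
        for (int m = 0; m < antecedent.length; m++) {
            int a = antecedent[m];
            if (a != ROOT && (a < 0 || a >= m)) {
                throw new IllegalArgumentException("antecedent must precede mention " + m);
            }
            // A mention attached to the root starts a new entity; otherwise it
            // inherits the entity of its antecedent.
            top[m] = (a == ROOT) ? m : top[a];
            byTop.computeIfAbsent(top[m], k -> new ArrayList<>()).add(m);
        }
        return new ArrayList<>(byTop.values());
    }

    // Hypothetical non-local lookup: the antecedent of the antecedent of m,
    // or ROOT if the chain ends earlier. This needs the tree, not just the pair.
    static int grandAntecedent(int[] antecedent, int m) {
        int a = antecedent[m];
        return (a == ROOT) ? ROOT : antecedent[a];
    }

    public static void main(String[] args) {
        // Five mentions: 1 attaches to 0, 3 attaches to 1, 4 attaches to 2.
        int[] antecedent = {ROOT, 0, ROOT, 1, 2};
        System.out.println(clusters(antecedent));           // prints [[0, 1, 3], [2, 4]]
        System.out.println(grandAntecedent(antecedent, 3)); // prints 0
    }
}
```

In this encoding the antecedents are given explicitly. In the latent-antecedent setting described above, the training data provides only the clusters, so many different trees are consistent with the same clustering and the learner has to choose among them.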
The system obtains the best results published to date on all languages from the CoNLL 2012 Shared Task. It is written entirely in Java and is thus platform independent. The download package below includes binaries, sources, and a description of how to replicate the experiments from the paper.
The system is licensed under the GNU General Public License (GPL). For questions, contact Anders Björkelund (email@example.com).
The system can be downloaded here.
Two pre-trained models for English are also available. Both were trained on the concatenation of the English training and development data.
- English local model -- train+dev-eng-fo-opt.mdl
- English non-local model (use the switch -beam 20 when applying) -- train+dev-eng-nho7-opt.mdl