HotCoref DE

The German coreference system described in the LREC 2016 paper and in the ACL paper

Type: Tool

Author: Ina Rösiger

Description

Update March 2017:

The update contains:

an improved version of the coreference resolver (download link)
a conversion tool for Linux that converts plain text into the required CoNLL-12 input format, using the same tools that were used during training (link to GitHub)

The German coreference system described in the LREC 2016 paper [1] and in the ACL paper [2] can be downloaded below.

The download includes:

a manual on how to run the resolver
default feature lists as well as an overview of the features that one can experiment with
example documents in CoNLL-12 format, including POS tags, parse bits, lemmata, morphological information and (optionally) named entities
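To get a feel for the input format before running the resolver, the following minimal Python sketch groups CoNLL-12 style lines into documents, sentences and token rows. It assumes the general layout of the CoNLL-2012 shared task (a `#begin document` header, `#end document` footer, blank lines between sentences, whitespace-separated columns). The column positions used below (word, POS tag, parse bit, and coreference in the last column) and the sample rows are assumptions for illustration; the example documents shipped with the tool are the authoritative reference.

```python
from typing import Dict, List

Sentence = List[List[str]]  # one row of column values per token

# Hypothetical column positions, following the CoNLL-2012 shared task layout.
WORD_COL, POS_COL, PARSE_COL = 3, 4, 5

# Invented sample rows for illustration only.
SAMPLE = """\
#begin document (demo); part 000
demo 0 0 Der ART (S(NP* der - - - * (0
demo 0 1 Hund NN *) Hund - - - * 0)
demo 0 2 bellt VVFIN *) bellen - - - * -

demo 0 0 Er PPER (S(NP*) er - - - * (0)
demo 0 1 schläft VVFIN *) schlafen - - - * -
#end document
"""


def read_conll12(text: str) -> Dict[str, List[Sentence]]:
    """Group CoNLL-12 style lines into documents -> sentences -> token rows."""
    docs: Dict[str, List[Sentence]] = {}
    current_doc = None
    sentence: Sentence = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#begin document"):
            current_doc = line[len("#begin document"):].strip()
            docs[current_doc] = []
            sentence = []
        elif line.startswith("#end document"):
            # Flush a sentence that was not followed by a blank line.
            if sentence and current_doc is not None:
                docs[current_doc].append(sentence)
            sentence = []
            current_doc = None
        elif not line:
            # A blank line closes the current sentence.
            if sentence and current_doc is not None:
                docs[current_doc].append(sentence)
            sentence = []
        elif current_doc is not None:
            sentence.append(line.split())
    return docs


docs = read_conll12(SAMPLE)
for name, sentences in docs.items():
    for sent in sentences:
        # Print word/POS pairs and the coreference column (last column).
        print(name, [(row[WORD_COL], row[POS_COL], row[-1]) for row in sent])
```

A quick check like this can reveal missing columns or unbalanced document markers in converted files before they reach the resolver.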

Note: Among other things, the resolver relies on extracting NPs from the parse bits. Some parsers for German do not annotate NPs inside PPs (i.e. the PPs are flat), so you need to insert these NPs before running the tool.
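One possible way to insert the missing NPs is sketched below in Python. This is not the tool's own preprocessing, only an illustration under stated assumptions: it works on a bracketed tree string rather than on the parse-bit column directly, it assumes the node labels `PP` and `NP` and the preposition tags `APPR` and `APPRART`, and it only handles prepositions at the start of the PP. Adjust the labels to whatever your parser emits.

```python
from typing import List, Sequence, Union

Node = List  # [label, children]; children are Nodes or plain word strings


def parse_tree(s: str) -> Node:
    """Parse a bracketed tree such as "(PP (APPR in) (NN Haus))"."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node() -> Node:
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children: List[Union[Node, str]] = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return [label, children]

    return node()


def insert_np(node: Node, pp: str = "PP", np: str = "NP",
              prep_tags: Sequence[str] = ("APPR", "APPRART")) -> Node:
    """Wrap the non-preposition children of flat PPs in a new NP node."""
    label, children = node
    children = [insert_np(c, pp, np, prep_tags) if isinstance(c, list) else c
                for c in children]
    has_np = any(isinstance(c, list) and c[0] == np for c in children)
    if label == pp and not has_np:
        # Count the leading preposition children.
        i = 0
        while (i < len(children) and isinstance(children[i], list)
               and children[i][0] in prep_tags):
            i += 1
        if 0 < i < len(children):
            children = children[:i] + [[np, children[i:]]]
    return [label, children]


def to_string(node: Node) -> str:
    """Serialize a tree back into bracketed notation."""
    label, children = node
    parts = [to_string(c) if isinstance(c, list) else c for c in children]
    return "(" + label + " " + " ".join(parts) + ")"


flat = "(S (NP (PPER Er)) (VVFIN wohnt) (PP (APPR in) (ART dem) (NN Haus)))"
fixed = to_string(insert_np(parse_tree(flat)))
print(fixed)
```

PPs that already contain an NP child are left unchanged. Postpositions, coordinated PPs and words that themselves contain brackets are not handled here and would need extra care.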

Here's a manual on how to run the resolver

Pre-trained models

New model:

trained on the complete TüBa-D/Z version 10 data, regular processing, with the improved version of the coreference resolver: available here

Older models: (trained with LREC version)

trained on the complete TüBa-D/Z version 10 data, gold processing: available here
trained on the complete TüBa-D/Z version 10 data, regular processing: available here
trained on the complete TüBa-D/Z version 9 data, regular processing: available here

An older version of the tool (as published in the LREC 2016 paper [1]) can be downloaded here.
The version published in ACL 2015 [2] can be found here.

CoNLL scores as published in [1]:

65.76 (no singletons) on the TüBa-D/Z test set version 10, using gold annotations
48.54 (including singletons) on the TüBa-D/Z test set version 10, using regular annotations

The older performance (as reported in the paper [2], using real preprocessing/predicted annotations only and no gold mention boundary (GB) information) is as follows:

51.61 (no singletons) on the TüBa-D/Z test set version 9
60.35 (including singletons) and 48.61 (no singletons) on the TüBa-D/Z test set version 8 (the SemEval dataset), both as CoNLL scores
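For readers unfamiliar with the metric: in the CoNLL-2011/2012 shared tasks, the CoNLL score is the unweighted mean of the MUC, B³ and CEAF-e F1 scores. The Python sketch below shows that arithmetic, together with a simple way to drop singleton entities (entities with a single mention) before scoring. The function names and the input values are hypothetical and are not taken from the papers; published results should be reproduced with the official scorer.

```python
from typing import List, Sequence


def conll_score(muc_f1: float, b_cubed_f1: float, ceafe_f1: float) -> float:
    """Unweighted mean of the three F1 scores, as in the CoNLL shared tasks."""
    return (muc_f1 + b_cubed_f1 + ceafe_f1) / 3.0


def drop_singletons(clusters: Sequence[Sequence[str]]) -> List[List[str]]:
    """Keep only entities that have at least two mentions."""
    return [list(c) for c in clusters if len(c) > 1]


# Hypothetical F1 values, for illustration only.
print(conll_score(60.0, 50.0, 40.0))

# Hypothetical mention clusters; "m3" forms a singleton entity.
print(drop_singletons([["m1", "m2"], ["m3"], ["m4", "m5", "m6"]]))
```

Whether singletons are kept affects the scores considerably, which is why the figures above are reported separately for both settings.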

This version of the system is licensed under the GNU General Public License. For questions contact Anders Björkelund (firstname@ims.uni-stuttgart.de).

References

[1] Ina Rösiger and Jonas Kuhn
IMS HotCoref De: A data-driven co-reference resolver for German
Proceedings of LREC 2016, Portorož, Slovenia 2016.

[2] Ina Rösiger and Arndt Riester
Using prosodic annotations to improve coreference resolution of spoken text.
Proceedings of ACL-IJCNLP 2015, Beijing, China.

Download

Download the German co-reference system and the manual on how to run the resolver
Download the older German co-reference system as published in the LREC 2016 paper [1] and the manual on how to run the resolver
Download the older German prototype as published in ACL 2015 [2] and the manual on how to run the resolver

General Contact IMS

Pfaffenwaldring 5 b, 70569 Stuttgart
