GerDraCor-Coref - German Drama Corpus for Coreference

A corpus with coreference annotations on German dramatic texts

Janis Pagel, Nils Reiter


The corpus contains about 40 acts from 30 German dramas, annotated with coreference information. The texts were written between 1730 and 1920. The data is available in three formats: CoNLL 2012, TEI, and XMI. The CoNLL data is enriched with automatically generated linguistic information, such as part-of-speech tags and lemmas. The TEI data contains structural information on act and scene boundaries and on speaker turns.
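As a rough illustration of how the coreference layer in CoNLL 2012 style data can be read, the following Python sketch collects mention spans per coreference chain. It assumes the usual CoNLL 2012 conventions: whitespace-separated columns, comment lines starting with `#`, and the coreference annotation in the last column, written as `(3`, `3)`, `(3)` or `-`, with several entries joined by `|`. The exact column layout of the released files is not described here, so that layout is an assumption. The function name `read_coref_chains` and the sample rows are invented for illustration and are not taken from the corpus.

```python
import re
from collections import defaultdict


def read_coref_chains(lines):
    """Collect mention spans per chain from CoNLL-2012-style lines.

    Assumes the coreference annotation sits in the LAST column.
    Returns {chain_id: [(start, end), ...]}, where start and end are
    0-based token indices counted across the whole input.
    """
    chains = defaultdict(list)
    open_starts = defaultdict(list)  # chain id -> stack of open mention starts
    token_index = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and "#begin/#end document" lines
        coref = line.split()[-1]
        if coref != "-":
            for part in coref.split("|"):
                match = re.fullmatch(r"(\()?(\d+)(\))?", part)
                if match is None:
                    continue  # ignore entries that do not fit the assumed notation
                chain_id = int(match.group(2))
                if match.group(1):  # "(" opens a mention at this token
                    open_starts[chain_id].append(token_index)
                if match.group(3) and open_starts[chain_id]:  # ")" closes it
                    start = open_starts[chain_id].pop()
                    chains[chain_id].append((start, token_index))
        token_index += 1
    return dict(chains)


# Invented sample rows with a reduced set of columns (not corpus data).
sample = """#begin document (example); part 000
example 0 0 Der ART (0
example 0 1 Prinz NN 0)
example 0 2 sieht VVFIN -
example 0 3 ihn PPER (1)
example 0 4 . $. -

example 0 0 Er PPER (0)
example 0 1 geht VVFIN -
#end document
""".splitlines()

chains = read_coref_chains(sample)
for chain_id, spans in sorted(chains.items()):
    print(chain_id, spans)
```

In the sample, chain 0 groups the two-token mention "Der Prinz" with the later pronoun "Er", while chain 1 contains only "ihn". Token indices run across sentence boundaries here to keep the sketch short; per-sentence indexing would need an additional counter.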


Janis Pagel, Nils Reiter. GerDraCor-Coref: A Coreference Corpus for Dramatic Texts in German. In Proceedings of the Language Resources and Evaluation Conference (LREC), pp. 55-64, Marseille, France, May 2020. Url:

Janis Pagel


External PhD Student

Nils Reiter


Former Staff
