GerDraCor-Coref - German Drama Corpus for Coreference

A corpus with coreference annotations on German dramatic texts

Type

Corpus

Author

Janis Pagel, Nils Reiter

Description

The corpus contains approximately 40 acts from 30 German dramas, annotated with coreference information. The texts were written between 1730 and 1920. The data is available in three formats: CoNLL 2012, TEI, and XMI. The CoNLL data is enriched with automatically generated linguistic information, such as part-of-speech tags and lemmas. The TEI data contains structural information about act and scene boundaries and speaker turns.
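As a rough illustration of how the CoNLL 2012 coreference annotation can be consumed, the following Python sketch collects mention spans per entity from the last column of each token line, using the usual bracket notation (`(3` opens a mention of entity 3, `3)` closes it, `(3)` is a single-token mention, `-` means no annotation). The function name `read_coref_chains` and the sample lines are hypothetical and use a simplified column layout; the actual files in this corpus may have more columns, so only the last-column convention is relied on here.

```python
import re
from collections import defaultdict


def read_coref_chains(lines):
    """Collect coreference mentions per entity ID from CoNLL-2012-style lines.

    Returns a dict mapping entity ID -> list of (start, end) token indices,
    inclusive, counted over all token lines (comments and blanks are skipped).
    """
    chains = defaultdict(list)
    open_mentions = defaultdict(list)  # entity ID -> stack of start indices
    token_index = 0
    for line in lines:
        line = line.strip()
        # Skip sentence separators and "#begin document" / "#end document".
        if not line or line.startswith("#"):
            continue
        coref = line.split()[-1]  # coreference info sits in the last column
        if coref != "-":
            for part in coref.split("|"):
                match = re.fullmatch(r"(\()?(\d+)(\))?", part)
                if match is None:
                    continue  # ignore anything that is not bracket notation
                opens, entity, closes = match.groups()
                entity = int(entity)
                if opens:
                    open_mentions[entity].append(token_index)
                if closes:
                    start = open_mentions[entity].pop()
                    chains[entity].append((start, token_index))
        token_index += 1
    return dict(chains)


# Invented sample with a simplified column layout (assumption, not corpus data).
sample = [
    "#begin document (demo); part 000",
    "demo 0 0 Der ART (0",
    "demo 0 1 Prinz NN 0)",
    "demo 0 2 liebt VVFIN -",
    "demo 0 3 sie PPER (1)",
    "demo 0 4 . $. -",
    "",
    "demo 0 0 Er PPER (0)",
    "#end document",
]
chains = read_coref_chains(sample)
print(chains)
```

Token indices here run across the whole input rather than restarting per sentence, which keeps the sketch short; a real reader would likely track document and sentence boundaries as well.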

Reference

Janis Pagel, Nils Reiter. GerDraCor-Coref: A Coreference Corpus for Dramatic Texts in German. In Proceedings of the Language Resources and Evaluation Conference (LREC), pp. 55-64, Marseille, France, May 2020. URL: http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.7.pdf.

Download

GitHub
