DIRNDL, short for (D)iscourse (I)nformation (R)adio (N)ews (D)atabase for (L)inguistic Analysis, is a corpus resource based on German radio news broadcast hourly. The textual version of the news is annotated with syntactic information, and the syntactic phrases are additionally labeled with information status categories (given vs. new information). The speech version is prosodically annotated with pitch accents and prosodic phrase boundaries. Because the textual and speech versions deviate slightly from each other due to slips of the tongue, fillers and minor modifications, the two versions were linked semi-automatically and the resulting links were stored in the database. Through these links, all annotation layers can be accessed together to explore the relations between prosody, syntax and information status. The corpus contains several repetitions of the same news items, each read with slight changes in prosody.
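
To illustrate the kind of token-level linking described above, here is a minimal sketch in Python. It is not the procedure used to build DIRNDL, which was semi-automatic and is not specified here. It only shows how a written token sequence and a spoken one containing a filler might be aligned with the standard library's `difflib.SequenceMatcher`. The function name `link_tokens` and the example sentence are hypothetical and do not come from the corpus.

```python
from difflib import SequenceMatcher


def link_tokens(text_tokens, speech_tokens):
    """Return (text_index, speech_index) pairs for tokens found in both versions.

    Hypothetical helper for illustration only; not part of the DIRNDL tooling.
    """
    matcher = SequenceMatcher(
        a=[t.lower() for t in text_tokens],
        b=[t.lower() for t in speech_tokens],
        autojunk=False,  # keep frequent tokens such as articles alignable
    )
    links = []
    for block in matcher.get_matching_blocks():
        # Each block is a run of `size` identical tokens starting at a / b.
        for offset in range(block.size):
            links.append((block.a + offset, block.b + offset))
    return links


# Invented example: the spoken version contains the filler "äh".
text = "Die Bundesregierung hat heute neue Maßnahmen beschlossen".split()
speech = "Die Bundesregierung hat äh heute neue Maßnahmen beschlossen".split()

links = link_tokens(text, speech)

# Speech tokens without a counterpart in the text (here, the filler).
linked_speech = {s for _, s in links}
unlinked_speech = [i for i in range(len(speech)) if i not in linked_speech]
```

With links of this kind, an annotation attached to a text token (for example a syntactic phrase) can be looked up alongside the prosodic annotation of the corresponding speech token, while unlinked tokens such as fillers are left for manual inspection.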


The DIRNDL_anaphora corpus from Björkelund et al. (2014) can be downloaded here.
The DIRNDL version described in Rösiger and Riester (2015) can be downloaded here.