Europarl Nominal Compound Database

The Europarl Nominal Compound Database (ENCD) was automatically extracted from Europarl v7 of OPUS. This database contains English nominal compounds and their equivalents in up to nine languages.

Type
Corpus
Author
Patrick Ziering
Description

The Europarl Nominal Compound Database (ENCD) was automatically extracted from Europarl v7 of OPUS (Tiedemann, 2012 [2]).

This database contains English nominal compounds and their equivalents in up to nine languages:

  • Danish
  • Dutch
  • English (pivot)
  • French
  • German
  • Greek
  • Italian
  • Portuguese
  • Romanian
  • Spanish
  • Swedish

We provide several versions of the database, ranging from optimal recall (CCR0) to optimal precision (CCR4).

Keywords: noun compound, compound noun, multilingual, cross-lingual, multi-word expression, database, list, resource, dataset

References

[1] Patrick Ziering and Lonneke van der Plas
What good are 'Nominalkomposita' for 'noun compounds': Multilingual Extraction and Structure Analysis of Nominal Compositions using Linguistic Restrictors.
Proceedings of the 25th International Conference on Computational Linguistics (COLING), 2014.

[2] Jörg Tiedemann
Parallel data, tools and interfaces in OPUS.
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC), 2012.
