Experimental Settings for "Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing", LREC 2014

 

Data

To train Sak's (2008) morphological disambiguator, we combined the output of the morphological analyser (Oflazer, 1994) with the detached version of the Turkish Treebank. You can download the training data from here.

Parser Settings

We used Bohnet's (2010) graph-based parser in our experiments. The input to the parser must be in the CoNLL09 format, which has only one POS field, so we used only the fine-grained POS tags in the experiments. The model trained on the Turkish Treebank is given here. To parse with this model, use the command below:

java -Xmx24g -cp anna-660.jar is2.parser52LX2CS.Parser -model turtb_detached_train.depmodel -test inFile -out parsedFile
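
For readers whose data is in the 10-column CoNLL-X layout (with separate coarse and fine POS columns), the following is a minimal sketch of how a token line could be mapped to the 14-column CoNLL09 layout while keeping only the fine-grained tag. It is not the conversion script used in the experiments. The function name conllx_to_conll09 is hypothetical, the CoNLL-X input layout is an assumption, and copying each value into both the gold and the predicted column is a simplification made here because this page does not state which columns the parser reads.

```python
def conllx_to_conll09(line):
    """Map one CoNLL-X token line to a CoNLL09 token line (hypothetical helper).

    Assumed input columns (CoNLL-X):
        ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL PHEAD PDEPREL
    Output columns (CoNLL09):
        ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD
        DEPREL PDEPREL FILLPRED PRED
    """
    line = line.rstrip("\n")
    if not line.strip():
        # A blank line marks a sentence boundary in both formats.
        return ""

    cols = line.split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 tab-separated columns: %r" % line)

    # The coarse tag (CPOSTAG) is dropped; only the fine-grained tag is kept.
    tok_id, form, lemma, _coarse_pos, fine_pos, feats, head, deprel = cols[:8]

    return "\t".join([
        tok_id, form,
        lemma, lemma,        # LEMMA, PLEMMA
        fine_pos, fine_pos,  # POS, PPOS (assumption: same tag in both)
        feats, feats,        # FEAT, PFEAT
        head, head,          # HEAD, PHEAD
        deprel, deprel,      # DEPREL, PDEPREL
        "_", "_",            # FILLPRED, PRED (unused for dependency parsing)
    ])
```

Applying this function to every line of a CoNLL-X file and writing the results out line by line would give a file that can be passed as inFile in the command above, provided the assumptions listed here hold for your data.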