Experimental Settings for "Towards Joint Morphological Analysis and Dependency Parsing for Turkish", DepLing 2013

 

Data

We used the detached version of the Turkish treebank. The original data has both coarse- and fine-grained POS tags, but the parsers used in the experiments expect the CoNLL09 format, which has only one POS field. We therefore used only the fine-grained tags in the experiments.
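The conversion itself is not described here, so the following is only a minimal sketch of how a CoNLL06 (CoNLL-X) token line could be mapped to CoNLL09 while keeping only the fine-grained tag. It assumes the standard column order of both formats. The function name `conllx_to_conll09`, the choice to copy the fine-grained tag into both POS and PPOS, and the choice to leave PHEAD and PDEPREL empty are assumptions made for illustration, not the procedure used in the experiments.

```python
# Sketch: CoNLL-X (CoNLL06) token line -> CoNLL09 token line, fine-grained POS only.
# Assumed CoNLL-X columns:
#   ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL PHEAD PDEPREL
# Assumed CoNLL09 columns (first 14):
#   ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD DEPREL PDEPREL FILLPRED PRED


def conllx_to_conll09(line: str) -> str:
    """Convert one token line; blank lines (sentence boundaries) become ''."""
    line = line.rstrip("\n")
    if not line.strip():
        return ""
    cols = line.split("\t")
    if len(cols) < 8:
        raise ValueError(f"expected at least 8 columns, got {len(cols)}")
    tok_id, form, lemma, _coarse_pos, fine_pos, feats, head, deprel = cols[:8]
    # The coarse tag is dropped; the fine tag fills POS and PPOS (an assumption).
    out = [
        tok_id, form,
        lemma, lemma,        # LEMMA, PLEMMA
        fine_pos, fine_pos,  # POS, PPOS
        feats, feats,        # FEAT, PFEAT
        head, "_",           # HEAD, PHEAD (left empty)
        deprel, "_",         # DEPREL, PDEPREL (left empty)
        "_", "_",            # FILLPRED, PRED (unused for dependency parsing)
    ]
    return "\t".join(out)


def convert_file(src_path: str, dst_path: str) -> None:
    """Convert a whole CoNLL-X file, preserving sentence boundaries."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(conllx_to_conll09(line) + "\n")
```

Calling `convert_file` with an input and an output path (both hypothetical here) would produce a file with one POS value per token, which could then be passed as trainFile or inFile.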

Parser Settings

The graph-based parser:

java -Xmx24g -cp anna-660.jar is2.parser52LX2CS.Parser -train trainFile -model modelFile -test inFile -out parsedFile -eval goldFile

The joint parser:

java -Xmx24g -cp anna-660.jar is2.transitionR5ysp7b.Parser -train trainFile -test inFile -out parsedFile -model modelFile -eval goldFile -i 25 -hsize 400000001 -cores 12 -beam 80 -2nd abcd -3rd abc -1st a -tsize 2 -tnumber 10 -ti 10 -x train:test -thsize 90000001 -tthreshold 0.3 -tx 3 -decoder nv -half 4 -tt 25

The parameter descriptions for the joint parser can be found here.

Evaluation

We used the IG-based (inflectional group) evaluator provided by Gülşen Eryiğit. The files to be compared must be in CoNLL06 format.

java -cp . Parsing_RawdataConll_Evaluator goldFile inFile
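Since the parsers work on CoNLL09 while the evaluator reads CoNLL06, the parser output has to be mapped back before scoring. How this was done is not stated here; the sketch below shows one possible mapping, assuming the standard column order of both formats and assuming that the predictions sit in the PPOS, PFEAT, PHEAD and PDEPREL columns of the parsed file. The function name `conll09_to_conllx` and the `use_predicted` switch are hypothetical.

```python
# Sketch: CoNLL09 token line -> CoNLL-X (CoNLL06) token line for evaluation.
# Assumed CoNLL09 columns (first 12):
#   ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD DEPREL PDEPREL
# Assumed CoNLL-X columns:
#   ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL PHEAD PDEPREL


def conll09_to_conllx(line: str, use_predicted: bool = True) -> str:
    """Convert one token line; blank lines (sentence boundaries) become ''."""
    line = line.rstrip("\n")
    if not line.strip():
        return ""
    cols = line.split("\t")
    if len(cols) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(cols)}")
    (tok_id, form, lemma, _plemma, pos, ppos,
     feat, pfeat, head, phead, deprel, pdeprel) = cols[:12]
    if use_predicted:
        # Take the parser's predictions (assumed to be in the P-columns).
        pos, feat, head, deprel = ppos, pfeat, phead, pdeprel
    # Only one POS value is available, so it fills both CPOSTAG and POSTAG.
    return "\t".join([tok_id, form, lemma, pos, pos, feat, head, deprel, "_", "_"])
```

With `use_predicted=False` the same function extracts the gold columns, which gives a goldFile and an inFile with identical layout.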

For the upper bounds with 100% MWE, we set the following parameter within the code:

acceptMWElabelscorrect=true;