Experimental Settings for "Towards Joint Morphological Analysis and Dependency Parsing for Turkish", DepLing 2013
Data
We used the detached version of the Turkish treebank. The original data has both coarse- and fine-grained POS tags, but the parsers used in the experiments expect the CoNLL09 format, where there is only one POS field. We therefore used only the fine-grained tags in the experiments.
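The exact conversion script is not given here. As a minimal sketch of the column mapping, assuming the treebank is available in the tab-separated CoNLL06 (CoNLL-X) layout, the function below moves the fine-grained tag (POSTAG) into the CoNLL09 POS field and drops the coarse-grained one. The function name is hypothetical, and copying the gold values into the predicted columns (PLEMMA, PPOS, PFEAT, PHEAD, PDEPREL) is an assumption of this sketch, not something stated above.

```python
# CoNLL06 columns: ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL PHEAD PDEPREL
# CoNLL09 columns: ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD
#                  DEPREL PDEPREL FILLPRED PRED

def conll06_to_conll09(line):
    """Convert one CoNLL06 token line to a CoNLL09 token line.

    Only the fine-grained tag (POSTAG) is kept; the coarse-grained
    tag (CPOSTAG) is discarded.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 tab-separated columns: %r" % line)
    tok_id, form, lemma, _coarse, fine, feats, head, deprel = cols[:8]
    # Assumption: predicted columns are filled with the gold values.
    out = [tok_id, form, lemma, lemma, fine, fine, feats, feats,
           head, head, deprel, deprel, "_", "_"]
    return "\t".join(out)


def convert_lines(lines):
    """Convert a whole file; blank lines (sentence boundaries) are kept."""
    for line in lines:
        yield conll06_to_conll09(line) if line.strip() else ""
```

A made-up row such as `1<TAB>w<TAB>w<TAB>COARSE<TAB>FINE<TAB>_<TAB>2<TAB>SUBJECT<TAB>_<TAB>_` comes out with `FINE` in both the POS and PPOS columns.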
Parser Settings
The graph-based parser:
java -Xmx24g -cp anna-660.jar is2.parser52LX2CS.Parser -train trainFile -model modelFile -test inFile -out parsedFile -eval goldFile
The joint parser:
java -Xmx24g -cp anna-660.jar is2.transitionR5ysp7b.Parser -train trainFile -test inFile -out parsedFile -model modelFile -eval goldFile -i 25 -hsize 400000001 -cores 12 -beam 80 -2nd abcd -3rd abc -1st a -tsize 2 -tnumber 10 -ti 10 -x train:test -thsize 90000001 -tthreshold 0.3 -tx 3 -decoder nv -half 4 -tt 25
The parameter descriptions for the joint parser can be found here.
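For repeated runs it can help to assemble the joint-parser command in a script. The sketch below only rebuilds the command line shown above from a list of flag/value pairs; the helper name and its arguments are hypothetical, and no flags beyond those listed above are introduced.

```python
import shlex

# Flag/value pairs copied verbatim from the joint-parser command above.
JOINT_PARSER_OPTIONS = [
    ("-i", "25"), ("-hsize", "400000001"), ("-cores", "12"),
    ("-beam", "80"), ("-2nd", "abcd"), ("-3rd", "abc"), ("-1st", "a"),
    ("-tsize", "2"), ("-tnumber", "10"), ("-ti", "10"),
    ("-x", "train:test"), ("-thsize", "90000001"),
    ("-tthreshold", "0.3"), ("-tx", "3"), ("-decoder", "nv"),
    ("-half", "4"), ("-tt", "25"),
]


def joint_parser_command(train_file, test_file, out_file, model_file,
                         gold_file, jar="anna-660.jar", heap="24g"):
    """Return the joint-parser invocation as an argument list."""
    cmd = ["java", "-Xmx" + heap, "-cp", jar,
           "is2.transitionR5ysp7b.Parser",
           "-train", train_file, "-test", test_file,
           "-out", out_file, "-model", model_file, "-eval", gold_file]
    for flag, value in JOINT_PARSER_OPTIONS:
        cmd.extend([flag, value])
    return cmd


def as_shell_string(cmd):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting problems with file paths that contain spaces.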
Evaluation
We used an IG-based (inflectional-group-based) evaluator from Gülşen Eryiğit. The files to be compared must be in CoNLL06 format.
java -cp . Parsing_RawdataConll_Evaluator goldFile inFile
For the upper bounds with 100% MWE, we set the following parameter in the evaluator's source code:
acceptMWElabelscorrect=true;
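The evaluator's internals are not described here. As a rough illustration of what comparing two CoNLL06 files involves, the sketch below computes plain unlabeled and labeled attachment scores from the HEAD and DEPREL columns. It is not Eryiğit's evaluator: the IG-specific handling and the MWE option are not reproduced, and the function names are hypothetical.

```python
def read_conll06(lines):
    """Yield each sentence as a list of (HEAD, DEPREL) pairs."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():          # a blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        cols = line.split("\t")
        sentence.append((cols[6], cols[7]))   # HEAD, DEPREL
    if sentence:
        yield sentence


def attachment_scores(gold_lines, pred_lines):
    """Return (unlabeled, labeled) attachment scores as fractions."""
    gold = list(read_conll06(gold_lines))
    pred = list(read_conll06(pred_lines))
    if len(gold) != len(pred):
        raise ValueError("files contain different numbers of sentences")
    total = unlabeled = labeled = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("sentence lengths differ")
        for (g_head, g_rel), (p_head, p_rel) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                unlabeled += 1
                if g_rel == p_rel:
                    labeled += 1
    if total == 0:
        raise ValueError("no tokens to score")
    return unlabeled / total, labeled / total
```

Both arguments can be open file objects, for example `attachment_scores(open(gold_path), open(pred_path))` with paths of your choosing.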