=Paper=
{{Paper
|id=Vol-1749/paper30
|storemode=property
|title=Comparing State-of-the-art Dependency Parsers on the Italian Stanford Dependency Treebank
|pdfUrl=https://ceur-ws.org/Vol-1749/paper30.pdf
|volume=Vol-1749
|authors=Alberto Lavelli
|dblpUrl=https://dblp.org/rec/conf/clic-it/Lavelli16
}}
==Comparing State-of-the-art Dependency Parsers on the Italian Stanford Dependency Treebank==
Alberto Lavelli
FBK-irst, via Sommarive, 18 - Povo, I-38123 Trento (TN) - ITALY
lavelli@fbk.eu

Abstract

English. In the last decade, many accurate dependency parsers have been made publicly available. It can be difficult for non-experts to select a good off-the-shelf parser among those available. This is even more true when working on languages different from English, because parsers have been tested mainly on English treebanks. Our analysis is focused on Italian and relies on the Italian Stanford Dependency Treebank (ISDT). This work is a contribution to help non-experts understand how difficult it is to apply a specific dependency parser to a new language/treebank and choose a parser that meets their needs.

Italiano. In the last decade, many dependency parsers have been made available. For non-experts it can be difficult to choose a ready-to-use one among those available, all the more so when working on languages other than English, because parsers have been applied mostly to English treebanks. Our analysis is devoted to Italian and is based on the Italian Stanford Dependency Treebank (ISDT). This article is a contribution to help non-experts understand how difficult it is to apply a parser to a new language/treebank and to choose the most suitable one.

1 Introduction

In the last decade, there has been an increasing interest in dependency parsing, witnessed by the organisation of various shared tasks, e.g. Buchholz and Marsi (2006), Nivre et al. (2007), Seddah et al. (2013), Seddah et al. (2014). Concerning Italian, there have been tasks on dependency parsing in the first four editions of the EVALITA evaluation campaign (Bosco et al., 2008; Bosco et al., 2009; Bosco and Mazzei, 2011; Bosco et al., 2014). In the 2014 edition, the task on dependency parsing exploited the Italian Stanford Dependency Treebank (ISDT), a treebank featuring an annotation based on Stanford Dependencies (de Marneffe and Manning, 2008).

This paper is a follow-up of Lavelli (2014b) and reports the experience of applying an updated list of state-of-the-art dependency parsers to ISDT. It can be difficult for non-experts to select a good off-the-shelf parser among those available. This is even more true when working on languages different from English, given that parsers have been tested mainly on English treebanks (and in particular on the WSJ portion of the Penn Treebank). This work is a contribution to help practitioners understand how difficult it is to apply a specific dependency parser to a new language/treebank and choose a parser that optimizes their desired speed/accuracy trade-off.

As in many other NLP fields, there are very few comparative articles in which different parsers are directly run by the authors and their performance compared (Daelemans and Hoste, 2002; Hoste et al., 2002; Daelemans et al., 2003). Most papers simply present the results of a newly proposed approach and compare them with the results reported in previous articles. In other cases, papers are devoted to the application of the same tool to different languages/treebanks. A notable exception is the study reported in Choi et al. (2015), where the authors present a comparative analysis of ten leading statistical dependency parsers on a multi-genre corpus of English.

It is important to stress that the comparison presented in this paper concerns tools used more or less out of the box, and that the results cannot be used to compare specific characteristics such as parsing algorithms, learning systems, ...
2 Parsers

The choice of the parsers used in this study started from the ones already applied in a previous study (Lavelli, 2014b), i.e. MaltParser, the MATE dependency parsers, TurboParser, and ZPar. We then identified a few other freely available dependency parsers that have shown state-of-the-art performance. Some of these parsers are included in the study in Choi et al. (2015) and others have been made publicly available more recently. The additional parsers included in this paper are DeSR, the Stanford Neural Network dependency parser, EmoryNLP, RBG, YARA Parser, and LSTM parser.

Differently from what was done in the previous study, this time we have not included approaches based on the combination of parsers' results, such as ensemble or stacking. They usually obtain top performance (see e.g. Attardi and Simi (2014) at EVALITA 2014), but in this case we focus on simplicity and ease of use rather than on absolute performance. Below you may find short descriptions of the parsers reported in the paper.

MaltParser (Nivre et al., 2006) (version 1.8) implements the transition-based approach to dependency parsing, which has two essential components: (i) a nondeterministic transition system for mapping sentences to dependency trees; (ii) a classifier that predicts the next transition for every possible system configuration. MaltParser includes different built-in transition systems, different classifiers, and techniques for recovering non-projective dependencies with strictly projective parsers.

The MATE tools (https://code.google.com/p/mate-tools/) include both a graph-based parser (Bohnet, 2010) and a transition-based parser (Bohnet and Nivre, 2012; Bohnet and Kuhn, 2012). For the languages of the 2009 CoNLL Shared Task, the graph-based MATE parser reached accuracy scores similar to or above the top performing systems with fast processing (obtained with the use of Hash Kernels and parallel algorithms). The transition-based MATE parser takes into account complete structures as they become available in order to rescore the elements of a beam, combining the advantages of transition-based and graph-based approaches.

TurboParser (Martins et al., 2013) (version 2.3, http://www.ark.cs.cmu.edu/TurboParser/) is a C++ package that implements non-projective graph-based dependency parsing exploiting third-order features. The approach uses AD3, an accelerated dual decomposition algorithm extended to handle specialized head automata and sequential head bigram models.

ZPar (Zhang and Nivre, 2011) (version 0.75) is a transition-based parser implemented in C++. ZPar supports multiple languages and multiple grammar formalisms. ZPar has been most heavily developed for Chinese and English, while it provides generic support for other languages. It leverages a global discriminative training and beam-search framework.

DeSR (Attardi and Dell'Orletta, 2009) (version 1.4.3) is a shift-reduce dependency parser which uses a variant of the approach of Yamada and Matsumoto (2003). It is capable of dealing directly with non-projective parsing by means of specific non-projective transition rules (Attardi, 2006). It is highly configurable: one can choose which classifier (e.g. SVM or Multi-Layer Perceptron) and which feature templates to use, and the format of the input, just by editing a configuration file.

EmoryNLP (Choi and McCallum, 2013) (previously ClearNLP; version 1.1.1, http://nlp.mathcs.emory.edu/) uses a transition-based, non-projective parsing algorithm showing linear-time speed for both projective and non-projective parsing.

The Stanford neural network dependency parser (Chen and Manning, 2014) (http://nlp.stanford.edu/software/nndep.shtml) is a transition-based parser which produces typed dependency parses using a neural network that uses word embeddings as features besides forms and POS tags. It uses no beam.

RBG (Lei et al., 2014; Zhang et al., 2014b; Zhang et al., 2014a) (https://github.com/taolei87/RBGParser) is based on a low-rank factorization method that makes it possible to map high-dimensional feature vectors into low-dimensional representations. The method maintains the parameters as a low-rank tensor to obtain low-dimensional representations of words in their syntactic roles, and to leverage modularity in the tensor for easy training with online algorithms.

YARA Parser (Rasooli and Tetreault, 2015) (https://github.com/yahoo/YaraParser) is an implementation of the arc-eager dependency model. It uses an averaged structured perceptron as classifier and a beam size of 64. The feature setting is from Zhang and Nivre (2011), with additional Brown cluster features.

LSTM parser (Dyer et al., 2015; Ballesteros et al., 2015) (https://github.com/clab/lstm-parser) is a transition-based dependency parser with state embeddings computed by LSTM RNNs and an alternative char-based model exploiting character embeddings as features. Both models are applied in the experiments.

The list of parsers is still in progress because the field of dependency parsing is in constant evolution. In mid-May, SyntaxNet, the dependency parser by Google, was made publicly available; a few days later BIST parser (which claims to be "a faster and more accurate parser than Google's McParseface") was announced to become public. SyntaxNet (Andor et al., 2016) (https://github.com/tensorflow/models/tree/master/syntaxnet), BIST parser (Kiperwasser and Goldberg, 2016) (https://github.com/elikip/bist-parser), and spaCy (https://spacy.io/, https://github.com/spacy-io/spaCy) are not yet included in our study because we are still trying to make them work in a satisfactory way.
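Most of the tools above (MaltParser, ZPar, DeSR, EmoryNLP, the Stanford neural network parser, YARA Parser, and the LSTM parser) follow the transition-based paradigm summarised in the MaltParser description: a transition system maps sentences to dependency trees and a classifier predicts the next transition. As a rough illustration of that mechanism only, the following Python sketch implements the arc-standard transition system with a trivial placeholder in place of the trained classifier; it is not taken from any of the parsers above, and the heuristic and the example sentence are purely illustrative.

<pre>
# Minimal arc-standard transition system (illustrative sketch only; not code
# from any of the parsers described above). A real transition-based parser
# replaces predict_transition() with a trained classifier over features of
# the current configuration (stack, buffer, partial arcs).

def legal_transitions(stack, buffer):
    moves = []
    if buffer:
        moves.append("SHIFT")
    if len(stack) >= 2:
        if stack[-2] != 0:            # the artificial ROOT cannot be a dependent
            moves.append("LEFT-ARC")  # head = stack[-1], dependent = stack[-2]
        moves.append("RIGHT-ARC")     # head = stack[-2], dependent = stack[-1]
    return moves

def predict_transition(stack, buffer, moves):
    # Placeholder "classifier": shift while possible, otherwise attach rightward.
    return "SHIFT" if "SHIFT" in moves else "RIGHT-ARC"

def parse(tokens):
    """tokens: list of word forms; returns arcs as (head, dependent) pairs,
    with 0 standing for ROOT and 1-based indices for the tokens."""
    stack, buffer, arcs = [0], list(range(1, len(tokens) + 1)), []
    while buffer or len(stack) > 1:
        move = predict_transition(stack, buffer, legal_transitions(stack, buffer))
        if move == "SHIFT":
            stack.append(buffer.pop(0))
        elif move == "LEFT-ARC":
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        else:  # RIGHT-ARC
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
    return arcs

if __name__ == "__main__":
    print(parse(["Il", "governo", "approva", "la", "legge"]))
</pre>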
3 Data Set

The experiments reported in the paper are performed on the Italian Stanford Dependency Treebank (ISDT) (Bosco et al., 2013) version 2.0, released in the context of the EVALITA 2014 evaluation campaign on Dependency Parsing for Information Extraction (Bosco et al., 2014) (http://www.evalita.it/2014/tasks/dep_par4IE). There are three main novelties with respect to the previously available Italian treebanks: (i) the size of the dataset, much bigger than the resources used in the previous EVALITA campaigns; (ii) the annotation scheme, compliant with de facto standards at the level of both representation format (CoNLL) and adopted tagset (Stanford Dependency scheme); (iii) its being defined with a specific view to supporting information extraction tasks, a feature inherited from the Stanford Dependency scheme.

The training set contains 7,414 sentences (158,561 tokens), the development set 564 sentences (12,014 tokens), and the test set 376 sentences (9,066 tokens).
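Since ISDT is distributed in the CoNLL tabular format mentioned above, loading a split amounts to reading one token per line with tab-separated fields. The sketch below is a minimal reader, assuming a CoNLL-X-style layout with the head index and the dependency relation in columns 7 and 8; the column positions and the file name are assumptions to be checked against the actual release.

<pre>
# Minimal CoNLL-style reader (illustrative sketch). It assumes tab-separated
# token lines with the head index in column 7 and the dependency relation in
# column 8 (1-based), and blank lines between sentences; the exact column
# layout should be checked against the ISDT release. File name is hypothetical.

def read_conll(path):
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                      # sentence boundary
                if current:
                    sentences.append(current)
                    current = []
                continue
            cols = line.split("\t")
            current.append({
                "id": int(cols[0]),
                "form": cols[1],
                "head": int(cols[6]),
                "deprel": cols[7],
            })
    if current:
        sentences.append(current)
    return sentences

if __name__ == "__main__":
    train = read_conll("isdt_train.conll")    # hypothetical file name
    print(len(train), "sentences,", sum(len(s) for s in train), "tokens")
</pre>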
4 Experiments

The level of interaction with the authors of the parsers varied. For MaltParser, the MATE parsers, TurboParser, and ZPar we have mainly exploited the experience gained in the context of EVALITA 2014 (Lavelli, 2014a).

Concerning MaltParser, in addition to using the best performing configuration at EVALITA 2014 (Nivre's arc-eager, PP-head), we have used MaltOptimizer (Ballesteros and Nivre, 2014) (http://nil.fdi.ucm.es/maltoptimizer/) to identify the best configuration. This was done to be fair to the other parsers, given that MaltParser's best configuration was the result of extensive feature selection at the CoNLL 2006 shared task. According to MaltOptimizer, the best configuration is Nivre's arc-standard.

As for the MATE parsers, we have applied both the graph-based and the transition-based parser.

TurboParser was applied using the three standard configurations (basic, standard, full).

Concerning ZPar, the main difficulty that emerged in 2014 (i.e., the fact that sentences with more than 100 tokens needed 70 GB of RAM) is no longer present, and so its use is rather straightforward.

As for the new parsers, the only problems during installation concerned an issue with the version of the C++ compiler needed for successfully compiling the LSTM parser.

For some of the parsers there is the possibility of exploiting word embeddings (RBG, Stanford parser, LSTM, EmoryNLP) or Brown clustering (YARA parser). As for word embeddings (WEs), we exploited the following two sets, both built using word2vec (a training sketch follows the list):

• word embeddings of size 300 learned on WackyPedia/itWaC, a corpus of more than 1 billion tokens (http://clic.cimec.unitn.it/~georgiana.dinu/down/);

• word embeddings of size 50 produced in the project PAISÀ (Piattaforma per l'Apprendimento dell'Italiano Su corpora Annotati, http://www.corpusitaliano.it/en/index.html) on a corpus of 250 million tokens.
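As a rough indication of how embeddings of this kind can be built, the sketch below trains 300-dimensional vectors with gensim's implementation of word2vec. The corpus file, the tokenisation, and all hyperparameters except the vector size are assumptions; the sketch does not reproduce the settings actually used for the WackyPedia/itWaC or PAISÀ vectors.

<pre>
# Illustrative sketch: training word2vec embeddings of size 300 with gensim.
# The corpus file and every hyperparameter except the vector size are
# assumptions; this does not reproduce the WackyPedia/itWaC or PAISÀ vectors.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# One pre-tokenised sentence per line, tokens separated by whitespace.
corpus = LineSentence("itwac_tokenized.txt")   # hypothetical file name

model = Word2Vec(
    sentences=corpus,
    vector_size=300,   # the embedding size used in most of the experiments
    window=5,
    min_count=5,
    sg=0,              # CBOW (assumed; skip-gram would use sg=1)
    workers=4,
)
model.wv.save_word2vec_format("itwac_vectors_300.txt", binary=False)
</pre>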
In general, WEs of size 300 produced an increase in performance, while those of size 50 produced a decrease in performance (with the exception of the Stanford NN dependency parser, which produced results comparable to the other parsers with WEs of size 50 and absurdly low results with those of size 300). We were not able to successfully run the EmoryNLP parser with WEs. The use of WEs needs further investigation.

As for the use of Brown clusters (BCs), we are still working to build suitable resources for Italian, so the YARA Parser was used with standard settings and without Brown clusters.

The experiments were performed using the splits provided by the EVALITA 2014 organisers: training on the training set, tuning (if any) on the development set, and final testing on the test set.

In Table 1 we report the parser results ranked by decreasing Labeled Attachment Score (LAS), not considering punctuation. Parsers are grouped together if the differences between their results (in terms of LAS) are not statistically significant (computation performed using DEPENDABLE (Choi et al., 2015)).

Parser                                              LAS    UAS    LA
RBG (full, w/ WEs, size=300)                        87.72  90.00  93.03
RBG (standard, w/ WEs, size=300)                    87.63  89.91  93.03
RBG (full, w/o WEs)                                 87.33  89.94  92.41
RBG (standard, w/o WEs)                             87.33  89.86  92.43
MATE transition-based                               87.07  89.69  92.30
MATE graph-based                                    86.91  89.53  92.67
TurboParser (model type=full)                       86.53  89.20  92.22
TurboParser (model type=standard)                   86.45  88.96  92.29
ZPar                                                86.32  88.65  92.40
LSTM (EMNLP 2015, char-based, w/ WEs, size=300)     86.07  88.96  92.15
RBG (basic, w/o WEs)                                85.99  88.53  91.71
MaltParser (Nivre arc-eager, PP-head)               85.82  88.29  91.62
EmoryNLP (w/o WEs)                                  85.30  87.68  91.51
TurboParser (model type=basic)                      84.90  87.28  91.26
DeSR (MLP)                                          84.61  87.18  90.79
MaltParser (Nivre arc-standard, MaltOptimizer)      84.44  87.17  90.94
LSTM (ACL 2015, w/ WEs, size=300)                   84.20  87.13  90.80
LSTM (EMNLP 2015, char-based, w/o WEs)              84.13  87.32  90.75
YARA parser (w/o BCs)                               83.87  86.79  90.34
LSTM (ACL 2015, w/o WEs)                            83.86  86.95  90.56
Stanford NN dependency parser (w/ WEs, size=50)     83.68  86.50  90.85

Table 1: Results on the EVALITA 2014 test set without considering punctuation, in terms of Labeled Attachment Score (LAS), Unlabeled Attachment Score (UAS) and Label Accuracy (LA).

The results obtained by the best system submitted to the official evaluation at EVALITA 2014 (Attardi and Simi, 2014) are 87.89 (LAS) and 90.16 (UAS). More details about the task and the results obtained by the participants are available in Bosco et al. (2014).
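For reference, the three scores reported in Table 1 can be computed directly from gold and predicted files in the CoNLL format: UAS is the percentage of tokens with the correct head, LAS additionally requires the correct dependency label, and LA considers the label alone. The sketch below is a minimal, unofficial implementation under the same column-layout assumptions as the reader sketched in Section 3; punctuation filtering, as applied in Table 1, is omitted, and the file names are hypothetical.

<pre>
# Minimal, unofficial computation of LAS, UAS and LA from two CoNLL-style
# files with identical tokenisation (head in column 7, label in column 8,
# 1-based). Punctuation exclusion, as applied in Table 1, is omitted.
# File names in the example are hypothetical.

def read_heads_labels(path):
    with open(path, encoding="utf-8") as f:
        for line in f:
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 8 and cols[0].isdigit():
                yield int(cols[6]), cols[7]

def attachment_scores(gold_path, pred_path):
    correct_head = correct_label = correct_both = total = 0
    for (gh, gl), (ph, pl) in zip(read_heads_labels(gold_path),
                                  read_heads_labels(pred_path)):
        total += 1
        correct_head += (gh == ph)
        correct_label += (gl == pl)
        correct_both += (gh == ph and gl == pl)
    return {"LAS": 100.0 * correct_both / total,
            "UAS": 100.0 * correct_head / total,
            "LA": 100.0 * correct_label / total}

if __name__ == "__main__":
    print(attachment_scores("isdt_test.conll", "parser_output.conll"))
</pre>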
5 Conclusions

In the paper we have reported on work in progress on the comparison of several state-of-the-art dependency parsers on the Italian Stanford Dependency Treebank (ISDT).

We are already working to widen the scope of the comparison by including more parsers and to perform an analysis of the results obtained by the different parsers considering not only their performance but also their behaviour in terms of speed, CPU load at training and parsing time, ease of use, licence agreement, ...

The next step would be to apply the parsers in a multilingual setting, exploiting the availability of treebanks based on Universal Dependencies in many languages (Nivre et al., 2016) (http://universaldependencies.org/).

Acknowledgments

We thank the authors of the parsers for making them freely available, for kindly answering our questions and for providing useful suggestions. We thank the reviewers for valuable suggestions to improve this article.

References

Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, and Michael Collins. 2016. Globally normalized transition-based neural networks. CoRR, abs/1603.06042.
Giuseppe Attardi and Felice Dell'Orletta. 2009. Reverse revision and linear tree combination for dependency parsing. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers, pages 261–264, Boulder, Colorado, June. Association for Computational Linguistics.
Giuseppe Attardi and Maria Simi. 2014. Dependency parsing techniques for information extraction. In Proceedings of EVALITA 2014.
Giuseppe Attardi. 2006. Experiments with a multilanguage non-projective dependency parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pages 166–170, New York City, June. Association for Computational Linguistics.
Miguel Ballesteros and Joakim Nivre. 2014. MaltOptimizer: Fast and effective parser optimization. Natural Language Engineering, FirstView:1–27, 10.
Miguel Ballesteros, Chris Dyer, and Noah A. Smith. 2015. Improved transition-based parsing by modeling characters instead of words with LSTMs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 349–359, Lisbon, Portugal, September. Association for Computational Linguistics.
Bernd Bohnet and Jonas Kuhn. 2012. The best of both worlds – a graph-based completion model for transition-based parsers. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 77–87, Avignon, France, April. Association for Computational Linguistics.
Bernd Bohnet and Joakim Nivre. 2012. A transition-based system for joint part-of-speech tagging and labeled non-projective dependency parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1455–1465, Jeju Island, Korea, July. Association for Computational Linguistics.
Bernd Bohnet. 2010. Top accuracy and fast dependency parsing is not a contradiction. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 89–97, Beijing, China, August. Coling 2010 Organizing Committee.
Cristina Bosco and Alessandro Mazzei. 2011. The EVALITA 2011 parsing task: the dependency track. In Working Notes of EVALITA 2011, pages 24–25.
Cristina Bosco, Alessandro Mazzei, Vincenzo Lombardo, Giuseppe Attardi, Anna Corazza, Alberto Lavelli, Leonardo Lesmo, Giorgio Satta, and Maria Simi. 2008. Comparing Italian parsers on a common treebank: the EVALITA experience. In Proceedings of LREC 2008.
Cristina Bosco, Simonetta Montemagni, Alessandro Mazzei, Vincenzo Lombardo, Felice Dell'Orletta, and Alessandro Lenci. 2009. Evalita09 parsing task: comparing dependency parsers and treebanks. In Proceedings of EVALITA 2009.
Cristina Bosco, Simonetta Montemagni, and Maria Simi. 2013. Converting Italian treebanks: Towards an Italian Stanford Dependency Treebank. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 61–69, Sofia, Bulgaria, August. Association for Computational Linguistics.
Cristina Bosco, Felice Dell'Orletta, Simonetta Montemagni, Manuela Sanguinetti, and Maria Simi. 2014. The EVALITA 2014 dependency parsing task. In Proceedings of EVALITA 2014.
Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pages 149–164, New York City, June. Association for Computational Linguistics.
Danqi Chen and Christopher Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740–750, Doha, Qatar, October. Association for Computational Linguistics.
Jinho D. Choi and Andrew McCallum. 2013. Transition-based dependency parsing with selectional branching. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1052–1062, Sofia, Bulgaria, August. Association for Computational Linguistics.
Jinho D. Choi, Joel Tetreault, and Amanda Stent. 2015. It depends: Dependency parser comparison using a web-based evaluation tool. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 387–396, Beijing, China, July. Association for Computational Linguistics.
In of state-of-the-art dependency parsers on the Italian Proceedings of the 49th Annual Meeting of the Asso- Stanford Dependency Treebank. In Proceedings of ciation for Computational Linguistics: Human Lan- the first Italian Computational Linguistics Confer- guage Technologies, pages 188–193, Portland, Ore- ence (CLiC-it 2014). gon, USA, June. Association for Computational Lin- guistics. Tao Lei, Yu Xin, Yuan Zhang, Regina Barzilay, and Tommi Jaakkola. 2014. Low-rank tensors for scor- Yuan Zhang, Tao Lei, Regina Barzilay, and Tommi ing dependency structures. In Proceedings of the Jaakkola. 2014a. Greed is good if randomized: New 52nd Annual Meeting of the Association for Compu- inference for dependency parsing. In Proceedings of tational Linguistics (Volume 1: Long Papers), pages the 2014 Conference on Empirical Methods in Nat- 1381–1391, Baltimore, Maryland, June. Association ural Language Processing (EMNLP), pages 1013– for Computational Linguistics. 1024, Doha, Qatar, October. Association for Com- Andre Martins, Miguel Almeida, and Noah A. Smith. putational Linguistics. 2013. Turning on the turbo: Fast third-order non- Yuan Zhang, Tao Lei, Regina Barzilay, Tommi projective turbo parsers. In Proceedings of the 51st Jaakkola, and Amir Globerson. 2014b. Steps to Annual Meeting of the Association for Computa- excellence: Simple inference with refined scoring tional Linguistics (Volume 2: Short Papers), pages of dependency trees. In Proceedings of the 52nd 617–622, Sofia, Bulgaria, August. Association for Annual Meeting of the Association for Computa- Computational Linguistics. tional Linguistics (Volume 1: Long Papers), pages Joakim Nivre, Johan Hall, and Jens Nilsson. 2006. 197–207, Baltimore, Maryland, June. Association MaltParser: A data-driven parser-generator for de- for Computational Linguistics. pendency parsing. In Proceedings of the 5th In- ternational Conference on Language Resources and Evaluation (LREC), pages 2216–2219.