<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Bosco et al.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Tree Kernels-based Discriminative Reranker for Italian Constituency Parsers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio Uva</string-name>
          <email>antonio.uva@unitn.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Moschitti</string-name>
          <email>amoschitti@gmail.com</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>83</volume>
      <abstract>
        <p>English. This paper aims at filling the gap between the accuracy of Italian and English constituency parsing. First, we adapt the Bllip parser (also known as the Charniak parser), the most accurate constituency parser for English, to Italian and train it on the Turin University Treebank (TUT). Second, we design a parse reranker based on Support Vector Machines using tree kernels, which can effectively generalize syntactic patterns and therefore require little data to train the model. We show that our approach outperforms the state of the art achieved by the Berkeley parser, improving labeled F1 from 84.54 to 86.81.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italian version (translated). This paper aims at closing the accuracy gap between Italian and English constituency parsing. As a first improvement, we adapted Bllip, the constituency parser for English also known as the Charniak parser, to Italian and trained it on the Turin University Treebank. We then designed a reranker based on Support Vector Machines using tree kernels, which can effectively generalize syntactic patterns and require little training data to train the model. Our approach surpasses the state of the art obtained with the Berkeley parser, improving labeled F1 from 84.54 to 86.81.</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>
        Constituency parsing is one of the most
important research lines in Computational
Linguistics. Consequently, a large body of work has
also been devoted to its design for the Italian language
        <xref ref-type="bibr" rid="ref2 ref3 ref1">(Bosco et al., 2007; Bosco et al., 2009; Bosco and
Mazzei, 2011)</xref>
        . However, the accuracy reported
for the best parser is still far behind the state of the
art of other languages, e.g., English.
      </p>
      <p>
        One notable attempt to fill this
technological gap was the EvalIta
challenge, which proposed parsing tracks on both
dependency and constituency parsing for Italian.
Among the participating systems, the
Berkeley parser
        <xref ref-type="bibr" rid="ref16">(Petrov and Klein, 2007)</xref>
        gave the best
result
        <xref ref-type="bibr" rid="ref11 ref12">(Lavelli and Corazza, 2009; Lavelli, 2011)</xref>
        .
      </p>
      <p>
        Initially, the results for constituency
parsing on TUT
        <xref ref-type="bibr" rid="ref3">(Bosco et al., 2009)</xref>
        were much lower than those obtained for English
on the Penn Treebank
        <xref ref-type="bibr" rid="ref13">(Marcus et al., 1993)</xref>
        . In
the last EvalIta edition, this gap narrowed as the
labeled F1 of the Italian parser increased from 78.73%
(EvalIta 2009) to 82.96% (EvalIta 2011). Some
years later, the parser F1 improved to 83.27%
        <xref ref-type="bibr" rid="ref4">(Bosco et al., 2013)</xref>
        . However, the performance
of the best English parser
        <xref ref-type="bibr" rid="ref14">(McClosky et al., 2006)</xref>
        ,
i.e., 92.1%, is still far away. The main
reason for this gap is the difference in the amount
of training data available for Italian compared to
English. In fact, while the Penn Treebank contains
49,191 sentences/trees, TUT contains only 3,542
sentences/trees.
      </p>
      <p>
        When training data is scarce, a
general solution for increasing the accuracy of a
machine learning-based system is the use of more
general features. This way, the probability of
matching training and testing instance
representations is larger, allowing the learning process to find
more accurate optima. In the case of syntactic
parsing, we need to generalize either lexical or
syntactic features, or possibly both. However, modeling
such generalization in state-of-the-art parser
algorithms such as Bllip (https://github.com/BLLIP/bllip-parser)
        <xref ref-type="bibr" rid="ref6 ref5">(Charniak, 2000;
Charniak and Johnson, 2005)</xref>
        is rather challenging. In
particular, the space of all possible syntactic
patterns is very large and cannot be explicitly coded
in the model. An easier solution consists in
using such features in a simpler model, which can be
trained to improve the outcome of the main parser,
e.g., by selecting the best among its top-ranked hypotheses. In
particular, the tree kernels (TKs) of Moschitti (2006) can be
used for encoding an exponential number of
syntactic patterns in parse rerankers.
      </p>
      <p>In this work, we aim at filling the gap between English and Italian constituency parsing. First, we adapted the Bllip parser (also known as the Charniak parser), the most accurate constituency parser for English, to Italian and trained it on TUT. We wrote several configuration files that define the TUT-specific labels and their types. We did not, however, encode head-finding rules for Italian, which are needed to complete the parser adaptation.</p>
      <p>Second, we apply rerankers based on Support Vector Machines (SVMs) using TKs to the k-best parses produced by Bllip, with the aim of selecting the best hypothesis. TKs allow us to represent data using the entire space of subtrees, which correspond to syntactic patterns of different levels of generality. This representation enables training the reranker with little data. Finally, we tested our models on TUT, following the EvalIta setting, and compared them with other parsers. For example, we observed an improvement of about 2.3 absolute points in labeled F1 over the Berkeley parser, i.e., 86.81 vs. 84.54.</p>
    </sec>
    <sec id="sec-3">
      <title>2 Bllip parser</title>
      <p>The Bllip parser is a lexicalized probabilistic constituency parser. It can be considered a smoothed PCFG whose non-terminals encode a wide variety of manually chosen conditioning information, such as heads, governors, etc. Such information is used to derive probability distributions, which, in turn, are used to compute the likelihood of the constituency trees being generated. As described by McClosky et al. (2006), Bllip uses five distributions, i.e., the probabilities of the (i) constituent heads, (ii) constituent parts of speech (PoS), (iii) head-constituents, (iv) left-of-head constituents and (v) right-of-head constituents. Each probability distribution is conditioned on five or more features and backed off to the probabilities of lower-order models in case of rare feature configurations. The variety of information that Bllip needs to work properly makes its configuration much harder than that of other parsers, e.g., the Berkeley parser. However, Bllip is faster to train than other off-the-shelf parsers.</p>
      <sec id="sec-3-1">
        <title>2.1 Adapting Bllip to Italian Language</title>
        <p>Adapting Bllip required creating several configuration files. For example, the PoS and bracket labels observed in the training and development sets must be defined in a file named terms.txt. As the labels present in TUT differ from those of the Penn Treebank<sup>2</sup>, we added them to this file. Then, we specified the type of each label present in the data, i.e., constituent type, open-class PoS, punctuation, etc.</p>
        <p>Finally, it should be noted that, since Bllip is lexicalized, head-finding rules for Italian should be specified in the file headInfo.txt. For example, the rule ADJP → r JJ specifies that the head of an adjective phrase (ADJP) is the right-most adjective (JJ). Due to time restrictions, we used the default Bllip rules and leave this task as short-term future work.</p>
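        <p>As an illustration of how a head rule of this kind can be applied, the following minimal sketch picks the head child of a constituent by scanning its children in a given direction. The rule table format, the function name find_head and the fallback behaviour are assumptions made for this example; they do not reproduce the headInfo.txt syntax or the internals of Bllip.</p>
        <preformat preformat-type="code">
```python
# Hypothetical rule table: phrase label -> (scan direction, preferred child labels).
# This is NOT the headInfo.txt syntax used by Bllip; it only illustrates the idea.
HEAD_RULES = {
    "ADJP": ("right", ["JJ"]),
}


def find_head(label, children):
    """Return the index of the head child among `children` (a non-empty list of labels)."""
    direction, preferred = HEAD_RULES.get(label, ("left", []))
    indices = list(range(len(children)))
    if direction == "right":
        indices.reverse()  # scan from the right-most child
    for wanted in preferred:
        for i in indices:
            if children[i] == wanted:
                return i
    # Fallback (an assumption): first child in scan order when no preferred label is found.
    return indices[0]


# An adjective phrase with an adverb followed by two adjectives.
print(find_head("ADJP", ["RB", "JJ", "JJ"]))
```
        </preformat>
        <p>With this table, the ADJP entry selects the right-most JJ child, while labels without an entry fall back to their left-most child. Rules for Italian would replace the table entries with TUT labels.</p>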
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3 Tree Kernel-based Reranker</title>
      <p>We describe three types of TKs and the Preference
Reranker approach using them.</p>
      <sec id="sec-4-1">
        <title>3.1 Tree kernels</title>
        <p>TKs can be used for representing arbitrary tree structures in kernel machines, e.g., SVMs. They are a viable alternative to explicit feature design, as they implement the scalar product between feature vectors as a similarity between two trees. This scalar product is computed with efficient algorithms and essentially equals the number of subparts shared by the two trees.</p>
        <p>Syntactic Tree Kernels (STK) count the number of common tree fragments, where (i) each fragment contains more than two nodes and (ii) each of its nodes is connected to either all or none of its children. We also used a variant, called STKb, which adds the number of common leaves of the compared trees to the final subpart count.</p>
        <p>Partial Tree Kernels (PTK) count a larger class of tree fragments, i.e., any subset of nodes that are connected in the original trees: clearly, PTK is a generalization of STK.</p>
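        <p>To make the fragment counting concrete, the sketch below implements the recursive counting scheme of Collins and Duffy (2002) for fragments built from complete productions, with the decay factor fixed to 1. It is a naive quadratic version for illustration, not the efficient algorithm used in practice; the nested-tuple tree encoding, the function names and the toy trees are assumptions of this example.</p>
        <preformat preformat-type="code">
```python
def production(node):
    """Label of a node followed by the labels (or words) of its children."""
    return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])


def internal_nodes(tree):
    """All non-leaf nodes of a tree encoded as nested tuples (label, child, ...)."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes


def delta(n1, n2):
    """Number of common fragments rooted at both n1 and n2 (decay factor = 1)."""
    if production(n1) != production(n2):
        return 0
    result = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        # Word leaves cannot be expanded further, so they contribute a factor of 1.
        if not isinstance(c1, str) and not isinstance(c2, str):
            result *= 1 + delta(c1, c2)
    return result


def stk(t1, t2):
    """Sum the common-fragment counts over all pairs of internal nodes."""
    return sum(delta(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2))


# Toy trees with hypothetical labels: "il cane" vs. "il gatto".
t1 = ("NP", ("D", "il"), ("N", "cane"))
t2 = ("NP", ("D", "il"), ("N", "gatto"))
print(stk(t1, t2), stk(t1, t1))
```
        </preformat>
        <p>For the two toy trees, the shared fragments are the NP production, the same production with the determiner expanded to its word, and the determiner production itself. A PTK implementation would additionally match fragments that keep only a subset of a node's children.</p>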
      </sec>
      <sec id="sec-4-2">
        <title>3.2 Preference Reranker</title>
        <p>Preference reranking is cast as a binary classification problem, where each instance is a pair ⟨h<sub>i</sub>, h<sub>j</sub>⟩ of tree hypotheses and the classifier decides whether h<sub>i</sub> is better than h<sub>j</sub>. The positive training examples are the pairs ⟨h<sub>1</sub>, h<sub>i</sub>⟩, where h<sub>1</sub> has the highest F1 with respect to the gold standard among the candidate hypotheses. The negative examples are obtained by inverting the hypotheses in the pairs, i.e., ⟨h<sub>i</sub>, h<sub>1</sub>⟩. If the hypotheses have the same score, the pair is not included in the training set. At classification time, all pairs ⟨h<sub>i</sub>, h<sub>j</sub>⟩ generated from the k-best hypotheses are classified. A positive classification is a vote for h<sub>i</sub>, whereas a negative classification is a vote for h<sub>j</sub>. The hypothesis associated with the highest number of votes (or the highest sum of classifier scores) is selected as the best parse.</p>
        <p><sup>2</sup> For example, the PoS tag NN in the Penn Treebank corresponds to the tag NOU~CS in TUT.</p>
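        <p>The pair construction and the voting step described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: hypotheses are opaque objects, f1_scores holds their F1 against the gold tree, and classify is a hypothetical stand-in for the trained SVM that returns a positive score when its first argument is preferred. The sketch uses vote counting rather than the sum of classifier scores.</p>
        <preformat preformat-type="code">
```python
from itertools import permutations


def training_pairs(hypotheses, f1_scores):
    """Build (first, second, label) training examples from one k-best list."""
    best = max(range(len(hypotheses)), key=lambda i: f1_scores[i])
    examples = []
    for i, hyp in enumerate(hypotheses):
        if i == best or f1_scores[i] == f1_scores[best]:
            continue  # pairs whose members have the same score are skipped
        examples.append((hypotheses[best], hyp, +1))  # positive pair
        examples.append((hyp, hypotheses[best], -1))  # inverted, negative pair
    return examples


def rerank(hypotheses, classify):
    """Classify all ordered pairs and return the hypothesis with most votes."""
    votes = [0] * len(hypotheses)
    for i, j in permutations(range(len(hypotheses)), 2):
        if classify(hypotheses[i], hypotheses[j]) > 0:
            votes[i] += 1  # positive classification: vote for the first member
        else:
            votes[j] += 1  # negative classification: vote for the second member
    return hypotheses[max(range(len(hypotheses)), key=lambda i: votes[i])]


# Toy usage: strings stand in for parse trees, and the "classifier" prefers
# the longer string (purely illustrative).
hyps = ["a", "bb", "ccc"]
print(training_pairs(hyps, [0.5, 0.9, 0.9]))
print(rerank(hyps, lambda x, y: len(x) - len(y)))
```
        </preformat>
        <p>Summing the classifier scores instead of counting votes only requires accumulating the returned score in place of the constant 1.</p>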
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Experiments</title>
      <p>In these experiments, we first report on the performance of Bllip for Italian and compare it with the Berkeley parser. Then, we show that our parse reranker can be very effective, even with little training data.</p>
      <sec id="sec-5-1">
        <title>4.1 Experimental Setup</title>
        <p>Parsing data. The data for training and testing the constituency parsers come from the TUT project (http://www.di.unito.it/~tutreeb/), developed at the University of Turin. There have been several releases of the dataset: we used the latest version from EvalIta 2011. The training set is composed of 3,542 sentences, while the test set contains 300 sentences. The set of PoS-tags includes 97 tags: 68 encoding morphological features (of which 19 are basic tags) for pre-terminal symbols (e.g., ADJ, ADVB, NOUN, etc.) and 29 non-terminal symbols for phrase constituents (e.g., ADJP, ADVP, NP, etc.).</p>
        <p>Reranking Data. To generate the data for training the reranker, we apply 10-fold cross validation to the official TUT training set: we train the base parser on 9 folds and apply it to the remaining fold to generate the n-best trees for each of its sentences. Then, we merge all 10 labeled folds to produce the training set of the reranker. This way, we avoid the bias a parser would have if applied to the data used for training it. To generate the test data of the reranker, we simply apply the base parser (trained on all TUT training data) to the TUT test set and generate the n-best hypotheses for each sentence.</p>
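        <p>The cross-validation scheme for producing reranker training data can be sketched as below. The callback train_parser, the parser callable it returns, and the way sentences are assigned to folds (striding) are hypothetical choices made for this example; the source only specifies that 10 folds are used.</p>
        <preformat preformat-type="code">
```python
def kfold_nbest(sentences, train_parser, k=10, n=20):
    """Parse every training sentence with a model that never saw it.

    `train_parser(train_sentences)` is a hypothetical callback returning a
    function `parser(sentence, n)` that yields the n-best parses.
    """
    folds = [sentences[i::k] for i in range(k)]  # fold assignment is an assumption
    reranker_data = []
    for held_out in range(k):
        train = [s for f, fold in enumerate(folds) if f != held_out for s in fold]
        parser = train_parser(train)  # trained on the other k-1 folds
        for sent in folds[held_out]:
            reranker_data.append((sent, parser(sent, n)))
    return reranker_data


def toy_train_parser(train_sentences):
    """Stand-in trainer: the 'parses' are just numbered copies of the sentence."""
    seen = set(train_sentences)

    def parser(sentence, n):
        assert sentence not in seen  # the held-out sentence was not used in training
        return ["%s#%d" % (sentence, rank) for rank in range(n)]

    return parser


data = kfold_nbest(["s%d" % i for i in range(25)], toy_train_parser, k=10, n=3)
print(len(data), data[0])
```
        </preformat>
        <p>The merged list plays the role of the reranker training set: each sentence is paired with hypotheses produced by a parser that did not see it during training.</p>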
        <p>SVM Reranker. We train the reranker using
SVM-light-TK, which takes both feature vectors
and trees as input to learn a classification model.
The features used for reranking constituency trees
are: (i) the probability and the (inverse) rank of
the hypotheses provided by Bllip and (ii) the
entire syntactic trees used with two types of kernels,
STK and PTK, described in Sec. 3.</p>
        <p>Measures. For evaluating the parsers, we used the EVALB scoring program, which reports Labeled Precision (LP), Labeled Recall (LR), Labeled F1 (LF) and Exact Match Rate (EMR). Following the official EvalIta procedure for evaluating the participant system output, we did not score the TOP label, ignored all functional labels attached to non-terminals and included punctuation in the scoring procedure.</p>
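        <p>For readers unfamiliar with bracket scoring, the sketch below computes labeled precision, recall and F1 for a single sentence from labeled spans. It is a simplified illustration and not EVALB itself: the nested-tuple tree encoding, the use of a hyphen to separate functional labels (e.g., NP-SBJ) and the toy trees are assumptions of this example.</p>
        <preformat preformat-type="code">
```python
from collections import Counter


def brackets(tree, start=0):
    """Return (labeled spans, end position) for a nested-tuple tree.

    Pre-terminals (label, word) are not counted as brackets, the TOP label is
    skipped, and functional suffixes such as NP-SBJ are reduced to NP.
    """
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return Counter(), start + 1  # a pre-terminal covers exactly one token
    spans, pos = Counter(), start
    for child in children:
        child_spans, pos = brackets(child, pos)
        spans.update(child_spans)
    if label != "TOP":
        spans[(label.split("-")[0], start, pos)] += 1
    return spans, pos


def labeled_prf(gold_tree, pred_tree):
    """Labeled precision, recall and F1 for one sentence."""
    gold, _ = brackets(gold_tree)
    pred, _ = brackets(pred_tree)
    if not gold or not pred:
        return 0.0, 0.0, 0.0
    matched = sum(min(count, pred[b]) for b, count in gold.items())
    precision = matched / sum(pred.values())
    recall = matched / sum(gold.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Toy trees with hypothetical labels for "il cane dorme".
gold = ("TOP", ("S", ("NP-SBJ", ("D", "il"), ("N", "cane")), ("VP", ("V", "dorme"))))
pred = ("TOP", ("S", ("NP", ("D", "il")), ("VP", ("N", "cane"), ("V", "dorme"))))
print(labeled_prf(gold, pred))
```
        </preformat>
        <p>In the toy example only the S bracket matches, so one of three brackets is correct on each side. Corpus-level scores are obtained by summing the matched, gold and predicted bracket counts over all sentences before dividing.</p>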
      </sec>
      <sec id="sec-5-2">
        <title>4.2 Bllip base parser results</title>
        <p>We divided the training set into train and validation sets, where the latter is composed of the last 50 sentences of each of the six sections of the former, for a total of 300 sentences. We trained the models on the train set and tuned parameters on the validation set. Then, we applied the learned model to the 300 sentences of the test set. Table 1 shows the results obtained by the Bllip base parser on the TUT test set. Our parser obtained an LF of 86.28% for sentences with fewer than 40 words and 85.61% for all sentences.</p>
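        <p>The train/validation split described above amounts to holding out the tail of every section. A minimal sketch follows; the function name and the toy section sizes are assumptions made for illustration, since the actual section sizes are not given here.</p>
        <preformat preformat-type="code">
```python
def split_train_validation(sections, held_out=50):
    """Move the last `held_out` sentences of every section to the validation set.

    `held_out` must be positive and smaller than every section.
    """
    train, validation = [], []
    for sentences in sections:
        train.extend(sentences[:-held_out])
        validation.extend(sentences[-held_out:])
    return train, validation


# Six toy sections of 100 sentence identifiers each (sizes are illustrative).
sections = [["sec%d-sent%d" % (s, i) for i in range(100)] for s in range(6)]
train, validation = split_train_validation(sections)
print(len(train), len(validation))
```
        </preformat>
        <p>With six sections and 50 held-out sentences per section, the validation set contains 300 sentences, matching the size reported above.</p>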
      </sec>
      <sec id="sec-5-3">
        <title>4.3 Comparison with the Berkeley parser</title>
        <p>Table 1 also reports the results of the Berkeley parser obtained by Bosco et al. (2013). For comparison purposes, we trained our own version of the Berkeley parser. In particular, we trained the parser for 5 split-merge cycles on the whole training set, having selected this number of cycles by 10-fold cross validation on the training set. Similarly to Bosco et al. (2013), we specialized punctuation symbols to more specific tags. However, we used full PoS-tags, as they gave the best results in cross-fold validation. Indeed, the table shows that our own version of the Berkeley parser outperforms the one of Bosco et al. (2013) by 1.27 absolute percentage points (84.54 vs. 83.27). The table also reports the results of the Bllip parser, which outperforms the best result obtained by the Berkeley parser by 1.07 absolute points in LF, i.e., 85.61 vs. 84.54.</p>
        <table-wrap>
          <caption>
            <p>LF on all sentences (All) for the Bllip base model and the STK, STKb and PTK rerankers. The three columns appear in source order; their settings are not recoverable from the source.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Model</th><th>All</th><th>All</th><th>All</th></tr>
            </thead>
            <tbody>
              <tr><td>Bllip base model</td><td>85.61</td><td>85.61</td><td>85.61</td></tr>
              <tr><td>STK</td><td>84.49</td><td>84.16</td><td>84.45</td></tr>
              <tr><td>STKb</td><td>84.52</td><td>84.35</td><td>84.38</td></tr>
              <tr><td>PTK</td><td>85.46</td><td>86.41</td><td>85.92</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Table 2 reports the LF obtained by different reranking models, varying: (i) the type of TKs, (ii) the group of features (i.e., either trees or trees + feat.) and (iii) the number, n, of parse trees used to generate the reranker training data. More specifically, we experimented with three values for n, i.e., 10-, 20- and 30-best parse trees. As can be seen from the table, PTK consistently outperforms STK and STKb for any number of parse hypotheses. This indicates that the subtree features generated by PTK, which include nodes with any subset of the children in the original tree, are useful for improving the parser accuracy.</p>
        <p>Interestingly, training on 30-best trees gives either worse results (e.g., STKb and STK) or very little improvement (e.g., PTK) compared to training on 20-best parse trees. This may suggest that adding too many negative examples, which largely populate the lower part of the n-best list, may be detrimental.</p>
        <p>The bottom part of Table 1 shows standard parser evaluation metrics for reranking models using different kernel types: for each kernel, only the model with the highest LF from Table 2 is reported. The method shows a 1.2-point absolute improvement in LF (from 85.61% to 86.81%) on all sentences over the base parser (i.e., the baseline) when using the most powerful kernel, PTK, and 30-best hypotheses. STK and STKb show lower improvements over the baseline of 0.44 and 0.6 points, respectively. One interesting fact is that, while PTK gives better results in terms of LF, STK and STKb perform better in terms of EMR, i.e., the percentage of sentence parses that completely match the gold trees. This is rather intuitive: as the name suggests, the Partial Tree Kernel generates partial subtrees, i.e., partial production rules as patterns. On the one hand, this can improve the ability to match syntactic patterns, thus capturing rules partially expressed by more than one support vector. On the other hand, the precision in capturing complete patterns, i.e., those regarding a complete tree, intuitively decreases.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5 Related Work and Conclusions</title>
      <p>This work was inspired by Collins and Duffy
(2002) and Collins and Koo (2005), who explored
discriminative approaches for ranking problems.
Their studies were limited to WSJ, though, and
did not explore the use of max-margin classifiers,
i.e., SVMs. The first experiments with SVMs and
TKs were conducted by Shen and Joshi (2003),
who proposed a new SVM-based voting algorithm
making use of preference reranking.</p>
      <p>
        In this paper, we adapted the Charniak parser
to Italian, gaining an improvement of 1.07 absolute points over
the Berkeley model (indicated by EvalIta as the
state of the art for Italian). Then, our TK-based
reranker raised the overall improvement to more than 2 absolute
points (86.81 vs. 84.54). It should also be noted that our best
reranking result is 3.54 absolute points better than
the best outcome reported in
        <xref ref-type="bibr" rid="ref4">(Bosco et al., 2013)</xref>
        ,
i.e., 83.27.
      </p>
      <p>
        In the future, we would like to (i) integrate the
features developed in the reranking software
made available by Johnson and Ural (2010) into our model to
further improve it, and (ii) generalize lexical
features (e.g., embeddings, Brown clusters) and
include similarity measures in PTK, i.e., SPTK
        <xref ref-type="bibr" rid="ref9">(Croce et al., 2011)</xref>
        .
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>Special thanks are due to Alberto Lavelli and Alessandro Mazzei for enabling us to carry out an exact comparison with their parser.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
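          [Bosco and Mazzei2011]
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Mazzei</surname>
          </string-name>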
          .
          <year>2011</year>
          .
          <article-title>The evalita 2011 parsing task: the constituency track</article-title>
          .
          <source>Working Notes of EVALITA.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Bosco et al.2007]
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          , Alessandro Mazzei, and
          <string-name>
            <given-names>Vincenzo</given-names>
            <surname>Lombardo</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Evalita parsing task: an analysis of the first parsing system contest for italian</article-title>
          .
          <source>Intelligenza artificiale</source>
          ,
          <volume>12</volume>
          :
          <fpage>30</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Bosco et al.2009]
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          , Alessandro Mazzei, and
          <string-name>
            <given-names>Vincenzo</given-names>
            <surname>Lombardo</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Evalita09 parsing task: constituency parsers and the penn format for italian</article-title>
          .
          <source>Proceedings of EVALITA</source>
          ,
          <volume>9</volume>
          :
          <fpage>1794</fpage>
          -
          <lpage>1801</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Bosco et al.2013]
          <string-name>
            <given-names>Cristina</given-names>
            <surname>Bosco</surname>
          </string-name>
          , Alessandro Mazzei, and
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Lavelli</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Looking back to the evalita constituency parsing task: 2007-2011</article-title>
          .
          <source>In Evaluation of Natural Language and Speech Tools for Italian</source>
          , pages
          <fpage>46</fpage>
          -
          <lpage>57</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
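          [Charniak and Johnson2005]
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Charniak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mark</given-names>
            <surname>Johnson</surname>
          </string-name>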
          .
          <year>2005</year>
          .
          <article-title>Coarse-to-fine n-best parsing and maxent discriminative reranking</article-title>
          .
          <source>In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics</source>
          , pages
          <fpage>173</fpage>
          -
          <lpage>180</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Charniak2000]
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Charniak</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>A maximum-entropy-inspired parser</article-title>
          .
          <source>In Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference</source>
          , pages
          <fpage>132</fpage>
          -
          <lpage>139</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Collins and Duffy2002] Michael Collins and
          <string-name>
            <given-names>Nigel</given-names>
            <surname>Duffy</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron</article-title>
          .
          <source>In Proceedings of the 40th annual meeting on association for computational linguistics</source>
          , pages
          <fpage>263</fpage>
          -
          <lpage>270</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Collins and Koo2005] Michael Collins and
          <string-name>
            <given-names>Terry</given-names>
            <surname>Koo</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Discriminative reranking for natural language parsing</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>31</volume>
          (
          <issue>1</issue>
          ):
          <fpage>25</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Croce et al.2011]
          <string-name>
            <given-names>Danilo</given-names>
            <surname>Croce</surname>
          </string-name>
          , Alessandro Moschitti, and
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Basili</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Structured lexical similarity via convolution kernels on dependency trees</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>1034</fpage>
          -
          <lpage>1046</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
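          [Johnson and Ural2010]
          <string-name>
            <given-names>Mark</given-names>
            <surname>Johnson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ahmet Engin</given-names>
            <surname>Ural</surname>
          </string-name>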
          .
          <year>2010</year>
          .
          <article-title>Reranking the berkeley and brown parsers</article-title>
          .
          <source>In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics</source>
          , pages
          <fpage>665</fpage>
          -
          <lpage>668</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
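          [Lavelli and Corazza2009]
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Lavelli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anna</given-names>
            <surname>Corazza</surname>
          </string-name>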
          .
          <year>2009</year>
          .
          <article-title>The berkeley parser at the evalita 2009 constituency parsing task</article-title>
          .
          <source>In EVALITA 2009 Workshop on Evaluation of NLP Tools for Italian.</source>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Lavelli2011]
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Lavelli</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>The berkeley parser at the evalita 2011 constituency parsing task</article-title>
          .
          <source>In Working Notes of EVALITA.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Marcus et al.1993] Mitchell P Marcus, Mary Ann Marcinkiewicz, and
          <string-name>
            <given-names>Beatrice</given-names>
            <surname>Santorini</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Building a large annotated corpus of english: The penn treebank</article-title>
          .
          <source>Computational linguistics</source>
          ,
          <volume>19</volume>
          (
          <issue>2</issue>
          ):
          <fpage>313</fpage>
          -
          <lpage>330</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [McClosky et al.2006]
          <string-name>
            <given-names>David</given-names>
            <surname>McClosky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Charniak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mark</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Effective self-training for parsing</article-title>
          .
          <source>In Proceedings of the main conference on human language technology conference of the North American Chapter of the Association of Computational Linguistics</source>
          , pages
          <fpage>152</fpage>
          -
          <lpage>159</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Moschitti2006]
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Moschitti</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Efficient convolution kernels for dependency and constituent syntactic trees</article-title>
          .
          <source>In European Conference on Machine Learning</source>
          , pages
          <fpage>318</fpage>
          -
          <lpage>329</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
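          [Petrov and Klein2007]
          <string-name>
            <given-names>Slav</given-names>
            <surname>Petrov</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Klein</surname>
          </string-name>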
          .
          <year>2007</year>
          .
          <article-title>Improved inference for unlexicalized parsing</article-title>
          .
          <source>In HLT-NAACL</source>
          , volume
          <volume>7</volume>
          , pages
          <fpage>404</fpage>
          -
          <lpage>411</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
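          [Shen and Joshi2003]
          <string-name>
            <given-names>Libin</given-names>
            <surname>Shen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Aravind K</given-names>
            <surname>Joshi</surname>
          </string-name>
          .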
          <year>2003</year>
          .
          <article-title>An svm based voting algorithm with application to parse reranking</article-title>
          .
          <source>In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume</source>
          <volume>4</volume>
          , pages
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>