<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neural Surface Realization for Italian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
<email>basile@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Mazzei</string-name>
          <email>mazzei@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Dipartimento di Informatica, Università degli Studi di Torino</institution>
          ,
          <addr-line>Corso Svizzera 185, 10153 Torino</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>We present an architecture based on neural networks to generate natural language from unordered dependency trees. The task is split into the two subproblems of word order prediction and morphology inflection, for which our architecture provides two independent models whose outputs are combined in a final phase. We test our model on a gold corpus (the Italian portion of the Universal Dependency treebanks) and on an automatically parsed corpus from the Web.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Natural Language Generation is the process of
producing natural language utterances from an
abstract representation of knowledge. As opposed to
Natural Language Understanding, where the input
is well-defined (typically a text or speech segment)
and the output may vary in terms of complexity
and scope of the analysis, in the generation process
the input can take different forms and levels of
abstraction, depending on the specific goals and
application scenarios. However, the input structures
for generation should be at least formally defined.</p>
      <p>In this work we focus on the final part of the
standard NLG pipeline defined by Reiter and Dale
(2000), that is, surface realization, the task of
producing natural language from formal abstract
representations of sentences’ meaning and syntax.</p>
      <p>We consider the surface realization of
unordered Universal Dependency (UD) trees, i.e.,
syntactic structures where the words of a sentence
are connected by labeled directed arcs in a
treelike fashion. The labels on the arcs indicate the
syntactic relation holding between each word and
its dependent words (Figure 1a). We approach
the surface realization task in a supervised
statistical setting. In particular, we draw inspiration
from Basile (2015) by dividing the task into the
two independent subtasks of word order
prediction and morphology inflection prediction. Two
neural network-based models run in parallel on the
same input structure, and their output is later
combined to produce the final surface form.</p>
      <p>
        A first version of the system implementing our
proposed architecture (called the DipInfo-UniTo
realizer) was submitted to the shallow track of the
Surface Realization Shared Task 2018
        <xref ref-type="bibr" rid="ref10">(Mille et al.,
2018)</xref>
          . The main research goal of this paper is to
provide a critical analysis for tuning the training
data and learning parameters of the DipInfo-UniTo
realizer.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Neural network-based Surface</title>
    </sec>
    <sec id="sec-3">
      <title>Realization</title>
      <p>In the following sections, we detail the two neural
networks employed to solve the subtasks of word
order prediction (2.1) and morphology inflection
(2.2), respectively.</p>
      <sec id="sec-3-1">
        <title>2.1 Word Ordering</title>
        <p>We reformulate the problem of sentence-wise
word ordering in terms of reordering the subtrees
of its syntactic structure. The algorithm is
composed of three steps: i) splitting the unordered tree
into single-level unordered subtrees; ii) predicting
the local word order for each subtree; iii)
recomposing the single-level ordered subtrees into a
single multi-level ordered tree to obtain the global
word order.</p>
        <p>In the first step, we split the original unordered
universal dependency multilevel tree into a
number of single-level unordered trees, where each
subtree is composed of a head (the root) and all
its dependents (the children), similarly to Bohnet
et al. (2012). An example is shown in Figure 1:
from the (unordered) tree representing the
sentence “Numerose sue opere contengono prodotti
chimici tossici.” (“Many of his works contain toxic
chemicals.”) (1a), each of its component
subtrees (limited to one level of dependency) is
considered separately (1b).</p>
        <p>[Figure 1: (a) the unordered dependency tree
corresponding to the Italian sentence “Numerose sue
opere contengono prodotti chimici tossici.”; (b) three
subtrees extracted from the main tree.]</p>
        <p>The head and the
dependents of each subtree form an unordered list of
lexical items. Crucially, we leverage the flat structure
of the subtrees in order to extract structures that
are suitable as input to the learning to rank
algorithm in the next step of the process.</p>
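        <p>As an illustration of step i), the following is a minimal Python sketch (not the DipInfo-UniTo code) of how an unordered dependency tree can be split into single-level subtrees; the Node structure is a hypothetical stand-in for the actual data format:</p>
        <preformat><![CDATA[
# Minimal sketch: split an unordered dependency tree into single-level
# subtrees, each formed by a head and its direct dependents.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    lemma: str
    deprel: str = "root"
    children: List["Node"] = field(default_factory=list)

def split_subtrees(root: Node) -> List[List[Node]]:
    """Return one unordered list [head, dep, ...] per internal node."""
    subtrees, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.children:
            subtrees.append([node] + node.children)
            stack.extend(node.children)
    return subtrees

# The tree of Figure 1a yields the three subtrees of Figure 1b.
opera = Node("opera", "nsubj", [Node("suo", "det:poss"), Node("numeroso", "amod")])
prodotto = Node("prodotto", "obj", [Node("chimico", "amod"), Node("tossico", "amod")])
root = Node("contenere", "root", [opera, prodotto, Node(".", "punct")])
assert len(split_subtrees(root)) == 3
]]></preformat>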
        <p>In the second step of the algorithm, we predict
the relative order of the head and the dependents
of each subtree with a learning to rank approach.
We employ the list-wise learning to rank algorithm
ListNet, proposed by Cao et al. (2007). The
relatively small size of the lists of items to rank
allows us to use a list-wise approach, as opposed to
pair-wise or point-wise approaches, while keeping
the computation times manageable. ListNet uses a
list-wise loss function based on top one
probability, i.e., the probability of an element being the
first one in the ranking. The top one probability
model approximates the permutation probability
model that assigns a probability to each possible
permutation of an ordered list. This
approximation is necessary to keep the problem tractable by
avoiding the exponential explosion of the number
of permutations. Formally, the top one probability
of an object j is defined as</p>
        <p>P_s(j) = Σ_{π ∈ Ω_n : π(1) = j} P_s(π)</p>
        <p>that is, the sum of the probabilities of all the
possible permutations of n objects (the set of such
permutations is denoted as Ω_n) where j is the
first element. s = (s_1, ..., s_n) is a
given list of scores, i.e., the positions of elements in
the list. Considering two permutations of the same
list y and z (for instance, the predicted order and
the reference order), their distance is computed
using cross entropy. The distance measure and the
top one probabilities of the list elements are used
in the loss function:</p>
        <p>L(y, z) = −Σ_{j=1}^{n} P_y(j) log(P_z(j))</p>
        <p>
The list-wise loss function is plugged into a
linear neural network model to provide a learning
environment. ListNet takes as input a sequence
of ordered lists of feature vectors (the features are
encoded as numeric vectors). The weights of the
network are iteratively adjusted by computing a
list-wise cost function that measures the distance
between the reference ranking and the prediction
of the model and passing its value to the gradient
descent algorithm for optimization of the
parameters.</p>
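        <p>For illustration, the following numpy sketch computes the top one probability (which, as shown by Cao et al. (2007), reduces to a softmax over the scores) and the resulting list-wise loss; it is a toy example, not the authors' implementation:</p>
        <preformat><![CDATA[
import numpy as np

def top_one_probability(scores: np.ndarray) -> np.ndarray:
    """Top one probability of each element: a softmax over the scores."""
    exp = np.exp(scores - scores.max())  # shift for numerical stability
    return exp / exp.sum()

def listnet_loss(reference: np.ndarray, predicted: np.ndarray) -> float:
    """Cross entropy between the two top one probability distributions."""
    p_y, p_z = top_one_probability(reference), top_one_probability(predicted)
    return float(-(p_y * np.log(p_z)).sum())

# The loss is lower when the predicted scores agree with the reference.
reference = np.array([3.0, 2.0, 1.0])
assert listnet_loss(reference, np.array([2.5, 1.0, 0.5])) < \
       listnet_loss(reference, np.array([0.5, 1.0, 2.5]))
]]></preformat>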
        <p>
          The choice of features for the supervised
learning to rank component is a critical point of our
solution. We use several word-level features
encoded as one-hot vectors, namely: the universal
POS-tag, the treebank specific POS tag, the
morphology features and the head-status of the word
(head of the single-level tree vs. leaf).
Furthermore, we included word representations,
differentiating between content words and function words:
for open-class word lemmas (content words) we
added the corresponding language-specific word
embedding to the feature vector, from the
pretrained multilingual model Polyglot
          <xref ref-type="bibr" rid="ref2">(Al-Rfou’ et
al., 2013)</xref>
          . Closed-class word lemmas (function
words) are encoded as one-hot bag-of-words
vectors. An implementation of the feature encoding
for the word ordering module of our architecture
is available online at https://github.com/alexmazzei/ud2ln.
        </p>
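        <p>A simplified sketch of this feature encoding is shown below; the tag inventories, dimensions, and the embedding table are hypothetical stand-ins for the ones used in the actual implementation:</p>
        <preformat><![CDATA[
import numpy as np

UPOS = ["ADJ", "ADP", "ADV", "AUX", "DET", "NOUN", "PRON", "PUNCT", "VERB"]
CLOSED_CLASS = {"ADP", "AUX", "DET", "PRON", "PUNCT"}
FUNCTION_LEMMAS = ["di", "il", "suo", "e", "."]  # closed-class lemma inventory
EMB_DIM = 64  # Polyglot embeddings are 64-dimensional

def one_hot(value, inventory):
    vec = np.zeros(len(inventory))
    if value in inventory:
        vec[inventory.index(value)] = 1.0
    return vec

def encode_word(lemma, upos, is_head, embeddings):
    """Concatenate one-hot POS, head status, and a lexical representation."""
    if upos in CLOSED_CLASS:
        lexical = one_hot(lemma, FUNCTION_LEMMAS)  # one-hot bag of words
    else:
        lexical = embeddings.get(lemma, np.zeros(EMB_DIM))  # word embedding
    # Pad the two kinds of lexical vector to a common width.
    width = max(len(FUNCTION_LEMMAS), EMB_DIM)
    lexical = np.pad(lexical, (0, width - len(lexical)))
    return np.concatenate([one_hot(upos, UPOS), [float(is_head)], lexical])

embeddings = {"opera": np.random.rand(EMB_DIM)}  # stand-in for Polyglot
print(encode_word("opera", "NOUN", True, embeddings).shape)
]]></preformat>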
        <p>In the third step of the word ordering algorithm,
we reconstruct the global (i.e., sentence-level)
order from the local order of the one-level trees
under the hypothesis of projectivity; as a consequence
of this design choice, the DipInfo-UniTo realizer
cannot predict the correct word order for
non-projective sentences. See Basile
and Mazzei (2018) for details on this step.</p>
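        <p>A minimal sketch of this recomposition step follows: under the projectivity hypothesis, every head is recursively expanded into its locally ordered subtree. The ordered map (from head to locally ordered head-plus-dependents list) is a hypothetical input shape:</p>
        <preformat><![CDATA[
def linearize(node, ordered):
    """Depth-first expansion of locally ordered subtrees into a word sequence."""
    words = []
    for item in ordered.get(node, [node]):
        if item == node:
            words.append(node)                       # head keeps its slot
        else:
            words.extend(linearize(item, ordered))   # expand each dependent
    return words

# Toy example with the locally ordered subtrees of Figure 1b:
ordered = {
    "contenere": ["opera", "contenere", "prodotto", "."],
    "opera": ["numeroso", "suo", "opera"],
    "prodotto": ["prodotto", "chimico", "tossico"],
}
print(" ".join(linearize("contenere", ordered)))
# -> numeroso suo opera contenere prodotto chimico tossico .
]]></preformat>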
      </sec>
      <sec id="sec-3-2">
        <title>2.2 Morphology Inflection</title>
        <p>The second component of our architecture is
responsible for the morphology inflection. The task
is formulated as an alignment problem between
characters that can be modeled with the sequence
to sequence paradigm. We use a deep neural
network architecture based on a hard attention
mechanism. The model has been recently introduced by
Aharoni and Goldberg (2017). The model consists
of a neural network in an encoder-decoder setting.
However, at each step, the model
can either write a symbol to the output sequence,
or move the attention pointer to the next position of
the input sequence. This mechanism is meant to model
the natural monotonic alignment between the
input and output sequences, while allowing the
freedom to condition the output on the entire input
sequence.</p>
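        <p>The toy sketch below illustrates only the control structure of hard monotonic attention (write a symbol, or step the pointer); the hand-written policy is a hypothetical stand-in for the trained network:</p>
        <preformat><![CDATA[
STEP = "<step>"

def decode(lemma, predict_action, max_len=50):
    """Greedy decoding: at each step, either write a character or advance."""
    pointer, output = 0, []
    while len(output) < max_len:
        attended = lemma[pointer] if pointer < len(lemma) else "<eos>"
        action = predict_action(attended, pointer, output)
        if action == STEP:
            pointer += 1             # move attention to the next input char
            if pointer > len(lemma):
                break                # input exhausted
        else:
            output.append(action)    # write one character of the form
    return "".join(output)

def toy_policy(attended, pointer, output):
    """Hand-written rule that pluralizes artificiale -> artificiali."""
    if len(output) > pointer:
        return STEP                  # already wrote for this position
    if attended == "e" and pointer == len("artificiale") - 1:
        return "i"                   # replace the final -e with -i
    return attended if attended != "<eos>" else STEP

print(decode("artificiale", toy_policy))  # -> artificiali
]]></preformat>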
        <p>We employ all the morphological features
provided by the UD annotation and the dependency
relation binding the word to its head. That is, we
transform the training files into a set of
structures ((lemma, features), form) in order to
learn the neural inflectional model associating a
(lemma, features) pair to the corresponding form.
An example of a training instance for our
morphology inflection module is the following:</p>
        <preformat>lemma: artificiale
features: uPoS=ADJ xPoS=A rel=amod Number=Plur
form: artificiali</preformat>
        <p>This corresponds to the word form artificiali, an
inflected form (plural) of the lemma artificiale
(“artificial”).</p>
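        <p>A minimal sketch of this transformation, assuming the standard ten-column CoNLL-U layout, is the following:</p>
        <preformat><![CDATA[
def conllu_to_instances(lines):
    """Turn CoNLL-U token lines into ((lemma, features), form) instances."""
    instances = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                       # skip blank and comment lines
        cols = line.rstrip("\n").split("\t")
        tok_id, form, lemma, upos, xpos, feats, _, deprel = cols[:8]
        if "-" in tok_id or "." in tok_id:
            continue                       # skip multiword tokens, empty nodes
        features = [f"uPoS={upos}", f"xPoS={xpos}", f"rel={deprel}"]
        if feats != "_":
            features.extend(feats.split("|"))  # e.g. Number=Plur
        instances.append(((lemma, features), form))
    return instances

row = "4\tartificiali\tartificiale\tADJ\tA\tNumber=Plur\t3\tamod\t_\t_"
print(conllu_to_instances([row]))
# -> [(('artificiale', ['uPoS=ADJ', 'xPoS=A', 'rel=amod', 'Number=Plur']),
#      'artificiali')]
]]></preformat>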
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3 Evaluation</title>
      <p>In this section, we present an evaluation of the
models presented in Section 2, with particular
consideration for two crucial points influencing
the performance of the DipInfo-UniTo realizer,
namely the training data and the learning parameter
settings. In Basile and Mazzei (2018), hardware
limitations did not allow for extensive
experimentation dedicated to the optimization of
the realizer performance. In this paper, we aim
to bridge this gap by experimenting with higher
computing capabilities, specifically a virtualized
GNU/Linux box with 16 cores and 64GB of RAM.</p>
      <sec id="sec-4-1">
        <title>3.1 Training Data</title>
        <p>
          For our experiments, we used the four Italian
corpora annotated with Universal Dependencies
available on the Universal Dependency
repositories (http://universaldependencies.org/).
In total, they comprise 270,703 tokens and
12,838 sentences. We previously used this
corpus to train the DipInfo-UniTo
realizer that participated in the SRST18 competition
          <xref ref-type="bibr" rid="ref5">(Basile and Mazzei, 2018)</xref>
          . We refer to this corpus
as Gold-SRST18 henceforth.
        </p>
        <p>
          Moreover, we used a larger corpus extracted
from ItWaC, a large unannotated corpus of
Italian
          <xref ref-type="bibr" rid="ref4">(Baroni et al., 2009)</xref>
          . We parsed ItWaC with
UDPipe
          <xref ref-type="bibr" rid="ref12">(Straka and Straková, 2017)</xref>
          , and selected
a random sample of 9,427 sentences (274,115
tokens). We refer to this corpus as Silver-WaC
henceforth.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2 Word Ordering Performances</title>
        <p>We trained the word order prediction module of
our system on the Gold-SRST18 corpus as well
as on the larger corpus created by concatenating
Gold-SRST18 and Silver-WaC. Our implementation of
ListNet, featuring a regularization parameter to
prevent overfitting, is available at
https://github.com/valeriobasile/listnet.</p>
        <p>
          The performance of the ListNet algorithm for
word ordering is given in terms of average
Kendall's τ
          <xref ref-type="bibr" rid="ref9">(Kendall, 1938)</xref>
          , a measure of
rank correlation used to give a score to each of the
rankings predicted by our model for every subtree
(Figure 2). τ measures the similarity between two
rankings by counting how many pairs of elements
are swapped with respect to the original ordering
out of all possible pairs of n elements:
        </p>
        <p>τ = (#concordant pairs − #discordant pairs) / (½ n(n − 1))</p>
        <p>Therefore, τ ranges from −1 to 1.</p>
        <p>[Figure 2: accuracy (average Kendall's τ) over
training epochs (0-100) for the Gold-SRST18 training
set (LR=0.00005, LR=0.000005, LR=0.0000005) and the
Gold-SRST18+Silver-WaC training set (LR=0.000005).]</p>
        <p>In Figure 2 we report the τ values obtained
at various epochs of learning for both the
Gold-SRST18 and Gold-SRST18+Silver-WaC corpora.
In particular, in order to investigate the influence
of the learning rate parameter (LR) in the learning
of the ListNet model, we report the trends for
LR = 5·10⁻⁵ (the value originally used for the
official SRST18 submission), LR = 5·10⁻⁶, and
LR = 5·10⁻⁷. It is quite clear that the value of
LR has a great impact on the performance of the
word ordering, and that LR = 5·10⁻⁵ is not
appropriate to reach the best performance. This
explains the poor performance of the DipInfo-UniTo
realizer in the SRST18 competition (Table 1).
Indeed, the typical zigzag shape of the curve
suggests a sort of loop in the gradient learning
algorithm. In contrast, LR = 5·10⁻⁶ seems to
reach a plateau value after the 100th epoch with
both corpora used in the experiments. We used the
system tuned with this value of the learning rate to
evaluate the global performance of the realizer.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3 Morphology Inflection Performances</title>
        <p>In order to understand the impact of the
Silver-WaC corpus on the global performance of the
system, we trained the DNN system for morphology
inflection both on the Gold-SRST18 corpus and
on the larger corpus composed of Gold-SRST18+
Silver-WaC (an implementation of the model by
Aharoni and Goldberg (2017) is freely available at
https://github.com/roeeaharoni/morphological-reinflection).
In Figure 3 we report the accuracy
on the SRST18 development set for both
corpora. A first analysis of the trend shows little
improvement to the global performance of the
realization from the inclusion of additional data (see
the discussion in the next section).</p>
      </sec>
      <sec id="sec-4-3">
        <title>Global Surface Realization Performances</title>
        <p>
          Finally, we evaluate the end-to-end performance
of our systems by combining the output of the two
modules and submitting it to the evaluation scorer
of the Surface Realization Shared Task. In
Table 1 we report the performance of various tested
systems with respect to the BLEU-4, DIST, and NIST
measures, as defined by Mille et al. (2018). The
first line reports the official performance of the
DipInfo-UniTo realizer in the SRST18 for
Italian. The last line reports the best performances
achieved on Italian by the participants in SRST18
          <xref ref-type="bibr" rid="ref10">(Mille et al., 2018)</xref>
          . The other lines report the
performance of the DipInfo-UniTo realizer,
considering various combinations of the gold and silver
corpora.
        </p>
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption>
            <p>End-to-end performance for different combinations of training data for the two modules (Gsrst: official SRST18 submission; G: Gold-SRST18; G+S: Gold-SRST18+Silver-WaC). The last line reports the best SRST18 results for Italian.</p>
          </caption>
          <table>
            <thead>
              <tr><th>ListNet</th><th>Morpho</th><th>BLEU-4</th><th>DIST</th><th>NIST</th></tr>
            </thead>
            <tbody>
              <tr><td>Gsrst</td><td>Gsrst</td><td>24.61</td><td>36.11</td><td>8.25</td></tr>
              <tr><td>G</td><td>G</td><td>36.40</td><td>32.80</td><td>9.27</td></tr>
              <tr><td>G</td><td>G+S</td><td>36.60</td><td>32.70</td><td>9.30</td></tr>
              <tr><td>G+S</td><td>G</td><td>36.40</td><td>32.80</td><td>9.27</td></tr>
              <tr><td>G+S</td><td>G+S</td><td>36.60</td><td>32.70</td><td>9.30</td></tr>
              <tr><td>Best SRST18</td><td></td><td>44.16</td><td>58.61</td><td>9.11</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The results show a clear improvement
for the word order module (note that the DIST
metric is character-based, and is therefore more
sensitive to morphological variation than NIST
and BLEU-4). In contrast, the performance of the
morphology submodule seems to be unaffected by
the use of a larger training corpus. This effect
could be due to different causes. Errors are present
in the silver standard training set, and it is not clear
to what extent the morphological analysis is correct
with respect to the syntactic analysis. The other
possible cause is the neural model itself. Indeed,
Aharoni and Goldberg (2017) report a plateau in
performance when feeding their model with relatively
small datasets. The DipInfo-UniTo realizer performs
better than the best systems of the SRST18
challenge on one out of three metrics (NIST).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Conclusion and Future Work</title>
      <p>In this paper, we considered the problem of
analysing the impact of the training data and
parameters tuning on the (modular and global)
performance of the DipInfo-UniTo realizer. We
showed experimentally that the DipInfo-UniTo
realizer can give competitive results (i) by
augmenting the training data set with automatically
annotated sentences, and (ii) by tuning the learning
parameters of the neural models.</p>
      <p>
        In future work, we intend to address the main
limitation of our approach, that is, the inability to
realize non-projective sentences. Moreover, further
optimization of both neural models will be carried
out on a new high-performance architecture
        <xref ref-type="bibr" rid="ref3">(Aldinucci et al., 2018)</xref>
        , by executing a systematic grid
search over the hyperparameter space, namely the
regularization factor and weight initialization for
ListNet, and the specific DNN hyperparameters
for the morphology module.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We thank the GARR consortium, which kindly
allowed us to use the GARR Cloud Platform to run
some of the experiments described in this paper.
Valerio Basile was partially funded by Progetto di
Ateneo/CSP 2016 (Immigrants, Hate and
Prejudice in Social Media, S1618 L2 BOSC 01).
Alessandro Mazzei was partially supported by the
HPC4AI project, funded by the Region Piedmont
POR-FESR 2014-20 programme (INFRA-P call).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Roee</given-names>
            <surname>Aharoni</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yoav</given-names>
            <surname>Goldberg</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Morphological inflection generation with hard monotonic attention</article-title>
          .
          <source>In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017</source>
          , pages
          <fpage>2004</fpage>
          -
          <lpage>2015</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Rami</given-names>
            <surname>Al-Rfou’</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bryan</given-names>
            <surname>Perozzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Steven</given-names>
            <surname>Skiena</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Polyglot: Distributed word representations for multilingual nlp</article-title>
          .
          <source>In CoNLL</source>
          , pages
          <fpage>183</fpage>
          -
          <lpage>192</lpage>
          . ACL.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Aldinucci</surname>
          </string-name>
          , Sergio Rabellino, Marco Pironti, Filippo Spiga, Paolo Viviani, Maurizio Drocco, Marco Guerzoni, Guido Boella, Marco Mellia, Paolo Margara, Idillio Drago, Roberto Marturano, Guido Marchetto, Elio Piccolo, Stefano Bagnasco, Stefano Lusso, Sara Vallero, Giuseppe Attardi, Alex Barchiesi, Alberto Colla, and
          <string-name>
            <given-names>Fulvio</given-names>
            <surname>Galeazzi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Hpc4ai, an ai-on-demand federated platform endeavour</article-title>
          .
          <source>In ACM Computing Frontiers</source>
          , Ischia, Italy, May.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Baroni</surname>
          </string-name>
          , Silvia Bernardini, Adriano Ferraresi, and
          <string-name>
            <given-names>Eros</given-names>
            <surname>Zanchetta</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The wacky wide web: a collection of very large linguistically processed web-crawled corpora</article-title>
          .
          <source>Language Resources and Evaluation</source>
          ,
          <volume>43</volume>
          (
          <issue>3</issue>
          ):
          <fpage>209</fpage>
          -
          <lpage>226</lpage>
          , September.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Mazzei</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The dipinfo-unito system for srst 2018</article-title>
          .
          <source>In Proceedings of the First Workshop on Multilingual Surface Realisation</source>
          , pages
          <fpage>65</fpage>
          -
          <lpage>71</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Valerio</given-names>
            <surname>Basile</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>From Logic to Language : Natural Language Generation from Logical Forms</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Groningen, Netherlands.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Bernd</given-names>
            <surname>Bohnet</surname>
          </string-name>
          , Anders Björkelund, Jonas Kuhn, Wolfgang Seeker, and
          <string-name>
            <given-names>Sina</given-names>
            <surname>Zarrieß</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Generating non-projective word order in statistical linearization</article-title>
          .
          <source>In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning</source>
          , pages
          <fpage>928</fpage>
          -
          <lpage>939</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Zhe</given-names>
            <surname>Cao</surname>
          </string-name>
          , Tao Qin,
          <string-name>
            <given-names>Tie-Yan</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Feng</given-names>
            <surname>Tsai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Hang</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Learning to rank: From pairwise approach to listwise approach</article-title>
          .
          <source>In Proceedings of the 24th International Conference on Machine Learning, ICML '07</source>
          , pages
          <fpage>129</fpage>
          -
          <lpage>136</lpage>
          , New York, NY, USA. ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Kendall</surname>
          </string-name>
          .
          <year>1938</year>
          .
          <article-title>A new measure of rank correlation</article-title>
          .
          <source>Biometrika</source>
          ,
          <volume>30</volume>
          (
          <issue>1</issue>
          /2):
          <fpage>81</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Simon</given-names>
            <surname>Mille</surname>
          </string-name>
          , Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, and
          <string-name>
            <given-names>Leo</given-names>
            <surname>Wanner</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The first multilingual surface realisation shared task (sr'18): Overview and evaluation results</article-title>
          .
          <source>In Proceedings of the First Workshop on Multilingual Surface Realisation</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Ehud</given-names>
            <surname>Reiter</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert</given-names>
            <surname>Dale</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Building Natural Language Generation Systems</article-title>
          . Cambridge University Press, New York, NY, USA.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Milan</given-names>
            <surname>Straka</surname>
          </string-name>
          and Jana Straková.
          <year>2017</year>
          .
          <article-title>Tokenizing, pos tagging, lemmatizing and parsing ud 2.0 with udpipe</article-title>
          .
          <source>In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies</source>
          , pages
          <fpage>88</fpage>
          -
          <lpage>99</lpage>
          , Vancouver, Canada, August. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>