<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>From Tokens to Trees: Mapping Syntactic Structures in the Deserts of Data-Scarce Languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Vilares</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Muñoz-Ortiz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidade da Coruña, CITIC, Departamento de Ciencias de la Computación y Tecnologías de la Información</institution>
          ,
          <addr-line>Campus de Elviña s/n, 15071, A Coruña</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Low-resource learning in natural language processing focuses on developing effective resources, tools, and technologies for languages that are less popular within industry and academia. This effort is crucial for several reasons, including ensuring that as many languages as possible are represented digitally, and enhancing access to language technologies for native speakers of minority languages. In this context, this paper outlines the motivation, research lines, and results from a Leonardo Grant - awarded by the FBBVA - on low-resource languages and parsing as sequence labeling. The project's primary aim was to devise fast and accurate methods for low-resource syntactic parsing and to examine evaluation strategies, as well as strengths and weaknesses in comparison to alternative parsing strategies.</p>
      </abstract>
      <kwd-group>
<kwd>low-resource learning</kwd>
        <kwd>natural language processing</kwd>
        <kwd>parsing</kwd>
        <kwd>cross-lingual learning</kwd>
        <kwd>multilinguality</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>SEPLN-CEDI-PD 2024: Seminar of the Spanish Society for Natural Language Processing: Projects and Systems Demonstrations, June 19-20, 2024, A Coruña, Spain.</p>
      <p>david.vilares@udc.es (D. Vilares); alberto.munoz.ortiz@udc.es (A. Muñoz-Ortiz)</p>
      <p>https://www.grupolys.org/~david.vilares/ (D. Vilares); https://amunozo.github.io/ (A. Muñoz-Ortiz)</p>
      <p>ORCID: 0000-0002-1295-3840 (D. Vilares); 0000-0001-9608-2730 (A. Muñoz-Ortiz)</p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.</p>
<p>1 https://www.redleonardo.es/</p>
      <p>[...] monitoring social networks but considering only Indo-European languages), among others.</p>
      <p>The problem. While syntactic parsers excel with high-resource languages, they encounter significant challenges with low-resource ones. The ability to analyze sentence structure is crucial for NLP tools, including the development of applications like automatic translation, question answering, and text summarization. In other cases, the desired output is the structure itself, as is often the case for computational linguists (for instance, because they want to study languages) or when the final output is a tree or graph that aids in understanding the meaning of the utterance (e.g., relationships between symptoms, diseases, and cures in clinical reports).</p>
      <p>The approach. From a linguistic point of view, the 7,000 languages spoken in the world are organized into about 140 families. For example, Spanish, French, Galician, and Catalan are all Indo-European languages, while Turkish, Uzbek, Kazakh, and Uyghur are Turkic languages. Moreover, many of these resource-scarce languages are closely related to another language with a multitude of speakers and resources available (e.g., Galician-Spanish or Uyghur-Turkish), sharing not only linguistic typology (e.g., word order or vocabulary formation) but also syntactic structures. Just as it is easier for a person to create grammatical sentences in a new language if they already know another language with similar characteristics (e.g., for a Spanish speaker, Galician would be easier than Uyghur, and the opposite would be true for a Turkish speaker), in NLP it is also a common approach to exploit related languages, especially by using rich-resource languages to help model less-resourced ones. This is an angle that we considered throughout the project to model the syntactic structure of low-resource languages. In addition, recent studies in cognitive science suggest that humans might use the same brain regions for lexical, syntactic, and semantic processing of sentences, and that this processing is carried out according to a sequence-labeling-like process [<xref ref-type="bibr" rid="ref2">2</xref>]. The underlying idea is that the brain processes sentences as a flat sequence, whose representation is dynamically updated without the need for creating complex hierarchical abstractions of the sentence to represent its syntactic structure. Recent studies have shown that it is possible to emulate this behavior in NLP using deep learning techniques and sequence labeling models, with the great added advantage of their speed, making their use in real environments possible, unlike other syntactic analysis paradigms. However, there was little research on sequence labeling models for low-resource languages and the challenges that building them poses. This was the gap that this project aimed to fill.</p>
      <p>The evaluation. Throughout the project, we emphasized the importance of evaluating a wide variety of languages, encompassing diverse linguistic families, typologies, and alphabets. This strategy was adopted to ensure our results were more robust and generalizable. To do so, we mostly relied on the Universal Dependencies [<xref ref-type="bibr" rid="ref3">3</xref>], a collection of treebanks2, which contains syntactic annotations for more than 100 languages from different language families and alphabets.</p>
      <p>The novelty. From a technical standpoint, this project was both original and innovative, as it combined artificial intelligence and natural language processing with recent cognitive theories on how humans comprehend language structure. The approach aimed to develop new NLP models capable of swiftly and accurately obtaining the syntactic structure of sentences written in languages with a scarcity of resources. In this regard, research on languages with limited resources is recognized by the international NLP community as one of the major unsolved challenges. Several authors have made significant contributions in recent years in areas such as machine translation [<xref ref-type="bibr" rid="ref4">4</xref>], morphological analysis [<xref ref-type="bibr" rid="ref5">5</xref>], and syntactic analysis [6]. Thematically, the project addressed various concerns of contemporary society, including the development of technologies that contribute to the preservation of knowledge expressed in different languages and ensuring democratic access to artificial intelligence technologies.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The project explored three lines of work. The first focused on data collection for experiments, including training initial sequence labeling baselines, and it examined the impact of annotated data volume on model quality. Furthermore, it set up baseline models based on traditional dependency parsing paradigms, using both graph-based and transition-based strategies. This aimed to better understand the models and to compare our results with these typically slower, but more accurate, strategies. The second line of work concentrated on leveraging distant and auxiliary data to enhance the performance of the baseline models and to comprehend how neural networks perceive the structure of languages. The third line of work explored data augmentation methods for low-resource languages and dependency parsers. The second and third lines of work were partially dependent on the first one, but could be developed independently from each other. We now briefly summarize them before moving on to the project results.</p>
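<p>To make the parsing-as-sequence-labeling idea concrete, the following is a minimal, self-contained sketch (not the project's actual implementation): a toy dependency tree is linearized into one discrete label per word, here using relative head offsets, one of several possible encodings.</p>

```python
# Minimal sketch: dependency parsing cast as sequence labeling.
# Each word receives one label; here the label is the relative
# offset to its head plus the dependency relation (one common
# linearization; the project explored several).

def encode_relative(heads, rels):
    """heads[i] is the 1-based head index of word i+1 (0 = root)."""
    labels = []
    for i, (h, r) in enumerate(zip(heads, rels), start=1):
        offset = h - i  # relative position of the head
        labels.append(f"{offset:+d}@{r}")
    return labels

def decode_relative(labels):
    heads, rels = [], []
    for i, lab in enumerate(labels, start=1):
        off, rel = lab.split("@")
        heads.append(i + int(off))
        rels.append(rel)
    return heads, rels

# Toy tree for "The cat sleeps": The -> cat, cat -> sleeps, sleeps = root.
heads = [2, 3, 0]
rels = ["det", "nsubj", "root"]
labels = encode_relative(heads, rels)
print(labels)  # ['+1@det', '+1@nsubj', '-3@root']
assert decode_relative(labels) == (heads, rels)
```

<p>Decoding simply inverts the offset arithmetic, which is what lets these parsers run at the speed of a tagger.</p>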
<sec id="sec-1-1">
        <p>Research line 1 - Compilation, analysis of syntactic typology, creation of baseline models, and impact of annotated data. This line focused on: (1) collecting representative data, (2) training the initial models, and (3) exploring the impact of the amount of annotated data on sequence labeling models, depending on the chosen parsing linearization. Specifically:</p>
        <p>1. The first goal was to identify treebanks for numerous languages in collections such as the Universal Dependencies [<xref ref-type="bibr" rid="ref3">3</xref>], pinpointing both low-resource and rich-resource languages of interest for the project. The focus was on identifying languages that share substantial syntactic proximity, evaluated according to various linguistic criteria including alphabet, word order, language family, or typology, among others. To achieve this, the approach involved using automated techniques to estimate such proximity, leveraging publicly available resources like the World Atlas of Language Structures [7] and URIEL [8]. Among the treebanks studied during the project, we included several rich-resource languages - such as English, German, Portuguese, Russian, Classical Chinese, Korean, and Japanese - and low-resource languages - such as Galician, Basque, Telugu, Marathi, Lithuanian, Faroese, Afrikaans, and Wolof.</p>
        <p>2. The second goal was to develop, train, and assess base syntactic models across the chosen languages. The first step involved training sequence labeling models for both low-resource and rich-resource languages separately. This step was crucial for gathering preliminary experimental results and for having a baseline framework against which to evaluate models in the next phases. Additionally, this step was useful for preparing the high-resource models aimed at transferring syntactic knowledge in later stages of the project, for instance through zero-shot and few-shot setups.</p>
        <p>3. The third goal of this line was to examine the performance of different linearizations for sequence labeling parsing on low-resource languages. At the project's outset, various linearizations of dependency trees were available for training sequence labeling models, i.e., different strategies to create a sequence of labels that could be decoded into a dependency tree, and some others were created during the project.3 However, it was unclear whether some linearizations could be used more effectively with the same data volume. To study this, we trained sequence labeling parsers on various languages to determine whether such linearizations were equally data-hungry or not, and whether rich-resource and low-resource languages showed similar patterns.</p>
        <p>2 This is usually the name given to a dataset with syntactic annotations.</p>
        <p>3 For the details about the tested linearizations, we recommend reading [<xref ref-type="bibr" rid="ref1">1, 9, 10</xref>].</p>
      </sec>
      <sec id="sec-1-2">
        <p>Research line 2 - Auxiliary data and use of pre-trained models. This line focused on the use of distant learning, such as reliance on parsers first trained for rich-resource languages, encoders pre-trained on masked language modeling, and auxiliary data, such as part-of-speech tags, and examined their impact on the performance of sequence labeling parsers for low-resource languages and domains:</p>
      </sec>
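<p>Throughout these research lines, treebanks are consumed in the CoNLL-U format used by Universal Dependencies. The following is an illustrative, minimal reader (toy sentence; only the columns a sequence labeling parser needs), not a full CoNLL-U implementation:</p>

```python
# Minimal sketch of reading a Universal Dependencies CoNLL-U file,
# keeping the columns a sequence labeling parser needs
# (FORM, UPOS, HEAD, DEPREL). The sample sentence is a toy example.

def read_conllu(text):
    sentences, tokens = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if tokens:
                sentences.append(tokens)
                tokens = []
            continue
        if line.startswith("#"):  # sentence-level comments
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:  # skip multiword/empty tokens
            continue
        tokens.append({"form": cols[1], "upos": cols[3],
                       "head": int(cols[6]), "deprel": cols[7]})
    if tokens:
        sentences.append(tokens)
    return sentences

sample = """# text = O gato dorme
1\tO\to\tDET\t_\t_\t2\tdet\t_\t_
2\tgato\tgato\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tdorme\tdormir\tVERB\t_\t_\t0\troot\t_\t_
"""
sents = read_conllu(sample)
print(len(sents), [t["form"] for t in sents[0]])  # 1 ['O', 'gato', 'dorme']
```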
<sec id="sec-1-3">
        <p>1. The first goal involved using sequence labeling models first trained on rich-resource languages. These models were then fine-tuned in a second phase on low-resource languages. We applied this strategy in both zero-shot and few-shot setups. The zero-shot setup operates under the assumption that there is no available data for the low-resource language. However, we expect that a related rich-resource language can still help obtain meaningful outputs for the low-resource languages. The few-shot setup, on the other hand, assumes that some data is available. This data is used to continue fine-tuning the model initially pre-trained on the rich-resource language. Alternatively, under the few-shot setup, this phase also involved training the model in a single phase by merging low-resource training data with data from a related rich-resource language.</p>
        <p>2. The second goal aimed to use related or distant tasks that provide useful information about the syntactic structure of the languages, to assess their impact on sequence labeling models for low-resource languages. On one hand, the first task involved leveraging morphological information for sequence labeling parsers in both low- and rich-resource languages. On the other hand, we explored the use of language models as encoders for sequence labeling tasks. This involved directly mapping vector representations into a sequence of labels to reconstruct the tree, and analyzing performance on data-scarce languages. The hypothesis was that during the pre-training phase, the language model would learn to encode useful information about the syntactic structure of seen languages in its latent representational space.</p>
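<p>The single-phase few-shot setup described in the first goal can be sketched as follows; the data items, the oversampling factor, and the helper name are illustrative choices, not the project's exact recipe:</p>

```python
# Sketch of the single-phase few-shot setup: merge the small
# low-resource treebank with data from a related rich-resource
# language, oversampling the former so it is not drowned out.
import random

def merge_training_data(low_res, rich_res, oversample=4, seed=0):
    merged = list(rich_res) + list(low_res) * oversample
    random.Random(seed).shuffle(merged)  # fixed seed for reproducibility
    return merged

galician = ["sent_gl_1", "sent_gl_2"]          # scarce target data
spanish = [f"sent_es_{i}" for i in range(20)]  # related rich-resource data
train = merge_training_data(galician, spanish)
print(len(train))  # 28
```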
<p>Research line 3 - Data augmentation techniques for low-resource dependency parsing. This research line explored methods for generating synthetic data to train dependency parsers for languages that suffer from a scarcity of resources. Initially, we considered various strategies, including techniques such as cropping and rotating, as well as semi-automatically annotating sentences. Finally, we focused our efforts on adapting syntactic resources annotated in a rich-resource language to a low-resource language, treating the task as a word-level translation problem that takes into account morphological information to maintain annotations across languages. We found this strategy adequate for the purpose of the project, as it offers explicit properties that should facilitate the transfer of language structure from resource-rich languages to related, less-resourced ones.</p>
      </sec>
      <sec id="sec-4">
        <title>4. Results</title>
<p>Linearizations for parsing as sequence labeling. In [9] we proposed a new family of sequence labeling encodings based on brackets. In short, these encodings use a special kind of shorthand - a series of symbols like brackets and slashes - to describe which words are connected and how. This type of linearization is particularly well-suited for certain low-resource languages, such as Ancient Greek, and for languages with high non-projectivity, which characterizes languages with relatively free word order. In [10] we proposed a set of novel linearizations derived from existing transition-based algorithms. The code is available at https://github.com/mstrise/dep2label-bert, and it supports large language models such as BERT as encoders, to exploit the structure of languages learned during their pre-training phase.</p>
        <p>Cross-lingual Inflection as a Data Augmentation Method for Parsing [13]. This paper introduced a technique for creating 'synthetic creole' treebanks, termed x-inflected treebanks, through cross-lingual morphological inflection. This process required a source dependency treebank from a closely related language, equipped with lemmas and morphological features, alongside a morphological inflection system tailored for the target language. To create the morphological inflectors, we relied on UniMorph [14]. Our aim with this approach was to produce x-inflected treebanks that mimicked the target language to a certain degree. For greater clarity, Figure 1 depicts an example from our paper summarizing the high-level process of our method. The objective was to enhance parser performance for languages that had scarce or no annotated data, by leveraging an accurately trained morphological inflection system. This system was then applied to a related rich-resource treebank to approximate the linguistic characteristics of the target low-resource language. The code was made available at: https://github.com/amunozo/x-inflection.</p>
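<p>The x-inflection idea can be sketched at a high level as follows; the lookup-table inflector is a hypothetical stand-in for a UniMorph-trained inflection system, and the word forms are toy examples:</p>

```python
# Toy sketch of x-inflection: re-inflect the words of a related
# source-language treebank with a target-language morphological
# inflector, keeping heads and dependency relations untouched.

target_inflector = {  # (lemma, morph features) -> target-language form
    ("gato", "N;PL"): "gatos_t",
    ("dormir", "V;3;SG"): "dorme_t",
}

def x_inflect(treebank):
    out = []
    for sent in treebank:
        new_sent = []
        for tok in sent:
            form = target_inflector.get((tok["lemma"], tok["feats"]), tok["form"])
            new_sent.append(dict(tok, form=form))  # syntax is preserved
        out.append(new_sent)
    return out

source = [[
    {"form": "gatos", "lemma": "gato", "feats": "N;PL", "head": 2, "deprel": "nsubj"},
    {"form": "duerme", "lemma": "dormir", "feats": "V;3;SG", "head": 0, "deprel": "root"},
]]
print([t["form"] for t in x_inflect(source)[0]])  # ['gatos_t', 'dorme_t']
```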
<p>Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing [11]. The paper summarized the main outcomes from our research line 1. It focused on the effectiveness of various sequence labeling encodings for dependency parsing, particularly in the context of low-resource languages. It compared the performance of different encodings - head selection, relative position, bracketing, and mapping from transition-based subsequences - under the constraints of limited training data. The findings suggest that while head-selection encodings may perform better in data-rich environments, bracketing encodings show greater promise in low-resource settings. This insight is crucial for developing more effective parsing strategies for languages with scarce computational resources. The study highlighted the complex connection between how information is encoded and the availability of resources.</p>
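<p>The data-hunger comparison boils down to training the same parser on growing fractions of a treebank and recording accuracy per encoding. A schematic sketch (the evaluation function is a placeholder, not a real parser):</p>

```python
# Sketch of a learning-curve experiment: train with increasing
# fractions of the training data and record a score per point.
import math

def learning_curve(train_sents, fractions, train_and_eval):
    points = []
    for frac in fractions:
        k = max(1, int(len(train_sents) * frac))
        points.append((frac, train_and_eval(train_sents[:k])))
    return points

# Placeholder metric: pretend accuracy grows with the log of data size.
fake_eval = lambda data: round(50 + 10 * math.log10(len(data) + 1), 1)
sents = list(range(1000))  # stand-ins for annotated sentences
print(learning_curve(sents, [0.1, 0.5, 1.0], fake_eval))
```

<p>Comparing such curves across encodings is what reveals which linearizations remain usable when only a few hundred annotated sentences exist.</p>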
<p>The Fragility of Multi-Treebank Parsing Evaluation [15]. This paper examined the impact of treebank selection on parser performance evaluations, drawing on insights and evaluation issues that we observed during the development of the project. It specifically demonstrated how parser rankings, in terms of performance, could vary significantly across different treebank subsets, challenging the reliability of evaluations based on a single subset. The results from several experiments emphasized the need for meticulous treebank selection to ensure robust, comprehensive, and unbiased evaluations. The study also highlighted the challenges in formulating selection guidelines and cautioned against strategies that might lead to weak conclusions. Interestingly, it revealed that the disparity in effectiveness between sequence labeling parsers and traditional parsers was considerably smaller for languages with fewer resources compared to rich-resource ones.</p>
        <p>Parsing linearizations appreciate PoS tags - but some are fussy about errors [12]. This paper summarized some of the findings that resulted from our second research line of work. In particular, it investigated the role of Part-of-Speech (PoS) tags in sequence labeling parsing in low-resource settings. It highlighted that even low-accuracy PoS taggers can enhance parsing performance, especially when more PoS tag than dependency tree annotations are available. This study is significant in computational linguistics, offering insights into the nuanced relationship between encoding strategies and resource availability. It underscored the varying utility of PoS tags for sequence labeling models (as well as for other parsing paradigms) and emphasized the encoding-dependent impact of PoS tagging accuracy. The research also explored how controlling PoS tag accuracy can influence parsing outcomes, providing valuable guidance for future work on parsing models for under-represented languages. The code was made available at: https://www.grupolys.org/software/aacl2022/.</p>
        <p>Another Dead End for Morphological Tags? Perturbed Inputs and Parsing [16]. This paper focused on a low-resource domain: how to perform effective parsing when the input text is highly corrupted with many lexical errors, which could be due to natural causes or adversarial attacks. These attacks could involve removing a character, adding a character, replacing a character, or swapping two adjacent symbols. In our study, linguistically diverse but for now restricted to languages using the Latin alphabet, we looked at 14 different sets of language data and found some interesting results. When we tested under such types of corrupted inputs, adding morphological information (such as universal and language-specific part-of-speech tags, and very detailed morphological features) actually, and counterintuitively, made the performance of traditional parsing models decline faster. However, for sequence labeling parsers, like the ones proposed in our project, adding this kind of information was beneficial. The code to replicate the experiments and create adversarial attacks was made available at: https://github.com/amunozo/parsing_perturbations.</p>
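<p>The four perturbation types studied in [16] can be expressed as simple character-level operations; this sketch is illustrative and not the paper's exact attack generator:</p>

```python
# The four lexical perturbations (remove, add, replace, swap) as
# character-level operations on a single word.
import random

def perturb(word, op, rng):
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)  # position to perturb
    if op == "remove":
        return word[:i] + word[i + 1:]
    if op == "add":
        return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i:]
    if op == "replace":
        return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i + 1:]
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word

rng = random.Random(0)
print([perturb("parsing", op, rng) for op in ("remove", "add", "replace", "swap")])
```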
<p>Assessment of Pre-Trained Models Across Languages and Grammars [17]. In this paper, we built upon our initial ideas from our second line of research to introduce the first comprehensive framework that spans multiple paradigms and languages, aimed at recovering syntactic structures, including both dependency and constituent types, as learned by language models. This method serves as a proxy to estimate the extent of syntactic structure encoded by these models for various languages, which is of interest for both rich-resource and low-resource languages. To achieve this, we first carefully selected a diverse array of language models, differing in their scale, language pretraining objectives, and token representation formats. Then, to extract dependency and constituent structures directly from them, we used existing sequence labeling encodings for tree parsing. By adding just a linear layer on top of these encoders, we transformed continuous vector representations into discrete labels. The results showed that, for languages included in the pretraining data, sequence labeling models can be trained much more effectively, with the amount of available fine-tuning data not being a primary factor. The code for this research was made available at https://github.com/amunozo/multilingual-assessment.</p>
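<p>The probing setup of [17] reduces to a single linear layer over frozen representations, trained to emit discrete parsing labels. A toy sketch with hand-made two-dimensional "embeddings" and a plain perceptron update (the real framework uses pre-trained language model vectors, not these stand-ins):</p>

```python
# Sketch of a linear probe: a single linear layer over frozen word
# vectors, mapped to discrete parsing labels via perceptron updates.

def train_probe(examples, labels_set, dim, epochs=10):
    w = {lab: [0.0] * dim for lab in labels_set}  # one weight row per label
    for _ in range(epochs):
        for vec, gold in examples:
            pred = max(w, key=lambda lab: sum(a * b for a, b in zip(w[lab], vec)))
            if pred != gold:  # classic perceptron update on mistakes
                for j in range(dim):
                    w[gold][j] += vec[j]
                    w[pred][j] -= vec[j]
    return w

def predict(w, vec):
    return max(w, key=lambda lab: sum(a * b for a, b in zip(w[lab], vec)))

# Toy frozen representations: 2-d vectors, two linearized-tree labels.
data = [([1.0, 0.0], "+1@det"), ([0.0, 1.0], "-1@root")]
w = train_probe(data, {"+1@det", "-1@root"}, dim=2)
print([predict(w, v) for v, _ in data])  # ['+1@det', '-1@root']
```

<p>Because the encoder is frozen, any accuracy the probe reaches must come from structure already present in the latent space, which is exactly what the framework measures.</p>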
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Acknowledgments</title>
<sec id="sec-2-1">
        <p>This project was supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the FBBVA.4</p>
        <p>4 FBBVA accepts no responsibility for the opinions, statements and contents included in the project and/or the results thereof, which are entirely the responsibility of the authors.</p>
<p>[6] L. Duong, T. Cohn, S. Bird, P. Cook, Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser, in: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2015, pp. 845–850.</p>
        <p>[7] M. Haspelmath, The typological database of the world atlas of language structures, The Use of Databases in Cross-Linguistic Studies 41 (2009) 283.</p>
        <p>[8] P. Littell, D. R. Mortensen, K. Lin, K. Kairis, C. Turner, L. Levin, URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors, in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017, pp. 8–14.</p>
        <p>[9] M. Strzyz, D. Vilares, C. Gómez-Rodríguez, Bracketing encodings for 2-planar dependency parsing, in: D. Scott, N. Bel, C. Zong (Eds.), Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 2472–2484. URL: https://aclanthology.org/2020.coling-main.223. doi:10.18653/v1/2020.coling-main.223.</p>
        <p>[10] C. Gómez-Rodríguez, M. Strzyz, D. Vilares, A unifying theory of transition-based and sequence labeling parsing, in: D. Scott, N. Bel, C. Zong (Eds.), Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 3776–3793. URL: https://aclanthology.org/2020.coling-main.336. doi:10.18653/v1/2020.coling-main.336.</p>
        <p>[11] A. Muñoz-Ortiz, M. Strzyz, D. Vilares, Not all linearizations are equally data-hungry in sequence labeling parsing, in: R. Mitkov, G. Angelova (Eds.), Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), INCOMA Ltd., Held Online, 2021, pp. 978–988. URL: https://aclanthology.org/2021.ranlp-1.111.</p>
        <p>[12] A. Muñoz-Ortiz, M. Anderson, D. Vilares, C. Gómez-Rodríguez, Parsing linearizations appreciate PoS tags - but some are fussy about errors, in: Y. He, H. Ji, S. Li, Y. Liu, C.-H. Chang (Eds.), Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Association for Computational Linguistics, Online only, 2022, pp. 117–127. URL: https://aclanthology.org/2022.aacl-short.16.</p>
        <p>[13] A. Muñoz-Ortiz, C. Gómez-Rodríguez, D. Vilares, Cross-lingual inflection as a data augmentation method for parsing, in: S. Tafreshi, J. Sedoc, A. Rogers, A. Drozd, A. Rumshisky, A. Akula (Eds.), Proceedings of the Third Workshop on Insights from Negative Results in NLP, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 54–61. URL: https://aclanthology.org/2022.insights-1.7. doi:10.18653/v1/2022.insights-1.7.</p>
        <p>[14] A. D. McCarthy, C. Kirov, M. Grella, A. Nidhi, P. Xia, K. Gorman, E. Vylomova, S. J. Mielke, G. Nicolai, M. Silfverberg, T. Arkhangelskiy, N. Krizhanovsky, A. Krizhanovsky, E. Klyachko, A. Sorokin, J. Mansfield, V. Ernštreits, Y. Pinter, C. L. Jacobs, R. Cotterell, M. Hulden, D. Yarowsky, UniMorph 3.0: Universal Morphology, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 3922–3931. URL: https://aclanthology.org/2020.lrec-1.483.</p>
        <p>[15] I. Alonso-Alonso, D. Vilares, C. Gómez-Rodríguez, The fragility of multi-treebank parsing evaluation, in: N. Calzolari, C.-R. Huang, H. Kim, J. Pustejovsky, L. Wanner, K.-S. Choi, P.-M. Ryu, H.-H. Chen, L. Donatelli, H. Ji, S. Kurohashi, P. Paggio, N. Xue, S. Kim, Y. Hahm, Z. He, T. K. Lee, E. Santus, F. Bond, S.-H. Na (Eds.), Proceedings of the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Gyeongju, Republic of Korea, 2022, pp. 5345–5359. URL: https://aclanthology.org/2022.coling-1.475.</p>
        <p>[16] A. Muñoz-Ortiz, D. Vilares, Another dead end for morphological tags? Perturbed inputs and parsing, in: A. Rogers, J. Boyd-Graber, N. Okazaki (Eds.), Findings of the Association for Computational Linguistics: ACL 2023, Association for Computational Linguistics, Toronto, Canada, 2023, pp. 7301–7310. URL: https://aclanthology.org/2023.findings-acl.459. doi:10.18653/v1/2023.findings-acl.459.</p>
        <p>[17] A. Muñoz-Ortiz, D. Vilares, C. Gómez-Rodríguez, Assessment of pre-trained models across languages and grammars, in: J. C. Park, Y. Arase, B. Hu, W. Lu, D. Wijaya, A. Purwarianti, A. A. Krisnadhi (Eds.), Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Nusa Dua, Bali, 2023, pp. 359–373. URL: https://aclanthology.org/2023.ijcnlp-main.23.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Strzyz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vilares</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gómez-Rodríguez</surname>
          </string-name>
          ,
          <article-title>Viable dependency parsing as sequence labeling</article-title>
, in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Solorio</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>717</fpage>
          -
          <lpage>723</lpage>
. URL: https://aclanthology.org/N19-1077. doi:10.18653/v1/N19-1077.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Christiansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chater</surname>
          </string-name>
          ,
          <article-title>The now-or-never bottleneck: A fundamental constraint on language</article-title>
          ,
<source>Behavioral and Brain Sciences 39</source>
          (
          <year>2016</year>
          )
          <fpage>e62</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
[3]
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>de Marneffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nivre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zeman</surname>
          </string-name>
          , Universal Dependencies,
          <source>Computational Linguistics</source>
          <volume>47</volume>
          (
          <year>2021</year>
          )
          <fpage>255</fpage>
          -
          <lpage>308</lpage>
. URL: https://aclanthology.org/2021.cl-2.11. doi:10.1162/coli_a_00402.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zoph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yuret</surname>
          </string-name>
          , J. May,
          <string-name>
            <given-names>K.</given-names>
            <surname>Knight</surname>
          </string-name>
          ,
          <article-title>Transfer learning for low-resource neural machine translation</article-title>
          ,
          <source>in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1568</fpage>
          -
          <lpage>1575</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Plank</surname>
          </string-name>
          , Ž. Agić,
          <article-title>Distant supervision from disparate sources for low-resource part-of-speech tagging</article-title>
          ,
<source>in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>614</fpage>
          -
          <lpage>620</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>