<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CLARIAH-EUS: a Cross-border CLARIAH Node for the Basque Language and Culture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jon Alkorta</string-name>
          <email>jon.alkorta@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aritz Farwell</string-name>
          <email>aritz.farwell@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joseba Fernandez de Landa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Begoña Altuna</string-name>
          <email>begona.altuna@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ainara Estarrona</string-name>
          <email>ainara.estarrona@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikel Iruskieta</string-name>
          <email>mikel.iruskieta@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xabier Arregi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xabier Goenaga</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose Mari Arriola</string-name>
          <email>josemaria.arriola@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CLARIAH-EUS, HiTZ Basque Center for Language Technology - Ixa NLP Group, University of the Basque Country (UPV/EHU)</institution>
          ,
          <addr-line>Manuel Lardizabal pasealekua, 1, 20018 Donostia-San Sebastian, Gipuzkoa, Basque Country</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>CLARIAH-EUS is a node within CLARIAH-ES, Spain's distributed infrastructure for CLARIN and DARIAH, Europe's two principal digital research infrastructures for the humanities, arts, and social sciences. CLARIAH-EUS aims to sustain research in these fields that is related to Basque or Basque culture by supporting scholars with digital tools and resources. The node is unique because it seeks to serve a language (Basque) and not a territory, making the infrastructure transnational in scope. In this article, we describe the motivations for creating CLARIAH-EUS, how it was constructed, the projects that are currently in development, and future plans.</p>
      </abstract>
      <kwd-group>
        <kwd>Basque</kwd>
        <kwd>Infrastructure</kwd>
        <kwd>CLARIN ERIC</kwd>
        <kwd>DARIAH ERIC</kwd>
        <kwd>CLARIAH</kwd>
        <kwd>Digital Humanities</kwd>
        <kwd>Arts</kwd>
        <kwd>Social Sciences</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and motivation</title>
      <p>The nature of research is in constant flux. This steady change is especially evident in fields where technology is required for research. Conversely, it is not always as pronounced in those where the presence of technology is deemed to be less essential: areas in which qualitative analysis often plays as significant a role as its quantitative counterpart. The humanities, arts, and social sciences, with some notable exceptions, have customarily fallen into the latter camp.</p>
      <p>Over the past two decades, however, a small cadre within these disciplines has begun to take advantage of digital technology in ways that have given rise to new modes of research and, consequently, new lines of inquiry [<xref ref-type="bibr" rid="ref1">1</xref>]. The advent of this "digital turn" is attested to by approaches and results that were once impossible. The use of digital tools and methods is, for example, reshaping how GLAMs (galleries, libraries, archives, and museums) engage with cultural heritage. The same is true of social scientists, who have adapted to the emergence of big data by devising novel techniques that take advantage of an ocean of digital information, casting new light on how data can represent reality [<xref ref-type="bibr" rid="ref2">2</xref>]. Even the lines between disciplines have softened as digital humanities, in its broadest sense, has necessarily prompted unforeseen collaborative and transdisciplinary research, teaching, and publishing [<xref ref-type="bibr" rid="ref3">3</xref>].</p>
      <p>The results of this type of innovative work sometimes overshadow the fact that language technology is often at its core: more concretely, language technology specifically crafted to aid the humanities, arts, and social sciences. And the successful application of this particular specialization, it is worth remembering, depends to a great extent on the availability of tools, resources, and data for and in the languages that are being utilized for research. As may be appreciated, this is one reason why the development of language technologies is important for all languages, but absolutely crucial for languages that belong to relatively small populations [<xref ref-type="bibr" rid="ref4">4</xref>].</p>
      <p>Although Basque may be counted among the lesser-spoken languages, its situation in terms of language technology is favorable compared to languages of a similar size. This is largely thanks to significant progress in “fostering the necessary sociolinguistic conditions for the successful development and dissemination of LT,” which has resulted in “state-of-the-art technology and robust, broad-coverage NLP for Basque” [<xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>]. Equally important in this regard are the several decades of collaborative work between research groups, foundations, language industry clusters, and regional institutions. Yet, despite these efforts, Basque remains in a precarious position with respect to research maturity and readiness.</p>
      <p>The CLARIAH-EUS consortium was created to help address these shortcomings. Its goals are twofold: 1) to promote language technology for researchers involved in Basque-related humanities, arts, and social sciences and 2) to foster and facilitate relationships between these researchers so that they may better exchange ideas and innovative approaches. For these reasons, CLARIAH-EUS is centered on language rather than territory or region, making it transnational in scope. Organizationally speaking, the network is a node within CLARIAH-ES, Spain’s distributed infrastructure for CLARIN ERIC and DARIAH ERIC, Europe’s two principal digital research infrastructures for the humanities, arts, and social sciences.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Objectives</title>
      <p>As stated above, one of CLARIAH-EUS’s objectives is to support language technology for researchers involved in Basque-related humanities, arts, and social sciences. Part of this activity involves producing digital resources for Basque and integrating them into the CLARIN ERIC and DARIAH ERIC infrastructures. Doing so will generate tools that are designed with Basque in mind and provide researchers with better access to them. A second, closely related area of activity will focus on offering services to researchers who create or wish to utilize Basque language technology for the digital humanities.</p>
      <p>Another ambition of CLARIAH-EUS is to establish a research community that is devoted to sustaining language technology for Basque-related humanities, arts, and social sciences. On the one hand, the purpose of this goal is to encourage collaboration between scholars that may lead to more impactful research or greater opportunities for participation in international projects. On the other, the aspiration stems from the belief that a strong research community is more likely to cultivate an environment that yields innovative approaches to Basque digital humanities and language technology for Basque.</p>
    </sec>
    <sec id="sec-funding">
      <title>3. Funding institutions</title>
      <p>CLARIAH-EUS is focused on maintaining long-lasting and steady financial support so as to ensure short-, mid- and long-term research initiatives and guarantee the reusability of resources. Several public stakeholders share this perspective and have provided funding to the infrastructure. Currently, CLARIAH-EUS is supported by the Basque Government through its Department of Culture and Linguistic Policy,<fn><label>1</label><p>https://www.euskadi.eus/eusko-jaurlaritza/kultura-hizkuntza-politika-saila/</p></fn> the Provincial Council of Gipuzkoa,<fn><label>2</label><p>https://www.gipuzkoa.eus/eu/</p></fn> and the University of the Basque Country (UPV/EHU).</p>
      <p>The UPV/EHU is represented by the Vice-Rectorate of Basque, Culture and Internationalization,<fn><label>3</label><p>https://www.ehu.eus/eu/web/nazioarteko-harremanak</p></fn> which participates as a financing entity, and by HiTZ, the Basque Center for Language Technology.<fn><label>4</label><p>https://www.hitz.eus/eu</p></fn> The latter, which also provides funding, is responsible for CLARIAH-EUS’s administrative office, and several of its members are part of the CLARIAH-EUS steering committee.</p>
      <p>With the help of these institutions, CLARIAH-EUS has hired four staff members, who oversee the maintenance of the CLARIAH-EUS infrastructure and perform duties within the administrative offices of both CLARIAH-EUS and CLARIAH-ES (for which HiTZ is also responsible).<fn><label>5</label><p>https://www.clariah.es/</p></fn></p>
    </sec>
    <sec id="sec-3">
      <title>4. Development of CLARIAH-EUS</title>
      <p>The development of CLARIAH-EUS has occurred in two stages: a design phase (2021-2023) (see sections 4.1 and 4.2) and an implementation phase (2023-present), during which time CLARIAH-EUS has become an active node (see sections 4.3 and 4.4).</p>
      <sec id="sec-3-1">
        <title>4.1. First workshop: needs and manifesto</title>
        <p>On November 26, 2021, an initial workshop,<fn><label>6</label><p>https://www.clariah.eus/eu/1-workshop</p></fn> <italic>Euskararentzako hizkuntza-teknologia Humanitateetan eta Zientzia Sozialetan garatzeko CLARIAH-EUS azpiegitura diseinatzen</italic> (Designing the CLARIAH-EUS infrastructure to develop language technology for Basque in the Humanities and Social Sciences), was organized by HiTZ to create the CLARIAH-EUS infrastructure.</p>
        <p>Its objective was to discuss opportunities and needs for different research areas. The workshop’s activities included: 1) compiling a collection of use cases and posters of digital projects developed for anyone who wants to study Basque, 2) making a list of the strategic resources necessary for Basque and Basque research in different disciplines, and 3) obtaining the involvement of researchers to promote the CLARIAH-EUS research infrastructure.</p>
        <p>Nine institutions and thirty-four researchers from twenty research groups participated and fourteen projects were presented. In addition, 134 organizations and individuals signed a manifesto<fn><label>7</label><p>https://www.clariah.eus/eu/manifestua</p></fn> calling for the creation of a digital humanities infrastructure for Basque.</p>
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Weaving the network</title>
        <p>From 2021 to 2023, the objective was to seek support from various organizations and research groups. This was obtained from ten entities: HiTZ (UPV/EHU), Udako Euskal Unibertsitatea (UEU), Iker research group, Elhuyar, Gogo Elebiduna research group (UPV/EHU), Elebilab research group (UPV/EHU), Aholab research group (UPV/EHU), Ixa research group (UPV/EHU), Soziolinguistika Klusterra, and the UNESCO Chair in Human Rights and Public Powers (UPV/EHU). Nine of these organizations or research groups are from the southern Basque Country and one (Iker) is from the northern Basque Country.</p>
        <p>During this time, the CLARIAH-EUS node was also defined in relation to CLARIAH-ES in Spain and the CLARIN ERIC and DARIAH ERIC infrastructures at the European level.</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.3. Second workshop: community and organization</title>
        <p>CLARIAH-EUS’s second workshop<fn><label>8</label><p>https://www.donostiakultura.eus/eu/ikastaroak/clariah-eus-euskararako-ikerketa-azpiegitura-eraikitzen</p></fn> was held in November 2023. In contrast to the previous workshop, its objective was to present the CLARIAH-EUS infrastructure and its aims, as well as to survey ongoing work in Basque digital humanities.</p>
        <p>The event served as a kickoff ceremony for the founding members. CLARIAH-EUS’s structural organization and road map were discussed, with a focus on the strategic lines that will be developed over the next five years. In addition, two invited speakers gave talks and twenty-one posters were presented. A selection of these, along with descriptions of the research groups that took part in the workshop, will be described in a forthcoming publication.<fn><label>9</label><p>https://www.clariah.eus/eu/2-workshopa-azpiegitura-eraikitzen</p></fn></p>
      </sec>
      <sec id="sec-3-4">
        <title>4.4. Action protocol</title>
        <p>We have put in place an action protocol as part of CLARIAH-EUS’s implementation. To use the service, a petitioner must first register a request at www.clariah.eus/contacto. Once a petition is activated, CLARIAH-EUS updates the status of the request as changes occur. The petitioner’s opinion is solicited at the end of the collaboration or service through a survey: https://www.ixa.eus/events/clarink_survey. All node services are evaluated at the end of each year.</p>
        <p>Furthermore, petitioners will be asked to acknowledge the node in resulting publications or on websites by including the CLARIAH-EUS and HiTZ logos, along with the following statement: “SUPPORTED by CLARIN-EUS. HiTZ Center - University of the Basque Country UPV/EHU.”</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Projects and resources</title>
      <p>As underscored above, one of CLARIAH-EUS’s principal objectives is to support researchers by providing tools and resources<fn><label>10</label><p>https://www.clariah.eus/eu/baliabideak_sailkapena</p></fn> that can be employed in digital humanities and social sciences. Some of the tools and resources available through CLARIAH-EUS were produced before the infrastructure was constructed. These have been integrated into the infrastructure to multiply their outreach and usability. Others, however, have been or are being developed under the auspices of CLARIAH-EUS. The following examples fall under both categories.</p>
      <sec id="sec-4-1">
        <title>5.1. ParlaMint-ES-PV 4.0</title>
        <p>ParlaMint 4.0 is a set of comparable corpora<fn><label>11</label><p>https://www.clarin.si/repository/xmlui/handle/11356/1859</p></fn> containing transcriptions of parliamentary debates from twenty-nine European countries and autonomous regions, mostly dating from 2015 to mid-2022. The individual corpora comprise between nine and 126 million words and the complete set contains over 1.1 billion words. CLARIAH-EUS has created the corpus [<xref ref-type="bibr" rid="ref7">7</xref>] in Basque and Spanish utilizing data and metadata from the Basque Parliament.</p>
      </sec>
      <sec id="sec-4-2">
        <title>5.2. Computational social science</title>
        <p>Three datasets related to social media analysis are available for the purposes of experimentation and the development of tools for Basque: 1) the Heldugazte<fn><label>12</label><p>https://github.com/joseba-fdl/heldugazte-corpus</p></fn> [<xref ref-type="bibr" rid="ref8">8</xref>] dataset, designed to identify the writing style of a specific text sequence; 2) the Heldugazte-Age<fn><label>13</label><p>https://github.com/joseba-fdl/heldugazte-age-corpus</p></fn> [<xref ref-type="bibr" rid="ref9">9</xref>] dataset, meant to identify the age of Basque social media users by classifying them as either minors or adults; and 3) VaxxStance<fn><label>14</label><p>https://vaxxstance.github.io/</p></fn> [<xref ref-type="bibr" rid="ref10">10</xref>], which seeks to identify the stance expressed on social media regarding vaccines. Its objective is to determine whether a given tweet expresses an AGAINST, FAVOR, or NEUTRAL stance towards a previously defined topic.</p>
      </sec>
      <sec id="sec-4-3">
        <title>5.3. BIM/SAHCOBA</title>
        <p>Basque in the Making (BIM): A Historical Look at a European Language Isolate and Syntactically Annotated Historical Corpus in Basque (SAHCOBA) are two projects<fn><label>15</label><p>http://bim.ixa.eus/search</p></fn> for the construction of a morphosyntactically annotated Basque historical corpus [<xref ref-type="bibr" rid="ref11">11</xref>]. BIM and SAHCOBA are interdisciplinary projects that include experts in linguistics and natural language processing. The BIM project aims to collect the most significant writings from the fifteenth century to the mid-eighteenth century (Archaic and Old Basque), while the SAHCOBA project aims to extend this corpus from the mid-eighteenth century to the mid-twentieth century (Early and Late Modern Basque), when standard Basque appeared. The corpus comprises both part-of-speech and syntactic annotation, as well as a rich set of metadata. The database allows the annotated corpus to be searched by words, lemmas, grammatical categories, sequences of grammatical categories, and specific structural configurations.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. The Future of CLARIAH-EUS</title>
      <p>With regard to the institutional nature of CLARIAH-EUS, our objective is to change its current status from a CLARIN K-centre to a CLARIN B-centre in order to provide technical services as well as instructional guidance to researchers. With respect to future work, our current outlook is shaped by three criteria:</p>
      <list list-type="bullet">
        <list-item>
          <p>Creating or adapting resources and services that researchers can access from the CLARIAH-EUS node.</p>
        </list-item>
        <list-item>
          <p>Creating or adapting resources and services that are strategically needed within the Basque community.</p>
        </list-item>
        <list-item>
          <p>Creating or adapting resources or services that can be articulated with CLARIN ERIC and DARIAH ERIC.</p>
        </list-item>
      </list>
      <p>In the short term, our goal is to offer various resources and services in CLARIN and DARIAH by adapting existing resources. By way of example, we hope to integrate the Analhitza tool [<xref ref-type="bibr" rid="ref12">12</xref>], the Euscrawl system [<xref ref-type="bibr" rid="ref13">13</xref>], and ParlaMint. Additionally, we intend to offer several types of corpora, such as literature, historical texts, and social networks, as well as produce new resources, including a data repository and APIs. In the medium term, our main objective is to offer resources and tools for the field of education, while also working on other areas, such as integration with the Virtual Language Observatory (VLO) and the construction of language models. In the long term, we will attempt to fashion tools and resources for sociology, journalism, literature, and history. Ideally, this work will coincide with GLAM-related projects.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We wish to thank the Basque Government and its Department of Culture and Linguistic Policy, the Provincial Council of Gipuzkoa, the Vice-Rectorate of Basque, Culture and Internationalization at the University of the Basque Country (UPV/EHU), and the HiTZ center for their generous support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Terras</surname>
          </string-name>
          ,
          <article-title>Quantifying digital humanities</article-title>
          , UCL Centre for Digital Humanities (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Crawford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Miltner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <article-title>Critiquing Big Data: Politics, Ethics, Epistemology</article-title>
          ,
          <source>International Journal of Communication</source>
          <volume>8</volume>
          (
          <year>2014</year>
          )
          <fpage>1663</fpage>
          -
          <lpage>1672</lpage>
          . URL: https://ijoc.org/index.php/ijoc/article/view/2167/1164.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Burdick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Drucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lunenfeld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Presner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schnapp</surname>
          </string-name>
          ,
          <source>Digital_Humanities</source>
          , MIT Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>X.</given-names>
            <surname>Arzoz</surname>
          </string-name>
          ,
          <article-title>The Impact of Language Policy on Language Revitalization: The Case of the Basque Language</article-title>
          , in:
          <source>Cultural and Linguistic Minorities in the Russian Federation and the European Union: Comparative Studies on Equality and Diversity</source>
          (
          <year>2015</year>
          )
          <fpage>315</fpage>
          -
          <lpage>334</lpage>
          . doi:10.1007/978-3-319-10455-3_12.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sarasola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Aldabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Díaz de Ilarraza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Estarrona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Farwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Hernáez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Navas</surname>
          </string-name>
          ,
          <article-title>Language Report Basque</article-title>
          , in:
          <source>European Language Equality: A Strategic Agenda for Digital Language Equality</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>95</fpage>
          -
          <lpage>98</lpage>
          . doi:10.1007/978-3-031-28819-7_5.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>I.</given-names>
            <surname>Gonzalez-Dios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Altuna</surname>
          </string-name>
          ,
          <article-title>Natural Language Processing and Language Technologies for the Basque Language</article-title>
          ,
          <source>Cuadernos Europeos de Deusto</source>
          (
          <year>2022</year>
          )
          <fpage>203</fpage>
          -
          <lpage>230</lpage>
          . doi:10.18543/ced.2477.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Alkorta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Iruskieta</surname>
          </string-name>
          ,
          <article-title>Adding the Basque Parliament Corpus to ParlaMint Project</article-title>
          , in:
          <source>Proceedings of the Workshop ParlaCLARIN III within the 13th Language Resources and Evaluation Conference</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandez de Landa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Agerri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Alegria</surname>
          </string-name>
          ,
          <article-title>Large Scale Linguistic Processing of Tweets to Understand Social Interactions among Speakers of Less Resourced Languages: The Basque Case</article-title>
          ,
          <source>Information</source>
          <volume>10</volume>
          (
          <year>2019</year>
          ). URL: https://www.mdpi.com/2078-2489/10/6/212. doi:10.3390/info10060212.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandez de Landa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Agerri</surname>
          </string-name>
          ,
          <article-title>Social analysis of young Basque-speaking communities in twitter</article-title>
          ,
          <source>Journal of Multilingual and Multicultural Development</source>
          <volume>0</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . URL: https://doi.org/10.1080/01434632.2021.1962331. doi:10.1080/01434632.2021.1962331.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agerri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Centeno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Espinosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandez de Landa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodrigo</surname>
          </string-name>
          ,
          <article-title>VaxxStance@IberLEF 2021: overview of the task on going beyond text in cross-lingual stance detection</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          (
          <year>2021</year>
          )
          <fpage>173</fpage>
          -
          <lpage>181</lpage>
          . doi:10.26342/2021-67-15.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Estarrona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Etxeberria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soraluze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Etxepare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Padilla-Moyano</surname>
          </string-name>
          ,
          <article-title>The first annotated corpus of historical Basque</article-title>
          ,
          <source>Digital Scholarship in the Humanities</source>
          <volume>37</volume>
          (
          <year>2022</year>
          )
          <fpage>391</fpage>
          -
          <lpage>404</lpage>
          . URL: https://hal.science/hal-03505658.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Otegi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Imaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Díaz de Ilarraza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Iruskieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Uria</surname>
          </string-name>
          ,
          <article-title>ANALHITZA: a tool to extract linguistic information from large corpora in Humanities research</article-title>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Artetxe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Aldabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Agerri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Perez-de Viñaspre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soroa</surname>
          </string-name>
          ,
          <article-title>Does Corpus Quality Really Matter for Low-Resource Languages?</article-title>
          , in:
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kozareva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Abu Dhabi, United Arab Emirates,
          <year>2022</year>
          , pp.
          <fpage>7383</fpage>
          -
          <lpage>7390</lpage>
          . URL: https://aclanthology.org/2022.emnlp-main.499. doi:10.18653/v1/2022.emnlp-main.499.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>