<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IberLEF 2021 Overview: Natural Language Processing for Iberian Languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julio Gonzalo</string-name>
          <email>julio@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manuel Montes-y-Gomez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Rosso</string-name>
          <email>prosso@dsic.upv.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Institute of Astrophysics, Optics and Electronics</institution>
          ,
          <addr-line>Puebla</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PRHLT Research Center, Universitat Politecnica de Valencia</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>UNED NLP and IR Research Group</institution>
          ,
          <addr-line>Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>IberLEF is a comparative evaluation campaign for Natural Language Processing Systems in Spanish and other Iberian languages. Its goal is to encourage the research community to organize competitive text processing, understanding and generation tasks in order to define new research challenges and set new state-of-the-art results in those languages. This paper summarizes the evaluation activities carried out in IberLEF 2021, which included twelve tasks dealing with emotions, stance and opinions, harmful information, health-related information extraction and discovery, humour and irony, and lexical acquisition. Overall, IberLEF activities were a remarkable collective effort involving 359 researchers from 22 countries in Europe, Asia and the Americas.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Artificial Intelligence</kwd>
        <kwd>Evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        IberLEF is a comparative evaluation campaign for Natural Language
Processing Systems in Spanish and other Iberian languages. Its goal is to encourage
the research community to organize competitive text processing, understanding
and generation tasks in order to define new research challenges and set new
state-of-the-art results in those languages. This paper summarizes the
evaluation activities carried out in IberLEF 2021, which included twelve tasks dealing
with emotions, stance and opinions, harmful information, health-related
information extraction and discovery, humor and irony, and lexical acquisition. Overall,
IberLEF activities were a remarkable collective effort involving 359 researchers
from 22 countries in Europe, Asia and the Americas. Papers with system
descriptions are included in the IberLEF 2021 Proceedings [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and papers with
task overviews have been published in the journal Procesamiento del Lenguaje
Natural, vol. 67 (September 2021 issue).
      </p>
      <p>In this paper we summarize the activities carried out in IberLEF 2021,
extracting some aggregated figures for a better understanding of this collective
effort.</p>
    </sec>
    <sec id="sec-2">
      <title>IberLEF 2021 Tasks</title>
      <p>These are the twelve tasks organized successfully in 2021, grouped
thematically:</p>
      <sec id="sec-2-1">
        <title>Emotions, Stance and Opinions</title>
        <p>
          EmoEvalEs [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] was an emotion classification task, where systems were asked
to predict which emotions are present in texts written in Spanish (from this set:
anger, disgust, fear, joy, sadness, surprise, others). Twitter was used as textual
source, and the dataset consists of 8232 manually annotated tweets. 15 research
groups submitted runs for this task, out of which 11 submitted papers to the
proceedings.
        </p>
        <p>
          REST-MEX [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] was an evaluation exercise focused on recommendation
tasks using TripAdvisor as textual source, with texts written in several variants
of Spanish (Mexican Spanish being the most common). Task 1
(Recommendation) consists of predicting the degree of satisfaction (on a 1-5 scale) of a tourist
visiting a given Mexican place, given the information available in TripAdvisor
about the tourist and about the site. The tourist profile includes gender, place of
origin, her textual self-description in TripAdvisor, and her opinions on places she
has visited. The information about the place is a brief textual description and
a series of representative characteristics of the place for touristic purposes
(adventure, beach, family atmosphere, etc.). Task 2 (Sentiment Polarity) consists
of predicting the polarity (on a 1-5 scale) of a given TripAdvisor opinion.
        </p>
        <p>Overall, the dataset gathers 2263 tourist/destination instances for the first
task and 7413 opinions for the second task. Two groups submitted results for Task
1 and seven for Task 2.</p>
        <p>
          VaxxStance [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] focused on predicting the stance of short texts (tweets) with
respect to vaccines (in favour, neutral or against). This was a multilingual task
including Spanish (2697) and Basque (1384) tweets.
        </p>
        <p>The challenge was addressed in three variants: in Task 1 (close track), systems
could only use the text of the tweets; in Task 2 (open track), systems could use
any kind of data (including tweets' metadata); finally, Task 3 (zero-shot track)
was a cross-lingual stance detection challenge: systems were trained on one of
the languages and tested on the other language. Three groups participated in
the first task, and one in the second and third tasks.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Harmful Information</title>
        <p>
          There were four challenges around harmful textual information in 2021:
MeOffendEs [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] focused on offensive language detection in Spanish, and
included two subtasks on a dataset of generic Spanish and two subtasks on
a Mexican Spanish corpus. The generic Spanish dataset (OffendES) comprises
30,416 comments collected from Twitter, Instagram and YouTube; the Mexican
Spanish dataset (OffendMEX) comprises 7319 annotated tweets.
        </p>
        <p>The tasks on generic Spanish asked systems to predict the right class from
OFP (offensive, target person), OFG (offensive, target group), OFO (offensive,
target others), NOE (non offensive, but with expletive language), NO (not
offensive). Systems were also asked to predict the strength of the class, taken
as the ratio of annotators that concur on the class. Subtask 1 allowed textual
data as input, and Subtask 2 allowed metadata as additional input. Four teams
submitted results for the first subtask, and one for the second.</p>
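        <p>The class strength described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the number of annotators per comment and the tie-breaking rule are hypothetical and are not taken from the task description.</p>
        <preformat>
```python
from collections import Counter


def majority_label_with_strength(annotations):
    """Return (label, strength) for one comment.

    strength is the share of annotators who chose the majority label.
    Ties are broken by first occurrence (an assumption of this sketch).
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)


# Hypothetical example: three annotators, two of them choose OFP.
label, strength = majority_label_with_strength(["OFP", "OFP", "NO"])
print(label, round(strength, 3))
```
        </preformat>
        <p>Under this reading, a system predicts both the class and a value between 0 and 1 that approximates how many annotators agreed on it.</p>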
        <p>The tasks on Mexican Spanish asked systems to do a binary prediction
(offensive / not offensive), using only textual input (subtask 3) or also metadata
(subtask 4). Ten groups submitted results to subtask 3 and one to subtask 4.</p>
        <p>
          EXIST [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] focused on the identification of sexism in Spanish and English
texts, asking systems to predict whether a text has sexist content (Subtask 1)
and to identify the type of sexism (ideological and inequality / stereotyping
and dominance / objectification / sexual violence / misogyny and non-sexual
violence) in Subtask 2. The dataset comprises 13,000 tweets and 982 gabs. 31
groups submitted results for the first subtask, and 27 for the second.
        </p>
        <p>
          DETOXIS [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] focused on the identification of toxic content in texts, and
prepared a dataset with 4359 comments from news and online forums, annotated
with their level of toxicity (on a scale from 0 to 3). Subtask 1 required a binary
classification (toxic / non toxic) and Subtask 2 asked systems to predict the level
of toxicity on the same scale that was annotated. 31 groups submitted to the first
subtask and 24 to the second.
        </p>
        <p>
          Finally, FakeDeS [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] focused on discovering fake news written in Spanish,
and prepared a dataset with 971 news articles written in Spanish from Spain
and Mexico. It was designed as a binary classification task (fake or real), and 16
groups submitted results.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Health-Related Information Extraction and Discovery</title>
        <p>
          Health-related content received special attention in IberLEF 2021, as in
previous editions, with two tasks related to the medical domain:
e-HealthKD [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] focused on entity recognition and classification. Systems
had to recognize and classify concepts, actions, predicates and references in
subtask A, and to extract relations between them in subtask B. e-HealthKD also
included a main, complex task where both entity recognition and relation
extraction were evaluated jointly. Eight participants submitted results to subtask A
and, out of them, seven also submitted results to subtask B and to the main
challenge. The organizers performed an exhaustive annotation of 1,800 sentences
extracted from MedLinePlus, WikiNews and the CORD-19 corpus.
        </p>
        <p>
          MEDDOPROF [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] worked on clinical cases (the annotations include 1844
cases extracted from medical literature), and asked systems to annotate
information related to occupations/professions. Task 1 (NER) was about finding
mentions of occupations and classifying each of them as a profession, an employment
status or an activity; Task 2 (CLASS) involved finding mentions of occupations
and determining whether they are related to the patient, to a family member,
to a health professional or to someone else; and Task 3 (NORM) was about
mapping predictions to one of the codes in a list of unique concept identifiers
from the European Skills, Competences, Qualifications and Occupations (ESCO)
classification and relevant SNOMED-CT terms. 15 groups submitted results to
Task 1, 11 to Task 2 and 8 to Task 3.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Humour and Irony</title>
        <p>There were two tasks related to Humour and Irony in 2021:</p>
        <p>
          HAHA [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] dealt with humour detection and characterization in Spanish
texts, and included four subtasks: (1) humour detection, which required
determining whether a tweet was humorous or not; (2) funniness score prediction, on
a 1-5 scale; (3) humour mechanism classification, out of a set of classes such as
irony, wordplay, hyperbole or shock; (4) humour content classification: predict
the content of the joke from a set of classes such as racist jokes, sexist jokes,
dark humour, dirty jokes, etc. The dataset included 36,000 annotated tweets. 14
groups submitted to the first subtask, 11 to the second, 9 to the third and 8 to the
fourth.
        </p>
        <p>
          IDPT [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] was a task on irony detection in Portuguese texts, defined as a
binary classification problem (is this text ironic or not?). The dataset included
18494 news pieces and 15212 tweets, and 7 groups submitted results for the task.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>Lexical Acquisition</title>
        <p>
          ADoBo [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] focused on the acquisition of borrowings into Spanish from
other languages (English primarily). Systems were asked to detect expressions
(in Spanish news articles) that have been imported from other languages in their
raw form. The dataset is an annotated collection of news articles that comprises
372,701 tokens. Four systems submitted results for this task.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Aggregated Analysis of IberLEF 2021 Tasks</title>
      <sec id="sec-3-1">
        <title>Tasks characterization</title>
        <p>In terms of languages, the distribution per tasks (including subtasks) is
shown in Figure 1. 74 % of the tasks deal at least with Spanish, which is the
predominant subject of study in IberLEF. In terms of variants of Spanish, Spain
and Mexico are the best represented, with other variants having only anecdotal
presence. English is used (never as the main language) in 14 % of the tasks, and
this year Basque appears for the first time in IberLEF, being present in 9 % of
the tasks (all belonging to VaxxStance). Finally, there is also one task dealing
with Portuguese.</p>
        <p>The trend in the number of languages is positive: there were two in IberLEF
2019 (Spanish and Portuguese), only one in 2020 (Spanish) and four languages
in 2021.</p>
        <p>In terms of abstract task types, the distribution of tasks can be seen in
Figure 2. Out of a total of 29 tasks (each subtask is counted as a task here),
7 (24 %) are binary classification tasks, which is the most popular choice.
Multiclass classification problems are also well represented with 6 tasks. There are
also four tasks where classes are ordinals (e.g. 0, 1, 2, 3) that can be interpreted
either as a regression or a multiclass classification problem (regression /
multiclass classification in the figure). Another variant of classification problems is
ordinal classification, where classes have a relative ordering (e.g. in favour,
neutral or against in stance classification): 3 tasks match this abstract task type.
Finally, there is also a normalization task which implies matching profession
descriptions in text with standard thesauri / ontologies, which can be seen as
an extreme classification task (i.e. a classification problem where the number of
classes is extremely large).</p>
        <p>There are only 3 sequence labelling tasks, which is perhaps fewer than expected
for an evaluation campaign focused heavily on Natural Language Processing problems:
tasks that identify specific structures or text chunks in text, such as named
entities, fall into this category. Two of them are related to the medical domain,
and the other one looks for lexical borrowings (imports from other languages).</p>
        <p>Finally, there are two genuine regression tasks, where systems must predict a
real number, and only one complex task, where the organizers try to measure the
joint performance of systems in two subtasks that build together towards a common
goal: the e-HealthKD main task.</p>
        <p>Figure 2. Distribution of IberLEF 2021 tasks per abstract task type.</p>
        <p>Overall, IberLEF 2021 tasks address a representative sample of abstract task
types, covering a wide range of problems. To get closer to industry needs, future
editions should probably investigate further how to evaluate complex, end-user tasks.
IberLEF is also missing tasks that involve text generation, such as text
summarization or machine translation problems; and tasks that involve interaction with
the users, such as dialogue systems. Finally, we would like to see more application
domains in the list of tasks.</p>
        <p>
          In terms of evaluation metrics, the distribution can be seen in Figure 3.
As in previous years, there is a remarkable predominance of F1 (20 tasks used
it as the main evaluation metric to rank systems), which is used for all types
of classification tasks (even if it does not perfectly match the problem at hand,
as in ordinal classification problems) and for sequence labelling problems.
Accuracy is used in a couple of classification tasks, and Bacc (Balanced Accuracy) in
another. Finally, CEM (Closeness Evaluation Metric), a metric especially useful
for ordinal classification tasks and introduced recently [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], is used for one of the
classification/regression tasks. Tasks interpreted as regression problems are
evaluated with MSE (Mean Squared Error), RMSE (Root Mean Squared Error) and
MAE (Mean Absolute Error), with none of these metrics particularly favoured.
        </p>
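        <p>For readers less familiar with the metrics named above, the following minimal sketch shows their textbook definitions in plain Python. It is not the official scoring code of any IberLEF task, and the function names and toy data are assumptions made for illustration only.</p>
        <preformat>
```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Share of items whose predicted label equals the gold label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the gold classes."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)


# Toy regression data on a 1-5 scale (hypothetical values).
gold_scores = [1, 2, 3, 4, 5]
pred_scores = [1, 2, 3, 4, 1]
print(mse(gold_scores, pred_scores), rmse(gold_scores, pred_scores),
      mae(gold_scores, pred_scores))

# Toy imbalanced classification data: a system that always answers "ok".
gold_labels = ["toxic"] * 2 + ["ok"] * 8
pred_labels = ["ok"] * 10
print(accuracy(gold_labels, pred_labels),
      balanced_accuracy(gold_labels, pred_labels))
```
        </preformat>
        <p>In the toy classification example, accuracy rewards the trivial system that never predicts the minority class, whereas balanced accuracy does not; this is the usual motivation for preferring the latter on imbalanced data.</p>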
        <p>Overall, we take this as a hint that the field might be relying too much on F1.
It has some desirable properties (particularly, it is robust to the characteristics
of the dataset), but it has severe limitations too. Its primary shortcoming is that
it hides the actual behaviour of systems, as with all averages (F1 is the harmonic
mean of precision and recall). For multiclass classification, the most common
procedure is to compute the arithmetic average of the harmonic averages of
precision/recall across classes, which is a way of focusing exclusively on system
ranking and giving up on understanding why systems fail and when. We think
that the usage of F1 should be accompanied by other metrics.</p>
        <p>Figure 3. Distribution of official evaluation metrics in IberLEF 2021 tasks.</p>
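        <p>The averaging procedure described above (the arithmetic mean, across classes, of the harmonic mean of precision and recall) can be sketched as follows. The function names and the toy labels are assumptions for illustration; this is not the evaluation script of any particular task.</p>
        <preformat>
```python
def per_class_prf(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for every class seen."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Arithmetic mean of the per-class F1 values."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical gold labels and system output for a two-class problem.
gold = ["a", "a", "a", "a", "b", "b"]
pred = ["a", "a", "a", "b", "b", "a"]

scores = per_class_prf(gold, pred)
for label, (p, r, f1) in scores.items():
    print(label, p, r, f1)
print("macro-F1:", macro_f1(scores))
```
        </preformat>
        <p>The single macro-F1 figure is enough to rank systems, but the per-class precision and recall printed before it are what reveal where a system fails, which is why reporting them alongside F1 is useful.</p>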
        <p>Most importantly, the choice of metrics does not seem to be justified by
how the system output is going to be used, but rather by the mere popularity of the
metrics. This is not a shortcoming of IberLEF tasks only: most NLP challenges
suffer from the same problems.</p>
        <p>Figure 4 shows how IberLEF tasks have evolved over the three years that it
has been running. The number of tasks has increased (from 9 in 2019 to 12
in 2021); and in 2021 the number of new tasks is 9 (75 %), a sign that the scope
of problems being studied becomes larger every year. The lower figures in 2020
are due to the outbreak of COVID-19: some of the tasks could not be completed
and are not depicted in the graph.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Datasets and results</title>
        <p>
          In terms of types of textual sources, Figure 5 shows how they are used in
IberLEF 2021 tasks. Twitter is the most popular source, with 15 tasks relying
solely or partially on Twitter data. This does not necessarily mean that the field
is primarily interested in microblogging communication; it probably reflects that
collecting Twitter data is more cost-effective given IPR issues and other
difficulties in gathering data to redistribute to the scientific community [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. All other
sources are used by at most three tasks. The good news is that there are many
additional sources used by two or three tasks: news and news comments,
medical sources, material from other social networks such as YouTube, Instagram,
TripAdvisor and Gab, etc.
        </p>
        <p>Figure 4. Evolution of IberLEF tasks across time.</p>
        <p>Figure 5. Types of textual sources in IberLEF 2021 tasks.</p>
        <p>Figure 6. Dataset sizes in IberLEF 2021 classification tasks.</p>
        <p>In terms of dataset sizes and annotation efforts, it is difficult to establish fair
comparisons, because of the diversity of text sizes and the wide variance in terms
of annotation difficulty. Figure 6 compares dataset sizes for the classification
tasks, where it is more reasonable to establish direct comparisons.</p>
        <p>Overall, the annotation effort in IberLEF 2021 is remarkable, and it is a
significant contribution to enlarging test collections, at least for Spanish, and therefore
to enabling significant advances in our field for Spanish and the other languages
involved. The number of documents varies substantially, from over 35,000 tweets
(HAHA dataset on humour) to 971 news stories for FakeDeS (fake news
detection). But again, direct comparisons are not fair: for instance, in the case of
HAHA, the organizers are expanding annotations on a previously existing dataset
(developed in other HAHA editions); and, on the other hand, establishing whether
a piece of news is fake or real is probably much more time consuming than
classifying humour in tweets.</p>
        <p>IberLEF 2021 has been carried out without funding sources (other than those
obtained individually by the teams organizing and participating in the tasks).
If the IberLEF organization could directly fund the task organizers, this would
probably help to obtain large, high-quality annotations for all of the tasks
accepted each year.</p>
        <p>Figure 7. Performance of best systems versus baselines in IberLEF 2021 classification
tasks.</p>
        <p>In terms of progress with respect to the state of the art, it is really difficult
to extract aggregated conclusions for the whole IberLEF effort. In Figure 7 we
display a pairwise comparison between the best system and the best baseline, for
each of the tasks where at least one baseline is provided, and with respect to the
official ranking metric used in each task. To avoid confusion, we have restricted
the chart to tasks where the official metric varies between 0 (worst quality) and
1 (perfect output). Still, it is difficult to extract conclusions, because the effort
put by task organizers into providing state-of-the-art baselines varies considerably
between tasks. We can say, however, that in a few cases improving on the baseline
has proved to be challenging, and there is one case (MeOffendEs subtask 4) where
the baseline beats the best system (by a narrow margin). It would probably be
beneficial for future IberLEF editions to establish some minimum guidelines
about the types of baselines to expect in every task; again, this would be easier
to implement with dedicated funding.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Participation</title>
        <p>Given that IberLEF 2021 was not a funded initiative, participation has been
impressive, with a large fraction of current research groups interested in NLP
for Spanish organizing and/or participating in one or more tasks. Overall, 359
researchers representing 173 research groups from 22 countries in Europe, Asia
and the Americas were involved in IberLEF tasks.</p>
        <p>Figure 8. Number of groups participating in IberLEF 2021 tasks per country.</p>
        <p>Figure 9. Number of researchers participating in IberLEF 2021 tasks per country.</p>
        <p>[...] five) representing roughly 80 % of the researchers involved. The fact that two
countries, China and India, appear in the top five indicates two
things: first, that Spanish attracts the attention of the NLP community at large;
and second, that current NLP technologies enable processing datasets without
language-specific machinery, other than pretrained language models made
available to the research community.</p>
        <p>Figure 10. Distribution of participants per task in IberLEF 2021.</p>
        <p>The distribution of research groups per task is shown in Figure 10.
Participation ranges between 31 groups (EXIST subtask 1 and DETOXIS subtask 1)
and one group (MeOffendEs subtask 2, VaxxStance zero-shot track and
VaxxStance open track). As in other evaluation initiatives, participation seems to be
driven not only by the task's intrinsic interest, but also by the cost of entry: in
general, classification tasks (the most basic machine learning task, for which
more plug-and-play software packages exist) receive more participation than tasks
which require more elaborate approaches and more creativity to assemble
algorithmic solutions. In the middle of the table we can find most tasks in the
medical domain, which attract many groups in spite of being (in general) highly
challenging.</p>
        <p>Figure 11 shows how participation has evolved over time; while 2020 was a
difficult year with the outbreak of COVID-19, in 2021 participation has grown
considerably, with 173 groups (three times larger than in 2020 and a 30 % increase
with respect to 2019). The number of countries involved has also grown from 18
to 22.</p>
        <p>Figure 11. Number of research groups participating in IberLEF across time.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In its third edition, IberLEF has again been a remarkable collective effort for
the advancement of Natural Language Processing in Spanish and other Iberian
languages, with 12 main tasks and 359 researchers involved, from institutions
in 22 countries in Europe, Asia and the Americas. IberLEF 2021 has been the
largest edition to date, and has contributed to advancing the field in the areas of
emotions, stance and opinions, harmful information, health-related information
extraction and discovery, humour and irony, and lexical acquisition. In a field
where machine learning is the ubiquitous approach to solving challenges, the
definition of research challenges, their associated evaluation methodologies and the
development of high-quality test collections that allow for iterative evaluation is
probably the most critical step towards success. We believe IberLEF is making
a significant contribution in this direction.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>The authors of this overview have been supported by the Spanish
Government, Ministry of Science and Innovation, via research grants MISMIS
(PGC2018-096212-B), MISMIS-BIAS (PGC2018-096212-B-C32) and
MISMIS-FAKEnHATE (PGC2018-096212-B-C31); and by CONACyT-Mexico project
CB-2015-01257383 and the thematic networks program (Language Technologies Thematic
Network).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>Agerri</surname>, <given-names>R.</given-names></string-name>
          ,
          <string-name><surname>Centeno</surname>, <given-names>R.</given-names></string-name>
          ,
          <string-name><surname>Espinosa</surname>, <given-names>M.</given-names></string-name>
          ,
          <string-name><surname>de Landa</surname>, <given-names>J.F.</given-names></string-name>
          ,
          <string-name><given-names>Alvaro</given-names> <surname>Rodrigo</surname></string-name>
          :
          <article-title>VaxxStance@IberLEF 2021: Overview of the task on going beyond text in cross-lingual stance detection</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>173</fpage>
          –
          <lpage>181</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Amigo</surname>, <given-names>E.</given-names></string-name>
          ,
          <string-name><surname>Gonzalo</surname>, <given-names>J.</given-names></string-name>
          ,
          <string-name><surname>Mizzaro</surname>, <given-names>S.</given-names></string-name>
          ,
          <string-name><surname>Carrillo-de Albornoz</surname>, <given-names>J.</given-names></string-name>
          :
          <article-title>An effectiveness metric for ordinal classification: Formal properties and experimental results</article-title>
          . In:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>3938</fpage>
          –
          <lpage>3949</lpage>
          . Association for Computational Linguistics, Online (Jul
          <year>2020</year>
          ). https://doi.org/10.18653/v1/2020.acl-main.363, https://aclanthology.org/2020.acl-main.363
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>del Arco</surname>, <given-names>F.M.P.</given-names></string-name>
          ,
          <string-name><surname>Casavantes</surname>, <given-names>M.</given-names></string-name>
          ,
          <string-name><surname>Escalante</surname>, <given-names>H.J.</given-names></string-name>
          ,
          <string-name><surname>Martín-Valdivia</surname>, <given-names>M.T.</given-names></string-name>
          ,
          <string-name><surname>Montejo-Raez</surname>, <given-names>A.</given-names></string-name>
          ,
          <string-name><surname>y Gomez</surname>, <given-names>M.M.</given-names></string-name>
          ,
          <string-name><surname>Jarquín-Vasquez</surname>, <given-names>H.</given-names></string-name>
          ,
          <string-name><surname>Villaseñor-Pineda</surname>, <given-names>L.</given-names></string-name>
          :
          <article-title>Overview of MeOffendEs at IberLEF 2021: Offensive language detection in Spanish variants</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>183</fpage>
          –
          <lpage>194</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>del Arco</surname>, <given-names>F.M.P.</given-names></string-name>
          ,
          <string-name><surname>Jimenez-Zafra</surname>, <given-names>S.M.</given-names></string-name>
          ,
          <string-name><surname>Montejo-Raez</surname>, <given-names>A.</given-names></string-name>
          ,
          <string-name><surname>Molina-Gonzalez</surname>, <given-names>M.D.</given-names></string-name>
          ,
          <string-name><given-names>L. Alfonso</given-names> <surname>Ureña-Lopez</surname></string-name>
          ,
          <string-name><given-names>M.T.M.V.</given-names></string-name>
          :
          <article-title>Overview of the EmoEvalEs task on emotion detection for Spanish at IberLEF 2021</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>155</fpage>
          –
          <lpage>161</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Álvarez-Carmona</surname>
            ,
            <given-names>M.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aranda</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arce-Cárdenas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fajardo-Delgado</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guerrero-Rodríguez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>López-Monroy</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez-Miranda</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pérez-Espinosa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-González</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          :
          <article-title>Overview of Rest-Mex at IberLEF 2021: Recommendation system for text Mexican tourism</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>163</fpage>
          –
          <lpage>172</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Góngora</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosá</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meaney</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Overview of HAHA at IberLEF 2021: Detecting, rating and analyzing humor in Spanish</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>257</fpage>
          –
          <lpage>268</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
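          7.
          <string-name>
            <surname>Corrêa</surname>
            ,
            <given-names>U.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coelho</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Santos</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,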
          <string-name>
            <surname>de Freitas</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Overview of the IDPT task on irony detection in Portuguese at IberLEF 2021</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>269</fpage>
          –
          <lpage>276</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gómez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Posadas-Durán</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Enguix</surname>
            ,
            <given-names>G.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Porto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Overview of FakeDeS at IberLEF 2021: Fake news detection in Spanish shared task</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>223</fpage>
          –
          <lpage>231</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lima-López</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farré-Maduell</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miranda-Escalada</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Briva-Iglesias</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>243</fpage>
          –
          <lpage>256</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Álvarez Mellado</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anke</surname>
            ,
            <given-names>L.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arroyo</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lignos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zamorano</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>Overview of ADoBo 2021: Automatic detection of unassimilated borrowings in the Spanish press</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>277</fpage>
          –
          <lpage>285</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Montes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aragón</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agerri</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Álvarez-Carmona</surname>
            ,
            <given-names>M.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Álvarez Mellado</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrillo-de-Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freitas</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutiérrez</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiménez-Zafra</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lima</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaza-del-Arco</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taulé</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (eds.):
          <source>Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021)</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Piad-Morffis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Estévez-Velarde</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutiérrez</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almeida-Cruz</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montoyo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Overview of the eHealth Knowledge Discovery challenge at IberLEF 2021</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>233</fpage>
          –
          <lpage>242</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>On the implications of the General Data Protection Regulation on the organisation of evaluation tasks</article-title>
          .
          <source>Language and Law / Linguagem e Direito</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>80</fpage>
          –
          <lpage>102</lpage>
          (
          <year>2018</year>
          ), https://ojs.letras.up.pt/index.php/LLLD/article/view/6119
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rodríguez-Sánchez</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrillo-de-Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaza</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Comet</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Donoso</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Overview of EXIST 2021: sexism identification in social networks</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>195</fpage>
          –
          <lpage>207</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Taulé</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ariza</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nofre</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amigó</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Overview of DETOXIS at IberLEF 2021: Detection of toxicity in comments in Spanish</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <fpage>209</fpage>
          –
          <lpage>221</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>