UniOfGalway@IberLEF 2024: Hope Speech Recognition in Spanish: A Comparative Analysis of Transformer-Based Models

Arunraj Subburaj1,*,†, Amirthagadeshwaran Kathiresan1,†, Rahul Ponnusamy2, Paul Buitelaar2 and Bharathi Raja Chakravarthi3

1 University of Galway, Ireland
2 Data Science Institute, University of Galway, Ireland
3 School of Computer Science, University of Galway, Ireland

Abstract
Hope speech detection in social media is crucial for fostering positive engagement and resilience within marginalized communities. By utilizing machine learning and NLP techniques, researchers have focused on identifying hope speech in online content, achieving F1-scores between 76% and 84%. The detection of hope speech contributes to understanding how social media platforms can foster either a hopeful or a hostile environment, influencing community sentiments and individual identities. Recognizing and promoting hope speech is essential for enhancing interaction quality and support among users, especially within Equality, Diversity, and Inclusion initiatives, emphasizing the significance of hope in digital communication for building inclusive and supportive online ecosystems.

Keywords
Hope Speech Detection, Deep Learning, Natural Language Processing (NLP), Equality, Diversity, and Inclusion (EDI)

1. Introduction

Hope is a significant psychological construct that influences human emotions, behavior, and overall well-being. Individuals with high levels of hope tend to exhibit resilience in the face of challenges, viewing them as opportunities for growth, which can lead to positive outcomes such as academic success and reduced levels of depression [1]. Despite the profound impact of hope on individuals, its exploration within the realm of Natural Language Processing (NLP) has been limited [2].
Recent efforts have been made to promote research on hope through shared tasks, particularly focusing on hope speech detection in various domains, including contexts related to Equality, Diversity, and Inclusion (EDI) [3]. Social media platforms have become integral in shaping human behavior and decision-making processes, with a growing number of individuals engaging with these platforms [4]. The increased usage of social media has provided opportunities for informed decision-making based on social media sentiment, particularly influencing marginalized groups such as women in STEM fields, the LGBTQ+ community, racial minorities, and individuals with disabilities. While social media serves as a vital source of support and affirmation for many minority groups, it also presents risks, especially for young internet users [4]. In the computational domain, various methods have been developed to identify hope speech, distinguishing it from neutral or non-hopeful content. Techniques such as deep learning, transformer models, and linguistic features have been employed for hope-speech detection, emphasizing the importance of online interactions in shaping individuals' identities and perceptions of society [5].

IberLEF 2024, September 2024, Valladolid, Spain.
* Corresponding author.
† These authors contributed equally.
arunraj.subburaj@gmail.com (A. Subburaj); amirthagadesh317@gmail.com (A. Kathiresan); rahulponnusamy160032@gmail.com (R. Ponnusamy); paul.buitelaar@nuigalway.ie (P. Buitelaar); bharathi.raja@universityofgalway.ie (B. R. Chakravarthi)
ORCID: 0009-0002-6971-1997 (A. Subburaj); 0009-0006-3608-3735 (A. Kathiresan); 0000-0001-8023-7742 (R. Ponnusamy); 0000-0002-4575-7934 (B. R. Chakravarthi)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.
Social media platforms often exhibit two prevalent tones: hope and hate, with hope speech characterized by a positive outlook anticipating favorable outcomes. Detecting hope speech in social media is crucial as it provides comfort and empathy for marginalized groups seeking relatable content [6]. Hope plays a fundamental role in human psychology, impacting emotions, behavior, and well-being. Leveraging NLP techniques to analyze social media data for hope speech detection can offer valuable insights into human behavior, decision-making, and the promotion of positive and inclusive online environments, particularly for vulnerable communities like the LGBTQ+ community. Further research in this area is essential to harness the potential of hope in fostering well-being and resilience among diverse populations.

This paper presents a novel deep-learning method for detecting hope speech and identifies effective psycho-linguistic and linguistic features for the Spanish language. Section 2 details the Literature Review, Section 3 describes the task and dataset statistics, while Section 4 outlines the methodology employed. Sections 5 and 6 discuss the results and conclusions derived from the study.

2. Literature Review

Hope speech detection is a critical area of research, particularly in the realm of social media and its impact on minority communities. Several studies have delved into this area, emphasizing the importance of leveraging advanced technologies like deep learning and transformer-based models for effective hope speech detection. While Palakodety et al. [7] provide a foundational approach to hope speech detection, their model's limited linguistic diversity raises questions about its applicability across different cultural contexts. This study extends their framework by incorporating a wider range of linguistic and cultural nuances, particularly focusing on the Spanish-speaking LGBTQI community, thus addressing the noted limitations. Additionally, Balouchzahi et al.
[3] and Khanna et al. [6] demonstrated a robust methodology utilizing BERT embeddings for hope speech detection. However, their studies did not explore the impact of informal language and slang frequently used in social media, which could affect the generalizability of their models. Our study addresses this gap by incorporating an enhanced preprocessing step that adapts to informal language variations, thereby improving the model's performance across diverse social media texts. These studies underscore the importance of utilizing advanced technologies for effective hope speech detection [8].

Hope recognition in Spanish has gained significant attention in recent shared tasks such as LT-EDI-EACL and IberLEF. Various transformer-based models have been developed to detect hope speech in social media content. Transformer-based models, such as mBERT and XLM, have demonstrated remarkable performance in cross-lingual understanding tasks [9]. These models utilize large Transformer models pre-trained on multiple languages, showcasing their effectiveness in handling multilingual data. Additionally, XLM-RoBERTa language models have proven successful in detecting hope speech in social media [6]. Teams like giniUs and LPS have contributed to this field by presenting transformer-based models for hope speech detection, emphasizing the importance of identifying hopeful comments for equality, diversity, and inclusion [10]. Shared tasks such as LT-EDI-EACL 2022 focus on automatically identifying hopeful comments to promote positive sentiments and well-being in online interactions [11]. The significance of hope speech detection for equality, diversity, and inclusion is underscored in various shared tasks and workshops [12, 13, 14]. These initiatives aim to identify hope speech in comments to foster positivity and support among individuals.
Furthermore, the utilization of Transformer models in speech recognition systems has been widely recognized for their capacity to parallelize tasks and integrate internal attention mechanisms [15, 16]. In addition, the research by Chakravarthi [17] showcases the success of models in identifying and eliminating negativity, demonstrating the potential of automated systems in fostering positive discourse. Moreover, Sidorov et al. [18] and García-Baena et al. [19] explore the effectiveness of transformer models in detecting hope speech within the LGBT community. While their findings are promising, they fall short in addressing the dynamic nature of language evolution on social media platforms. These studies also emphasize the effectiveness of transformer models and context-aware resources in detecting hope speech, especially in multilingual settings like Spanish and English.

Theoretical frameworks supporting hope speech detection often involve sentiment analysis and hate speech detection. Studies by Laaksonen et al. [20] and Zhu [12] discuss the datafication of hate speech and the importance of hate speech detection in society. These frameworks provide a foundation for understanding the context in which hope speech detection operates, highlighting the necessity of distinguishing between positive and negative speech content. The research landscape on hope speech detection is diverse, encompassing deep-learning models and multilingual approaches, all aimed at promoting positivity and inclusivity in online dialogues. By leveraging advanced technologies and theoretical frameworks from related fields such as sentiment analysis and hate speech detection, researchers are making significant progress in identifying and promoting hope speech on social media platforms.

3. Task and Dataset Description

Hope speech detection within the LGBTQI community and across various domains is a crucial task with practical applications that necessitate models to generalize effectively.
Numerous studies have investigated related areas, offering valuable insights for this task. The datasets used in this study were sourced from the Language Technology for Equality, Diversity, and Inclusion (IberLEF-2024) dataset, made available by IberLEF 2024 for the shared task on Hope Speech Detection for Equality, Diversity, and Inclusion [21, 22, 19, 8, 18]. The IberLEF-2024 dataset plays a significant role in advancing language technology applications that support these important societal goals. It comprises 1,400 labeled tweets for training and an additional 200 labeled tweets for validation purposes. Additionally, it includes 400 unlabeled tweets intended for testing the model's generalization capabilities. Each tweet within the dataset is labeled with one of two categories: 'hope speech' (hs) or 'non-hope speech' (nhs). Table 1 provides detailed statistics for each set and label of the Spanish dataset used in this study, including the distribution of hope speech and non-hope speech across the training, development, and test sets. This balanced dataset is crucial for training unbiased models and reflects the diverse nature of language use in social media, which is pivotal for the accurate detection of hope speech.

Table 1
Dataset Statistics

Data          Classes                  Count of Tweets - Spanish
Training      Hope Speech (HS)         700
              Non Hope Speech (NHS)    700
Development   Hope Speech (HS)         100
              Non Hope Speech (NHS)    100
Test          -                        400
Total                                  2,000

To understand the common terms used in hope speech (HS) and non-hope speech (NHS) tweets, word clouds were generated for both categories (Figure 1). These visualizations highlight the most frequent terms, providing insight into the linguistic features that may differentiate HS from NHS tweets.
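The term-frequency counting that underlies such word clouds can be sketched with the Python standard library. The `top_terms` helper, the stopword set, and the example tweets below are illustrative stand-ins, not drawn from the dataset or from the word-cloud library used to render the figures:

```python
from collections import Counter
import re

def top_terms(tweets, stopwords=frozenset(), n=5):
    """Count the most frequent terms across a list of tweets.

    A word cloud sizes each term in proportion to this count.
    """
    counts = Counter()
    for tweet in tweets:
        counts.update(
            tok for tok in re.findall(r"[a-záéíóúñü]+", tweet.lower())
            if tok not in stopwords
        )
    return counts.most_common(n)

# Invented placeholder tweets for the hope-speech class.
hs_tweets = [
    "Cada persona lgtbi merece ser feliz",
    "Ser visible da esperanza a cada persona",
]
print(top_terms(hs_tweets, stopwords={"a", "da", "cada"}))
```

In practice the resulting counts are passed to a plotting library, which renders each term at a size proportional to its frequency.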
The HS word cloud prominently features terms like "ser," "lgtbi," and "persona," indicating a focus on identity and inclusivity. In contrast, the NHS word cloud also highlights "lgtbi" and "persona" but includes more varied terms reflecting broader discourse.

1 https://codalab.lisn.upsaclay.fr/competitions/17714
2 https://codalab.lisn.upsaclay.fr/competitions/17714#participate-get_data

Figure 1: Word clouds for HS and NHS tweets

Analyzing the text length distribution helps in understanding the typical length of tweets in each category, which can inform preprocessing steps such as padding or truncation during model training. The text length distribution for the validation set shows that HS tweets are often longer, with a concentration around 275 characters (Figure 3). NHS tweets in the validation set exhibit a wider distribution, with peaks around 150 and 275 characters. The training set exhibits a similar pattern, with HS tweets predominantly longer and NHS tweets showing more variation in length (Figure 2). The test set shows a varied distribution of tweet lengths, with a notable peak around 275 characters, indicating that many tweets are close to the maximum allowed length on Twitter (Figure 4).

Figure 2: Training Set Length Distribution
Figure 3: Validation Set Length Distribution
Figure 4: Test Set Length Distribution

These visual analyses provide a comprehensive understanding of the dataset characteristics, aiding in the development of effective NLP models for hope speech detection. From these figures, we can see that the length and content of tweets play a crucial role in distinguishing between hope speech and non-hope speech.

4. Methodology

4.1. Data Preprocessing

The text data preprocessing steps involved removing numerical characters, stripping punctuation, converting text to lowercase, eliminating Spanish stopwords using NLTK, and removing emojis with regular expressions.
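A minimal sketch of these preprocessing steps, assuming a small stand-in stopword set (the actual pipeline uses NLTK's full Spanish list) and a deliberately rough emoji pattern:

```python
import re
import string

# Illustrative subset of NLTK's Spanish stopword list (stand-in only).
SPANISH_STOPWORDS = {"de", "la", "el", "que", "y", "en", "un", "los", "las", "es"}

# Rough pattern covering common emoji / pictograph code points (not exhaustive).
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def preprocess(text: str) -> str:
    text = re.sub(r"\d+", "", text)  # remove numerical characters
    text = text.translate(str.maketrans("", "", string.punctuation))  # strip punctuation
    text = text.lower()  # convert to lowercase
    text = EMOJI_PATTERN.sub("", text)  # remove emojis
    tokens = [t for t in text.split() if t not in SPANISH_STOPWORDS]  # drop stopwords
    return " ".join(tokens)
```

For example, `preprocess("La comunidad LGTBI es fuerte!!! 100% 😀")` yields `"comunidad lgtbi fuerte"`.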
These procedures aimed to enhance the dataset quality for machine learning applications by focusing on linguistic content and meaningful words while maintaining consistency.

4.2. Data Encoding and Batching

Following preprocessing, the text data was tokenized using the AutoTokenizer from the Hugging Face transformers library. The tokenizer converted the text into a sequence of tokens, which were then padded to a consistent length of 128 tokens to ensure uniform input size. These tokenized sequences, along with attention masks indicating real tokens versus padding, and the encoded labels, were encapsulated in a TensorDataset. DataLoaders were subsequently created with a batch size of 16 to facilitate efficient processing during model training.

Figure 5: Methodology Workflow

4.3. Model Training and Evaluation

4.3.1. Model Configuration

The models utilized in this study were initialized from pre-trained configurations using AutoModelForSequenceClassification. This allowed for leveraging the pre-trained knowledge and fine-tuning the models for the specific binary classification task at hand, distinguishing between 'hope speech' and 'non-hope speech'.

4.3.2. Training

The model training phase involved training the models over 10 epochs. The AdamW optimizer was employed with an initial learning rate of 5×10^-5. Additionally, a linear scheduler was utilized to dynamically adjust the learning rate during training based on the model's progress. The training process consisted of forward and backward passes to compute the loss and update the model parameters through gradient descent, optimizing the model for the binary classification task.

4.3.3. Comparative Approaches Explored

Following the training phase, the trained models were evaluated on a balanced validation set.
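The fixed-length padding and attention-mask scheme described in Section 4.2 can be illustrated with a toy whitespace tokenizer. The `encode` function below is a stand-in for the Hugging Face AutoTokenizer actually used; its vocabulary ids are arbitrary:

```python
def encode(texts, max_len=128, pad_id=0):
    """Toy encoder mimicking the fixed-length padding and attention masks
    produced by a real tokenizer (1 marks a real token, 0 marks padding)."""
    vocab = {}
    input_ids, attention_masks = [], []
    for text in texts:
        ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in text.split()]
        ids = ids[:max_len]  # truncate to the maximum sequence length
        mask = [1] * len(ids) + [0] * (max_len - len(ids))
        input_ids.append(ids + [pad_id] * (max_len - len(ids)))
        attention_masks.append(mask)
    return input_ids, attention_masks
```

In the actual pipeline, the resulting id and mask tensors, together with the encoded labels, were wrapped in a TensorDataset and served through DataLoaders with a batch size of 16.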
Our experiments included a range of models such as bert-base-spanish-wwm-cased-finetuned-spa-squad2-es, bert-base-spanish-wwm-cased, bert-base-spanish-wwm-uncased [23], xlm-roberta-base [24], roberta-base-bne [25] and electricidad-base-discriminator [26]. Moreover, our methodology involved utilizing deep neural network transformer models with contextually sensitive lexical augmentation to enhance the training datasets, thereby generating additional training samples. This approach aligns with current trends in natural language processing research, emphasizing the significance of data augmentation for enhancing model performance. Performance metrics such as accuracy and F1-score were computed to assess the models' effectiveness in distinguishing between 'hope speech' and 'non-hope speech'. Furthermore, detailed classification reports were generated to provide insights into each model's performance, highlighting its ability to correctly classify instances of 'hope speech' and 'non-hope speech'. Model performances were visualized using bar charts to facilitate a comparative analysis of different models based on their accuracy and F1-scores. This visual representation aided in identifying the most effective models for hope speech detection.

5. Results

5.1. Overview of Model Performance

The study evaluates the performance of six transformer-based models on the task of detecting hope speech within tweets from the LGBTQI community. Each model was tested on a balanced validation set, with tweets labeled as 'hope speech' (hs) or 'non-hope speech' (nhs). The models varied in their approach and specialization in handling Spanish-language texts. Below, we present a summary of their performance in terms of precision, recall, F1-score, and overall accuracy.

5.2. Performance Metrics

The performance of the models is presented in Table 2.
This table captures detailed metrics including precision, recall, F1-score, and overall accuracy, assessed on a balanced validation set consisting of tweets categorized into 'hope speech' (hs) and 'non-hope speech' (nhs).

5.3. Analysis of Results

5.3.1. Performance Discrepancies Across Models

The dccuchile/bert-base-spanish-wwm-cased model exhibited superior performance with the highest overall accuracy (0.84). This model achieved particularly high recall in detecting hope speech (0.91) and high precision in identifying non-hope speech (0.90). The high recall for hope speech indicates that the model captures most instances of hope speech, thereby reducing false negatives (hope speech incorrectly labeled as non-hope speech). The high precision for non-hope speech suggests that when the model labels a tweet as non-hope speech, it is very likely correct, making it a reliable choice for applications where mislabeling hopeful content is particularly problematic.

Conversely, the mrm8488/electricidad-base-discriminator model, while still performing adequately, showed the lowest accuracy among the tested models (0.76). Despite this, its balanced precision and recall metrics indicate a conservative approach to classification, neither overly penalizing nor overly rewarding any particular class. This balance makes it suitable for scenarios where maintaining a moderate level of detection across categories is more crucial than achieving high performance in one at the expense of the other.

5.3.2. Effectiveness in Linguistic Context Handling

Models like xlm-roberta-base and PlanTL-GOB-ES/roberta-base-bne performed comparably well, with accuracies of 0.82. These models showed a balanced capability in handling both hope and non-hope speech, which reflects their robustness in linguistic context handling.
Their performance underscores the capability of transformer-based models to adapt to the subtleties of language used in different social contexts, which is essential for applications deployed across diverse social media platforms.

Table 2
Model Performance Evaluation

Methods                                                      Metrics     Hope   Non-Hope   Macro Avg   Weighted Avg
dccuchile/bert-base-spanish-wwm-cased                        Precision   0.81   0.90       0.85        0.84
                                                             Recall      0.91   0.78       0.84        0.84
                                                             F1-score    0.85   0.83       0.84        0.84
dccuchile/bert-base-spanish-wwm-uncased                      Precision   0.79   0.85       0.82        0.81
                                                             Recall      0.86   0.77       0.81        0.81
                                                             F1-score    0.82   0.81       0.81        0.81
xlm-roberta-base                                             Precision   0.84   0.82       0.83        0.82
                                                             Recall      0.81   0.84       0.82        0.82
                                                             F1-score    0.82   0.83       0.82        0.82
PlanTL-GOB-ES/roberta-base-bne                               Precision   0.83   0.81       0.82        0.82
                                                             Recall      0.81   0.83       0.82        0.82
                                                             F1-score    0.82   0.82       0.82        0.82
mrm8488/electricidad-base-discriminator                      Precision   0.73   0.80       0.76        0.76
                                                             Recall      0.82   0.70       0.76        0.76
                                                             F1-score    0.77   0.74       0.76        0.76
mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es  Precision   0.76   0.85       0.81        0.80
                                                             Recall      0.87   0.73       0.80        0.80
                                                             F1-score    0.81   0.78       0.80        0.80

5.3.3. Implications of Model Specificities

The nuanced differences in model performance also shed light on the implications of model architecture and training specifics. For instance, models fine-tuned on Spanish-language corpora or those specifically optimized for question-answering tasks (like mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es) indicate that domain-specific training can influence model behavior significantly, especially in terms of how they interpret the contextual and emotional nuances of a text.

6. Conclusion

This research has demonstrated the efficacy of deep learning models in the detection of hope speech across social media platforms, underscoring their potential to foster positive online environments.
Practically, these models can be integrated into social media monitoring tools to automatically flag and promote hopeful content, thereby enhancing user engagement and creating supportive online communities. Additionally, organizations focused on mental health and community support can utilize these models to identify and amplify positive messages, contributing to the well-being of marginalized groups. By employing advanced NLP techniques, the study has effectively identified linguistic patterns indicative of hope speech, which is particularly crucial for supporting marginalized communities. The application of models like BERT and RoBERTa has revealed not only high accuracy in classifying text but also the ability to adapt to the nuanced expressions of hope in different linguistic contexts.

Furthermore, the results highlight the importance of continued development in machine learning to enhance the sensitivity and specificity of these models. This is vital for reducing false positives and ensuring that the detection of hope speech does not inadvertently suppress free expression. The research also calls attention to the need for comprehensive datasets that reflect the diversity of language use across various demographics and geographies to improve model generalization.

In conclusion, while significant strides have been made, the path forward involves refining these technologies through rigorous testing and broadening their linguistic and cultural scope. This ongoing work will contribute to more inclusive and supportive online communities, ultimately using technology to elevate the quality of social discourse and resilience among users.

6.1. Future Research Directions

Given the observed performance disparities, future research could investigate combining the strengths of these models through ensemble techniques, which could leverage the high precision of one model and the high recall of another to achieve better overall performance.
Moreover, exploring additional linguistic features and integrating socio-linguistic context more deeply could further enhance the models' ability to discern hope speech from varied textual inputs.

Acknowledgement

Author Bharathi Raja Chakravarthi was supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2 (Insight_2). Rahul Ponnusamy was supported in part by a research grant from Science Foundation Ireland Centre for Research Training in Artificial Intelligence under Grant No. 18/CRT/6223.

References

[1] R. Vogel, F. Hattke, A century of Public Administration: Traveling through time and topics, Public Administration 100 (2022) 17–40.
[2] S. Fyffe, P. Lee, S. Kaplan, "Transforming" personality scale development: Illustrating the potential of state-of-the-art natural language processing, Organizational Research Methods 27 (2024) 265–300.
[3] F. Balouchzahi, S. Butt, G. Sidorov, A. Gelbukh, CIC@LT-EDI-ACL2022: Are transformers the only hope? Hope speech detection for Spanish and English comments, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022, pp. 206–211.
[4] T. Nath, V. K. Singh, V. Gupta, BongHope: An Annotated Corpus for Bengali Hope Speech Detection (2023).
[5] B. R. Chakravarthi, Hope speech detection in YouTube comments, Social Network Analysis and Mining 12 (2022) 75.
[6] D. Khanna, M. Singh, P. Motlicek, IDIAP_TIET@LT-EDI-ACL2022: Hope Speech Detection in Social Media using Contextualized BERT with Attention Mechanism, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022, pp. 321–325.
[7] S. Palakodety, A. R. KhudaBukhsh, J. G. Carbonell, Hope speech detection: A computational analysis of the voice of peace (2019).
[8] F. Balouchzahi, G. Sidorov, A. Gelbukh, Polyhope: Two-level hope speech detection from tweets, Expert Systems with Applications 225 (2023) 120078.
doi:10.1016/j.eswa.2023.120078.
[9] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, arXiv preprint arXiv:1911.02116 (2019).
[10] H. Surana, B. Chinagundi, giniUs@LT-EDI-ACL2022: Aasha: transformers based hope-edi, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022, pp. 291–295.
[11] B. R. Chakravarthi, B. Bharathi, J. P. McCrae, M. Zarrouk, K. Bali, P. Buitelaar, Proceedings of the second workshop on language technology for equality, diversity and inclusion, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022.
[12] Y. Zhu, LPS@LT-EDI-ACL2022: an ensemble approach about hope speech detection, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022, pp. 183–189.
[13] K. Puranik, A. Hande, R. Priyadharshini, S. Thavareesan, B. R. Chakravarthi, IIITT@LT-EDI-EACL2021-hope speech detection: there is always hope in transformers, arXiv preprint arXiv:2104.09066 (2021).
[14] B. R. Chakravarthi, V. Muralidaran, R. Priyadharshini, S. Cn, J. P. McCrae, M. Á. García, S. M. Jiménez-Zafra, R. Valencia-García, P. Kumaresan, R. Ponnusamy, et al., Overview of the shared task on hope speech detection for equality, diversity, and inclusion, in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, 2022, pp. 378–388.
[15] M. Orken, O. Dina, A. Keylan, T. Tolganay, O. Mohamed, A study of transformer-based end-to-end speech recognition system for kazakh language, Scientific Reports 12 (2022) 8337.
[16] L. B. Letaifa, J.-L. Rouas, Transformer model compression for end-to-end speech recognition on mobile devices, in: 2022 30th European Signal Processing Conference (EUSIPCO), IEEE, 2022, pp. 439–443.
[17] B. R.
Chakravarthi, Multilingual hope speech detection in English and Dravidian languages, International Journal of Data Science and Analytics 14 (2022) 389–406.
[18] G. Sidorov, F. Balouchzahi, S. Butt, A. Gelbukh, Regret and hope on transformers: An analysis of transformers on regret and hope speech detection datasets, Applied Sciences 13 (2023) 3983.
[19] D. García-Baena, M. Á. García-Cumbreras, S. M. Jiménez-Zafra, J. A. García-Díaz, R. Valencia-García, Hope speech detection in Spanish: The LGBT case, Language Resources and Evaluation 57 (2023) 1487–1514.
[20] S.-M. Laaksonen, J. Haapoja, T. Kinnunen, M. Nelimarkka, R. Pöyhtäri, The datafication of hate: Expectations and challenges in automated hate speech monitoring, Frontiers in Big Data 3 (2020) 3.
[21] L. Chiruzzo, S. M. Jiménez-Zafra, F. Rangel, Overview of IberLEF 2024: Natural Language Processing Challenges for Spanish and other Iberian Languages, in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024), co-located with the 40th Conference of the Spanish Society for Natural Language Processing (SEPLN 2024), CEUR-WS.org, 2024.
[22] D. García-Baena, F. Balouchzahi, S. Butt, M. Á. García-Cumbreras, A. Lambebo Tonja, J. A. García-Díaz, S. Bozkurt, B. R. Chakravarthi, H. G. Ceballos, R. Valencia-García, G. Sidorov, L. A. Ureña-López, S. M. Jiménez-Zafra, Overview of HOPE at IberLEF 2024: Approaching hope speech detection in social media from two perspectives, for equality, diversity and inclusion and as expectations, Procesamiento del Lenguaje Natural 73 (2024).
[23] J. Cañete, G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, J. Pérez, Spanish Pre-Trained BERT Model and Evaluation Data, in: PML4DC at ICLR 2020, 2020.
[24] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised Cross-lingual Representation Learning at Scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.
[25] A.
Gutiérrez-Fandiño, J. Armengol-Estapé, M. Pàmies, J. Llop-Palao, J. Silveira-Ocampo, C. P. Carrino, C. Armentano-Oller, C. R. Penagos, A. Gonzalez-Agirre, M. Villegas, MarIA: Spanish Language Models, Proces. del Leng. Natural 68 (2021) 39–60. URL: https://api.semanticscholar.org/CorpusID:252847802.
[26] M. Romero, Spanish Electra by Manuel Romero, https://huggingface.co/mrm8488/electricidad-base-discriminator/, 2020.