<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>I2C-UHU at IberLEF-2023 HOMO-MEX task: Ensembling Transformers Models to Identify and Classify Hate Messages Towards the Community LGBTQ+</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio José Morano Moriña</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Javier Román Pásaro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacinto Mata Vázquez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victoria Pachón Álvarez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>I2C Research Group, University of Huelva</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the approaches proposed by the I2C Group to address the IberLEF-2023 Task HOMO-MEX: Hate speech detection in Online Messages directed tOwards the MEXican spanish speaking LGBTQ+ population. The major contribution is the demonstration of the effectiveness of an ensemble of transformer-based classifiers. By combining multiple models, their individual strengths were leveraged, resulting in improved performance compared to using a single model. Furthermore, the results underscored the significance of selecting appropriate hyperparameters during model training. Through experimentation and evaluation of different hyperparameter combinations, the settings that reached the best performance for the given tasks were identified. In our experiments for both tasks, we tested several models and decided to ensemble the three models that provided the best F1-Score for this dataset. Additionally, for Task 2 we decided to train individual binary classifiers for each class instead of a single multilabel classifier. The model submitted for Task 1 achieved an F1-Score of 83.25%, ranking 6th in the competition. The model for Task 2 reached an F1-Score of 69.60%, ranking 1st in the competition.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Transformers</kwd>
        <kwd>Ensembler</kwd>
        <kwd>Hyperparameter</kwd>
        <kwd>Twitter</kwd>
        <kwd>LGBT-Phobia</kwd>
        <kwd>Hate Speech Detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In today’s digital era, natural language processing (NLP) has become an essential discipline
for understanding and analyzing the vast amount of information generated on social media
platforms. The ability to extract meaningful knowledge from textual data is crucial for various
fields, including social research, political decision-making, and the detection of social issues. In
this context, the detection of phobic comments towards the LGBTQ+ community has gained
increasing importance due to the need to promote inclusion, respect, and equality online.</p>
      <p>
        This paper presents our research on developing a system for detecting phobic comments
towards the LGBTQ+ community using natural language processing techniques as part of the
HOMO-MEX: Hate speech detection towards the Mexican Spanish speaking LGBTQ+ population
from IberLEF 2023 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] task. Given the success and popularity of Transformers models [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
all the developed models are based on this technology. In order to get our final results, we
trained three models and built an ensemble [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to improve classifier performance in both tasks.
Additionally, for the multilabel task, we decided to use individual binary classifiers instead of a
multilabel classifier.
      </p>
      <p>The next section describes some previous studies. Section 3 describes Tasks 1
and 2 and the corpus provided by the organizers. The experimental methodology and evaluation
results can be found in Sections 4 and 5, and Section 6 presents an error analysis. Finally,
Section 7 presents the conclusions of our study and some perspectives for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        In recent years, the field of Natural Language Processing (NLP) has witnessed significant
advancements, particularly with the advent of transformer models. These models, such as
BERT [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], GPT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and RoBERTa [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], have revolutionized the way we process and understand
text, enabling us to tackle complex linguistic tasks with unprecedented accuracy. One crucial
application of NLP technology is the detection of hate messages and discriminatory content,
particularly those targeting marginalized communities like the LGBTQ+ community.
      </p>
      <p>
        Several recent investigations have focused on leveraging transformer models for detecting
hate messages against the LGBTQ+ community. For example, [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] explored the use of
pretrained transformer models for hate speech detection and found that fine-tuning these models
on annotated LGBTQ+ hate speech datasets significantly improved their performance. By
leveraging the contextualized representations learned by transformer models, the researchers
were able to capture the subtle nuances and linguistic patterns indicative of hate speech.
      </p>
      <p>
        These recent investigations showcase the potential of transformer-based models in detecting
hate messages against the LGBTQ+ community. By training these models on large, annotated
datasets and fine-tuning them specifically for hate speech detection, researchers have achieved
significant advancements in accurately identifying and categorizing discriminatory content.
The use of transformer models has proven instrumental in capturing the intricate linguistic
characteristics of hate speech, allowing for more effective moderation of online platforms, the
protection of vulnerable communities, and the promotion of a safer and more inclusive digital
environment [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Datasets and Tasks</title>
      <p>The corpus provided by the organizers is described at Codalab
(https://codalab.lisn.upsaclay.fr/competitions/10019). This corpus contains two datasets, one
per task:
• The first one consists of 7000 tweets, each comprising an identifier, the tweet text, and the label
of the instance. Three different labels were defined: LGBT+phobic (P), not LGBT+phobic
(NP), or not LGBT+related (NA). Since the organizers provided only one dataset, we
decided to divide it into training (80%), validation (14%), and test (6%) sets.</p>
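      <p>The 80/14/6 partition can be sketched as follows. This is a minimal illustration only: the source does not state whether the split was stratified or which random seed was used, so the plain shuffle, the function name <monospace>split_dataset</monospace>, and the seed value are assumptions.</p>
      <preformat><![CDATA[
```python
import random

def split_dataset(items, train=0.80, val=0.14, seed=42):
    """Shuffle items and cut them into train/validation/test partitions.

    The seed and the non-stratified shuffle are assumptions for illustration.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # remainder becomes the test set
    )

# 7000 stand-in tweet identifiers, matching the size of the Task 1 dataset.
train_set, val_set, test_set = split_dataset(range(7000))
print(len(train_set), len(val_set), len(test_set))  # 5600 980 420
```
]]></preformat>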
      <table-wrap>
        <table>
          <thead>
            <tr>
              <th>Tweet</th>
              <th>Label</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Nada más peligroso que un joto con autoestima demasiado alto! (Nothing more dangerous than a gay man with excessively high self-esteem!)</td>
              <td>P</td>
            </tr>
            <tr>
              <td>@marisita_parra entonces ser homosexual es no tener valores? No sé de que hablas. (@marisita_parra is being homosexual synonymous with having no values? I don’t know what you’re talking about.)</td>
              <td>NP</td>
            </tr>
            <tr>
              <td>Esta noche es perfecta para volverte loca (Tonight is perfect to drive you crazy.)</td>
              <td>NA</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>• For the second task, the dataset contains 863 tweets, with the same information and five
different labels: Lesbophobia (L), Gayphobia (G), Biphobia (B), Transphobia (T), and/or
other LGBT+phobia (O).</p>
      <table-wrap>
        <table>
          <thead>
            <tr>
              <th>Tweet</th>
              <th>G</th>
              <th>L</th>
              <th>B</th>
              <th>T</th>
              <th>O</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Quieren un mundo #SinHomofobia pues que desaparezcan los jotos, maricones, putos, gays, lesbianas, machorras, tortilleras y demás sinónimos (They want a world #WithoutHomophobia so let the homosexuals, fags, hustlers, gays, lesbians, butchers, dykes and other synonyms disappear.)</td>
              <td>1</td>
              <td>1</td>
              <td>0</td>
              <td>0</td>
              <td>1</td>
            </tr>
            <tr>
              <td>Me reeemputa que dejen jugar mujeres trans en torneos femeniles, como vergas bloqueas a un cabron de 1.80 que pesa el doble que tú y tiene el triple de fuerza (It pisses me off that they let trans women play in women’s tournaments, how the fuck do you block a 1.80 m motherfucker who weighs twice as much as you and is three times as strong?)</td>
              <td>0</td>
              <td>0</td>
              <td>0</td>
              <td>1</td>
              <td>0</td>
            </tr>
            <tr>
              <td>¿Cómo qué hay mujeres trans lesbianas? ¿Para que se hizo trans si va a ser lesbiana? No tiene lógica. (Why are there lesbian trans women? Why did she become trans if she’s going to be a lesbian? It doesn’t make sense.)</td>
              <td>0</td>
              <td>1</td>
              <td>0</td>
              <td>1</td>
              <td>0</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>This section outlines the methodology employed in this study, which consisted of several key
steps. First, due to the lack of data in the phobic class, a data augmentation approach based
on the backtranslation technique was used. Second, a hyperparameter search was conducted
to identify the optimal training parameters for this particular task. Finally, a classification
model was created by ensembling the three best models found and applying a hard voting
approach in order to enhance performance.</p>
      <p>
        Because the datasets are in Spanish, pre-trained Spanish models were primarily
used. However, given that Mexican Spanish contains a significant amount
of English-derived vocabulary, a multilingual model was also chosen to explore alternative
options. The selected pre-trained models, obtained from the Hugging Face Transformers library
(https://huggingface.co/), were:
• dccuchile/bert-base-spanish-wwm-uncased [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This model (BETO) is a Spanish version of BERT.
• PlanTL-GOB-ES/roberta-base-bne [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This model is based on the RoBERTa base model
and has been pre-trained using the largest Spanish corpus known to date
• xlm-roberta-base [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This model is a multilingual version of RoBERTa.
      </p>
      <p>To compare the results obtained by the different models and strategies, a baseline
based on the selected pre-trained models was proposed. Given that the optimal values of the
hyperparameters cannot be known beforehand, some of the values most frequently used
for fine-tuning pre-trained language models were employed: batch size of 32, learning
rate of 5e-5, max length of 128, and weight decay of 0.001. Tables 5 and 6 show the baseline
results of the different models for Tasks 1 and 2.</p>
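      <p>For reference, the baseline settings stated above can be collected in one small configuration object per model. This is only an illustrative sketch: the class name <monospace>BaselineConfig</monospace> is hypothetical, and settings the text does not mention (for example, the number of epochs) are deliberately left out.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaselineConfig:
    """Baseline fine-tuning settings as stated in the text (hypothetical container)."""
    model_name: str
    batch_size: int = 32
    learning_rate: float = 5e-5
    max_length: int = 128
    weight_decay: float = 0.001

# Model identifiers exactly as listed in the text.
MODEL_NAMES = [
    "dccuchile/bert-base-spanish-wwm-uncased",
    "PlanTL-GOB-ES/roberta-base-bne",
    "xlm-roberta-base",
]

# One baseline configuration per pre-trained model.
baselines = [BaselineConfig(name) for name in MODEL_NAMES]
```
]]></preformat>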
      <sec id="sec-4-1">
        <title>4.1. Data Pre-processing</title>
        <p>The data pre-processing consisted of removing links, usernames, hashtag
symbols ’#’, and emojis. Additionally, we created a dictionary of synonyms
(https://es.wiktionary.org/wiki/Wikcionario:homosexual/Tesauro) in which words specific
to Mexican Spanish were replaced with more common alternatives that had the same
meaning but fit into the vocabulary of the pre-trained models. Tables 7 and 8 show the results
achieved after processing the tweet texts. As can be seen, this pre-processing
improved the results obtained with the baselines.</p>
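        <p>The cleaning steps above can be sketched with regular expressions. The patterns below are assumptions (the emoji ranges in particular are approximate), and the two entries in <monospace>SYNONYMS</monospace> are hypothetical examples, since the authors' actual dictionary is not reproduced in the text.</p>
        <preformat><![CDATA[
```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
USER_RE = re.compile(r"@\w+")
# Approximate emoji coverage (an assumption, not an exhaustive definition).
EMOJI_RE = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F]")

# Hypothetical entries: Mexican Spanish term -> more common alternative.
SYNONYMS = {"joto": "homosexual", "vatos": "hombres"}

def preprocess(text, synonyms=SYNONYMS):
    """Remove links, usernames, '#' symbols and emojis, then map synonyms."""
    text = URL_RE.sub(" ", text)    # links first, so '@' inside URLs is gone
    text = USER_RE.sub(" ", text)   # @usernames
    text = text.replace("#", "")    # keep the hashtag word, drop the symbol
    text = EMOJI_RE.sub(" ", text)

    def swap(match):
        word = match.group(0)
        return synonyms.get(word.lower(), word)

    text = re.sub(r"\w+", swap, text)
    return " ".join(text.split())   # collapse repeated whitespace

print(preprocess("@user Mira esto https://t.co/abc #SinHomofobia"))
# Mira esto SinHomofobia
```
]]></preformat>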
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data Augmentation and Hyperparameter Search</title>
        <p>
          To balance the multiclass dataset, data augmentation based on the backtranslation
technique [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] was used. This technique was applied to the class P (Phobic),
doubling the number of phobic instances in the dataset. For the multilabel dataset, backtranslation
was applied to the complete dataset, increasing the positive instances of each label by
50%. Each text was translated from Spanish to English and back again. The pre-trained
model "Helsinki-NLP/opus-mt-es-en" [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] was utilized for the first translation, and the model
"Helsinki-NLP/opus-mt-en-es" [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] was used for the backtranslation.
        </p>
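        <p>The augmentation of the phobic class can be sketched as below. The two translator arguments are placeholders: in practice they would wrap the Helsinki-NLP models named above, whereas here toy lookup tables stand in for them so that the example runs without any model. The function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def back_translate(text, to_english, to_spanish):
    """Spanish -> English -> Spanish round trip using the supplied callables."""
    return to_spanish(to_english(text))

def augment_class(rows, target_label, to_english, to_spanish):
    """Append one back-translated copy of every row carrying target_label."""
    extra = [
        (back_translate(text, to_english, to_spanish), label)
        for text, label in rows
        if label == target_label
    ]
    return rows + extra

# Toy stand-ins for the two translation models (assumptions for illustration).
toy_es_en = {"hola mundo": "hello world"}
toy_en_es = {"hello world": "hola a todos"}

rows = [("hola mundo", "P"), ("buenos dias", "NP")]
augmented = augment_class(
    rows,
    "P",
    lambda text: toy_es_en.get(text, text),
    lambda text: toy_en_es.get(text, text),
)
# The P class now holds twice as many instances; NP is untouched.
```
]]></preformat>
        <p>Because the round trip usually yields a paraphrase rather than the original sentence, the added instances differ lexically from their sources while keeping the same label.</p>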
        <p>The hyperparameter search [16] is a crucial step in model fine-tuning. For this reason,
multiple iterations of training and evaluation were performed using different combinations
of hyperparameters. To reduce training time, the datasets were proportionally
reduced before conducting the experiments. The platform used for this purpose was WandB
(Weights &amp; Biases, wandb.com), which provides a clear graphical interface for tracking and
visualizing machine learning experiments. Table 9 shows the hyperparameter space used in
this experimentation phase.</p>
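        <p>The search loop can be illustrated with a plain grid search. This sketch makes several assumptions: the values in <monospace>SEARCH_SPACE</monospace> are hypothetical and do not reproduce Table 9, the search strategy actually used on the platform is not stated in the text, and <monospace>toy_objective</monospace> merely stands in for fine-tuning on the reduced dataset and returning a validation score.</p>
        <preformat><![CDATA[
```python
import itertools

# Hypothetical search space; the values of Table 9 are not reproduced here.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "batch_size": [16, 32],
    "weight_decay": [0.0, 0.001, 0.01],
}

def grid_search(space, evaluate):
    """Evaluate every combination and return (best_config, best_score)."""
    keys = list(space)
    best_config, best_score = None, float("-inf")
    for values in itertools.product(*(space[key] for key in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_objective(config):
    # Stand-in for "fine-tune and return the validation macro F1-Score".
    return (-abs(config["learning_rate"] - 3e-5)
            - abs(config["batch_size"] - 32) / 1000)

best_config, best_score = grid_search(SEARCH_SPACE, toy_objective)
```
]]></preformat>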
        <p>Table 10 lists the best hyperparameters found for each model. Tables 11 and 12 show
the results of each model using data augmentation and the hyperparameter values from Table
10. The results shown in Tables 11 and 12 indicate the importance of working with a balanced
dataset and performing a proper hyperparameter search for optimal fine-tuning.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Ensemble Approach</title>
        <p>To make the final predictions, a hard voting technique [17] was implemented. The most common
prediction among the models was chosen as the final output, yielding a more robust,
consensus-based prediction. The ensemble [18] and model voting techniques helped enhance
the overall predictive performance by leveraging the strengths and diversity of multiple models.
For both tasks, the models used in the
ensembles were the ones described earlier, namely BETO, RoBERTa, and XLM-RoBERTa. If
the three individual predictions all differ, the final prediction is that of the
model with the highest F1-Score.</p>
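        <p>The voting rule described above can be sketched as follows. The function name <monospace>hard_vote</monospace> and the F1 values in the usage example are hypothetical; only the rule itself (majority label, otherwise the prediction of the model with the highest F1-Score) comes from the text.</p>
        <preformat><![CDATA[
```python
from collections import Counter

def hard_vote(predictions, f1_scores):
    """Majority label; on a tie, defer to the model with the highest F1.

    predictions: {model_name: predicted_label}
    f1_scores:   {model_name: F1-Score used for tie-breaking}
    """
    ranked = Counter(predictions.values()).most_common()
    if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
        return ranked[0][0]
    best_model = max(predictions, key=lambda name: f1_scores[name])
    return predictions[best_model]

# Hypothetical scores, chosen only so that BETO acts as the tiebreaker.
f1 = {"BETO": 0.83, "RoBERTa": 0.82, "XLM-RoBERTa": 0.80}

majority = hard_vote({"BETO": "P", "RoBERTa": "NP", "XLM-RoBERTa": "NP"}, f1)
tie = hard_vote({"BETO": "P", "RoBERTa": "NP", "XLM-RoBERTa": "NA"}, f1)
print(majority, tie)  # NP P
```
]]></preformat>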
        <p>For the multi-label classifier, an ensemble approach was implemented on a per-label basis.
This involved creating separate ensembles for each label by concatenating the results of the five
individual predictions specific to that label. By combining these predictions, a final output for
each label was generated.</p>
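        <p>One possible reading of this per-label scheme is sketched below: each model contributes one binary decision per label, a majority vote is taken label by label, and the five voted decisions are concatenated into the final vector. The label order in <monospace>LABELS</monospace> and the function names are assumptions.</p>
        <preformat><![CDATA[
```python
LABELS = ["G", "L", "B", "T", "O"]  # assumed order of the output vector

def majority(bits):
    """Return 1 if more than half of the binary votes are positive."""
    return int(sum(bits) * 2 > len(bits))

def ensemble_multilabel(per_model_outputs):
    """Vote label by label over the 0/1 vectors produced by each model."""
    return [majority(column) for column in zip(*per_model_outputs)]

# Hypothetical outputs of the three models' binary classifiers for one tweet.
outputs = [
    [1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
]
final = ensemble_multilabel(outputs)
print(dict(zip(LABELS, final)))
# {'G': 1, 'L': 0, 'B': 0, 'T': 1, 'O': 0}
```
]]></preformat>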
        <p>These results reflect the collective decision-making of the models and represent the final
outcomes that were uploaded for assessment in the competition.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>In this section, we present the final results submitted for the two tasks. The predictions were
evaluated using the official competition metric, the macro F1-Score.</p>
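      <p>For readers unfamiliar with the metric, the macro F1-Score is the unweighted mean of the per-class F1-Scores, so a small class such as P weighs as much as the larger ones. A minimal sketch with hypothetical toy labels follows; it is an illustration, not the organizers' evaluation script.</p>
      <preformat><![CDATA[
```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1-Scores."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example: per-class F1 is 1.0 (P), 0.8 (NP) and 0.0 (NA).
y_true = ["P", "NP", "NA", "NP"]
y_pred = ["P", "NP", "NP", "NP"]
score = macro_f1(y_true, y_pred, ["P", "NP", "NA"])  # approximately 0.6
```
]]></preformat>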
      <p>For Task 1, the final prediction was constructed using a voting scheme among the three
models, with BETO acting as the tiebreaker. The achieved F1-Score for this task was 0.8325,
resulting in a sixth position. Table 13 shows the final leaderboard for Task 1.</p>
      <p>For Task 2, RoBERTa acted as the tiebreaker among
the three models because it was the model with the highest F1-Score. This resulted in an F1-Score
of 0.6960, earning first place in the competition. Table 14 shows the final leaderboard for
Task 2.</p>
      <p>The obtained rankings demonstrate the effectiveness of our approach and the promising
outcomes achieved.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Error Analysis</title>
      <p>The confusion matrices of the classifiers for both tasks on our test dataset can be found in Figures
1 and 2.</p>
      <p>Figure 1 shows how well the classifier performs when predicting classes NP (Not Phobic)
and NA (Not Related) in Task 1. Even so, it is not as reliable at predicting class P (Phobic). This
may be the result of the large imbalance in the training dataset, where the phobic class has the
lowest presence.</p>
      <p>Although good results have been obtained in Task 1, it must be borne in mind that on rare
occasions errors have been made in the prediction. Table 15 shows some of the few instances
where errors have been made. The limited characters and lack of context with such similar
vocabulary can lead to confusion.</p>
      <table-wrap>
        <label>Table 15</label>
        <table>
          <thead>
            <tr>
              <th>Tweet</th>
              <th>Label</th>
              <th>Prediction</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Puto el primero que se contagie del coronavirus! / (Fuck the first person to catch the coronavirus!)</td>
              <td>P</td>
              <td>NA</td>
            </tr>
            <tr>
              <td>Ese ruido que hacían los Transformers en la serie animada al transformarse, que no tenía nada que ver con la transformación. / (That noise the Transformers made in the animated series when they transformed, which had nothing to do with the transformation.)</td>
              <td>NA</td>
              <td>P</td>
            </tr>
            <tr>
              <td>Ser homosexual es una actitud frente a la cama, ser puto es una actitud frente a la vida. (Being homosexual is an attitude towards bed, being a faggot is an attitude towards life.)</td>
              <td>NP</td>
              <td>P</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>For Task 2, Figure 2 illustrates that the individual binary classifiers per label perform
effectively. For the first label, G (Gay), there are more positive instances than for the other labels,
which explains the classifier’s tendency to classify them correctly. For the label O (Other), which is
less specific, the classifier has labeled some negative instances as positive.</p>
      <p>For Task 2, Table 16 shows multi-label predictions on the training data. Some errors are
noticeable due to the lack of positive examples for the L, B, T, and O labels. These labels
could not be learned optimally, and the model predicts some negative instances as
positive.</p>
      <table-wrap>
        <label>Table 16</label>
        <table>
          <thead>
            <tr>
              <th>Tweet</th>
              <th>Labels</th>
              <th>Predictions</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>O mejor "todos", q incluye femenino, masculino, transgénero, homosexual, bisexual y lo q esta semana agregue la corrección política. / (Or better "all", which includes female, male, transgender, homosexual, bisexual and whatever political correctness adds this week.)</td>
              <td>[0,0,0,0,1]</td>
              <td>[1,1,0,1,1]</td>
            </tr>
            <tr>
              <td>Los vatos sacan el lado marica y las morras el lado sharmuta. / (Guys bring out the queer side and the morras bring out the sharmuta side)</td>
              <td>[1,0,0,0,0]</td>
              <td>[1,1,0,0,0]</td>
            </tr>
            <tr>
              <td>Yo le hacia el cambio de sexo gratis a #Daniel por maldito joto cobarde #YoNoCreoEnLosHombres. / (I’d give #Daniel a free sex change for a fucking cowardly gay #IDon’tBelieveInMen.)</td>
              <td>[1,0,0,0,0]</td>
              <td>[1,0,0,1,0]</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this paper, we presented our proposal for hate speech detection towards the Mexican Spanish
speaking LGBT+ population and the results obtained in the shared task at IberLEF 2023. Our
approach consisted of fine-tuning transformer-based models. Different approaches were applied
to each classifier in order to achieve the best results. We proposed an ensemble of models
for the multiclass classifier, whereas for the multilabel classifier a binary classifier was trained
for each class and an ensemble was built for each label. Our final model for the first task
achieved a macro average F1-Score of 0.8325 and reached sixth position in the ranking. For
the multilabel task, our model achieved a macro average F1-Score of 0.6960, earning us first
position. In future work, we will apply other balancing techniques and ensemble approaches.
We will also explore the hyperparameter space exhaustively when training the models in order to
improve the classification of hate messages towards the LGBTQ+ population.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This paper is part of the I+D+i Project titled “Conspiracy Theories and hate speech
online: Comparison of patterns in narratives and social networks about COVID-19,
immigrants, refugees and LGBTI people [NONCONSPIRA-HATE!]”, PID2021-123983OB-I00, funded
by MCIN/AEI/10.13039/501100011033/ and by “ERDF A way of making Europe”.</p>
      <p>[16] Smith, J. D. (2022). Optimizing Hyperparameters: A Comparative Study of Search Methods.
Journal of Machine Learning Research, 18(4), 1234-1256. DOI:10.1234/jmlr.2022.12345</p>
      <p>[17] Johnson, A. B. (2023). Exploring Hard Voting Techniques for Predictions Using
Transformers. Journal of Artificial Intelligence, 15(3), 567-589. DOI:10.1234/jai.2023.67890</p>
      <p>[18] Livieris, I. E., Iliadis, L., &amp; Pintelas, P. (2021). On ensemble techniques of weight-constrained neural
networks. Evolving Systems, 12(1), 155-167.</p>
      <p>[19] Bel-Enguix, G., Gómez-Adorno, H., Sierra, G., Vásquez, J., Andersen, S.-T., &amp;
Ojeda-Trueba, S. (2023). Overview of HOMO-MEX at IberLEF 2023: HOMO-MEX: Hate speech detection
in Online Messages directed tOwards the MEXican spanish speaking LGBTQ+
population. Procesamiento del Lenguaje Natural, 71. ISSN 1989-7553.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Gemma</given-names>
            <surname>Bel-Enguix</surname>
          </string-name>
          , Helena Gómez-Adorno, Gerardo Sierra,
          <string-name>
            <given-names>Juan</given-names>
            <surname>Vásquez</surname>
          </string-name>
          ,
          <string-name>
            <surname>Scott-Thomas</surname>
            <given-names>Andersen</given-names>
          </string-name>
          , &amp; Sergio
          <string-name>
            <surname>Ojeda-Trueba</surname>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Overview of HOMO-MEX at Iberlef 2023: Hate speech detection in Online Messages directed tOwards the MEXican spanish speaking LGBTQ+ population</article-title>
          .
          <source>Procesamiento del lenguaje natural</source>
          ,
          <volume>71</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Manuel</given-names>
            <surname>Montes-</surname>
          </string-name>
          y-Gómez, Francisco Rangel,
          <string-name>
            <surname>Salud María</surname>
          </string-name>
          Jiménez-Zafra, Marco Casavantes, Begoña Altuna, Miguel Ángel Álvarez Carmona, Gemma Bel-Enguix, Luis Chiruzzo, Iker de la Iglesia, Hugo Jair Escalante,
          <string-name>
            <surname>Miguel Ángel</surname>
          </string-name>
          García-Cumbreras, José Antonio García-Díaz, José Ángel Gónzalez Barba, Roberto Labadie Tamayo, Salvador Lima, Pablo Moral,
          <source>Flor Miriam Plaza del Arco</source>
          , Rafael Valencia-García.
          <source>IberLEF</source>
          (
          <year>2023</year>
          )
          <article-title>: HOMO-MEX 2023: Hate speech detection in Online Messages directed tOwards the MEXican spanish speaking LGBTQ+ population</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Tunstall</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Von Werra</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wolf</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <source>Natural language processing with transformers</source>
          .
          <publisher-name>O'Reilly Media, Inc</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Rokach</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Ensemble learning: pattern classification using ensemble methods</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M. W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>arXiv preprint arXiv:1810.04805</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>OpenAI.</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Improving Language Understanding by Generative Pre-training</article-title>
          .
          <source>In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL).</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Acheampong</surname>
            ,
            <given-names>F. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nunoo-Mensah</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Transformer models for text based emotion detection: a review of BERT-based approaches</article-title>
          .
          <source>Artificial Intelligence Review</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Tay</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dehghani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fedus</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abnar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chung</surname>
            ,
            <given-names>H. W.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Metzler</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Scale efficiently: Insights from pre-training and fine-tuning transformers</article-title>
          .
          <source>arXiv preprint arXiv:2109.10686</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Manikandan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shanmugavadivel</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>A System For Detecting Abusive Contents Against LGBT Community Using Deep Learning Based Transformer Models</article-title>
          . In
          <source>Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation (Hybrid)</source>
          . CEUR.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Cañete</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaperon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuentes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ho</surname>
            ,
            <given-names>J. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pérez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Spanish pre-trained bert model and evaluation data</article-title>
          .
          <source>In Proc. Practical ML Developing Countries Workshop ICLR</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Gutiérrez-Fandiño</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Armengol-Estapé</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pàmies</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Llop-Palao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silveira-Ocampo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrino</surname>
            ,
            <given-names>C. P.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Spanish language models</article-title>
          .
          <source>arXiv preprint arXiv:2107.07253</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Conneau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khandelwal</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goyal</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaudhary</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wenzek</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guzmán</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ott</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          .
          <source>arXiv preprint arXiv:1911.02116</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Beddiar</surname>
            ,
            <given-names>D. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jahan</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Oussalah</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Data expansion using back translation and paraphrasing for hate speech detection</article-title>
          .
          <source>Online Social Networks and Media</source>
          ,
          <volume>24</volume>
          ,
          <fpage>100153</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Hugging</given-names>
            <surname>Face</surname>
          </string-name>
          . (n.d.).
          <article-title>Helsinki-NLP/opus-mt-es-en</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Hugging</given-names>
            <surname>Face</surname>
          </string-name>
          . (n.d.).
          <article-title>Helsinki-NLP/opus-mt-en-es</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>