<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CogniCIC at HOMO-LAT 2025: Fine-tuning and Semantic Similarity Approaches for Polarity Detection in Spanish Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cesar Macias</string-name>
          <email>cmaciass2021@cic.ipn.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel Soto</string-name>
          <email>msotoh2021@cic.ipn.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tania Alcántara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Omar García</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>José Eduardo Valdez-Rodríguez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edgardo Felipe-Riveron</string-name>
          <email>edgardo@cic.ipn.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN)</institution>
          ,
          <country country="MX">México</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Hate speech against the LGBTQ+ community in digital environments represents a growing challenge that requires automatic solutions that are able to adapt to the dialectal and cultural diversity of Spanish. This paper presents two methodologies based on deep learning models for polarity classification of messages on Reddit targeting the LGBTQ+ community. The first methodology (Track 1) addresses a classification of polarity (positive, negative or neutral) using a shared corpus of tweets and posts on Reddit generated in Mexico, Colombia, Chile and Argentina. The second methodology (Track 2) focuses on a cross-dialect classification, training on dialects from the centre (Mexico, Colombia, Chile, Argentina) and evaluating on dialects from the rest of Latin America (Bolivia, Costa Rica, Cuba, Dominican Republic, Ecuador, El Salvador, Guatemala, Honduras, Nicaragua, Panama, Paraguay, Peru, Puerto Rico, Uruguay and Venezuela). Our best results reach an F1-score of 0.4350 and 0.4388 for Tracks 1 and 2, respectively, evidencing the complexity of dialectal variants and the need to incorporate adapted preprocessing, more data and additional techniques to improve nuance detection.</p>
      </abstract>
      <kwd-group>
        <kwd>hate speech</kwd>
        <kwd>LGBTQ+ community</kwd>
        <kwd>multi-dialect classification</kwd>
        <kwd>dialect cross-classification</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        for LGBT+phobia detection. For 2024, it extended the challenge to more complex domains, such as song
lyrics. These efforts have shown that, while Spanish transformers (BETO [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] or mBERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) achieve
F1-macro scores above 0.88 in binary detection, challenges remain in capturing nuances of implicit
hatred and adapting to particular dialectal variants and cultural contexts.
      </p>
      <p>
        While existing competitions have already laid a solid foundation, the linguistic and sociocultural
diversity of Spanish-speaking countries demands specific resources that take into account regional
variations of Spanish. HOMO-LAT 2025 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a shared task within the evaluation campaign
IberLEF 2025 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This task aims to extend the analysis to the Spanish of several Latin American
countries on platforms such as Reddit, focusing on the polarity of LGBTQ+ mentions (positive, negative
or neutral) and addressing the challenges associated with local contexts. In this article, we propose the
design and implementation of two models for the automatic classification of hate speech against the
LGBTQ+ community, both based on deep learning techniques and preprocessing adapted to Spanish
characteristics, with a view to proposing a model capable of strengthening online moderation and
inclusion.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art</title>
      <p>
        In recent years, the automatic detection of anti-LGBTQ+ hate speech has evolved significantly, driven
by the availability of new annotated resources and the refinement of deep language models. Earlier in
the decade, research focused on the creation of multilingual corpora and descriptive analysis of hate
phenomena in different linguistic contexts. In 2021, Chakravarthi et al. introduced a pioneering corpus
of YouTube comments in English and Dravidian languages (Tamil, Malayalam, and Tamil-English mixed
text) annotated for homophobia and transphobia, serving as the basis for the first LT-EDI competition in
2022 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In the same year, Plaza-del Arco et al. comprehensively evaluated various pre-trained models
(BETO [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], mBERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], XLM-R [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) and traditional architectures (CNN [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], LSTM [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]) on a corpus of
Spanish tweets to detect hate speech, including homophobic insults, demonstrating the superiority of
Spanish monolingual models over multilingual and traditional approaches [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Similarly, Hudhayri,
in 2021, documented, from a qualitative perspective, forms of linguistic harassment against LGBTQ+
people on Arabic Twitter (now X), identifying verbal and visual patterns of homophobic aggression
that highlight the need for Arabic-specific NLP tools [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        During 2022, shared-task competitions were consolidated with the second LT-EDI Workshop.
Chakravarthi et al. described the first edition of the shared task at LT-EDI 2022, where around
50 teams competed in detecting homophobia and transphobia in social media comments in English,
Tamil, and code-mixed Tamil-English. The results showed that multilingual transformer-based
models outperformed traditional classifiers by a wide margin, achieving F1-macro scores close to 0.80 in the English
and Tamil versions [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In the same year, Karayıgıt et al. explored the Turkish language, building the
HATC corpus to detect homophobic and generic hate speech on Instagram. By comparing mBERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
with neural network architectures and classical classifiers (SVM [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Random Forest [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], GRU [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]),
they showed that the transformer-based architecture (mBERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) achieved an average F1 of 0.90,
accurately capturing the subtle homophobic expressions in Turkish [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        In 2023, attention shifted to Mexican Spanish resources and the refinement of shared tasks. Vázquez
et al. presented HOMO-MEX, the first annotated corpus of Mexican Spanish tweets for LGBT+phobia
detection, with approximately 7,000 tweets tagged in binary and multi-label categories, serving as
an initial benchmark where, again, BETO [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] demonstrated outstanding performance, with an F1 of 0.74
in binary classification and 0.85 in fine-grained classification, evidencing the complexity of
homophobic language [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In parallel, Bel-Enguix et al. described the 2023 edition of the shared
task HOMO-MEX in IberLEF, where with the use of similar models, participants managed to obtain
superior results, emphasizing the importance of careful preprocessing in Spanish [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In the same
year, Chakravarthi et al. conducted the second edition of LT-EDI 2023, expanding the languages to
five (English, Spanish, Tamil, Hindi and Malayalam) and adding a sub-task with seven hate classes.
Transformer-based models again outperformed classical classifiers, reaching
F1 scores of 0.90 in English and 0.85 in Spanish and Tamil, although performance was lower in Hindi and
Malayalam due to the paucity of data [19]. Kumaresan et al. complemented these efforts by presenting a
high-quality corpus for detecting homophobia and transphobia in Hindi and Malayalam in 2023, where
they trained a model with both languages achieving an F1-macro close to 0.80 in Hindi and close to
0.75 in Malayalam, pointing out the benefit of sharing information between related languages [20].
      </p>
      <p>
        In 2024, the challenges diversified into specialised domains and socio-temporal analyses. For the
2024 edition of HOMO-MEX at IberLEF, Gómez-Adorno et al. introduced three sub-tasks: general hate
detection in tweets, fine-grained classification by type of phobia and detection of homophobic content
in song lyrics. The winning proposals achieved a Macro-F1 close to 0.91 on tweets and up to 0.97 on fine-grained
classification, but the song lyrics task proved more difficult, reaching an F1 of 0.58, demonstrating the
need for domain-specific models and additional data [21]. Similarly, Chakravarthi et al. presented the
third edition of LT-EDI at EACL 2024, extending the competition to ten languages (including
Indian languages such as Telugu, Kannada, Gujarati, Marathi and Tulu). Systems based on XLM-R [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
and specific monolingual models achieved F1s between 0.80 and 0.95, demonstrating that joint learning
of multilingual data benefits under-resourced languages, although challenges remain in capturing
cultural nuances [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Complementarily, Andersen et al. analysed the sociolinguistic evolution of LGBT+
related terms on Mexican Twitter between 2011 and 2021, using context vectors and sentiment analysis
to demonstrate changes in the semantic polarity of slurs and neutral terms, highlighting that detection
models require constant temporal adjustments to adapt to language evolution [22]. Also in 2024,
McGiff and Nikolov explored the difference in performance between classical and BERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] models
for detecting homophobia on Twitter (now X), highlighting the relevance of capturing context for
identifying implicit homophobia [23].
      </p>
      <p>
        Finally, in 2025, Leoni Santos et al. applied deep learning and explainability techniques to a specific
domain: tweets related to the European Football Championship, where homophobia is prevalent.
They introduced H-DICT, a specialized dictionary for filtering and annotating homophobic tweets,
and compared five variants of BERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where RoBERTa Offensive [24] obtained the best result with
an F1 close to 0.89, and using Integrated Gradients they identified the words guiding the detection,
concluding that adapting models to the domain and using specialized lexicons substantially improves
the detection of anti-LGBT+ hate in sporting contexts [25]. The HOMO-LAT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] task was also announced
at IberLEF 2025, extending the analysis to Reddit throughout Latin America, focusing on the polarity of
mentions of LGBTQ+ terms (positive, negative or neutral) and addressing dialectal variants and local
contexts, reflecting the constant evolution and expansion of research in this field.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>A brief description of the provided datasets for the competition is given in this section; furthermore, a
detailed description of the method selected to resolve the problem is provided.</p>
      <sec id="sec-3-1">
        <title>3.1. Tracks description</title>
        <sec id="sec-3-1-1">
          <title>3.1.1. Track 1: Multi-dialect polarity detection track (multi-class)</title>
          <p>For this track, the organizers proposed a dataset composed of Reddit posts that contained LGBTQ+
keywords. The keywords were defined by the organizers in a lexicon format; some of those keywords
are: {trans, LGBT, queer, ...}. The objective of this task was to indicate the polarity {positive, negative,
neutral} of the posts towards the keyword. The posts collected in this dataset were written in Spanish
dialects from Mexico, Colombia, Chile, and Argentina. Training and testing datasets contained posts
from every dialect; a detailed description of the datasets is presented in Section 3.2.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Track 2: Cross-dialect polarity detection (multi-labeled)</title>
          <p>The training dataset for this track is the same as in Track 1. The difference from Track 1 is that the
Spanish dialects contained in the testing corpus are not the same as in the training set. The posts
contained in the testing set are written in Spanish dialects from Bolivia, Costa Rica, Cuba, the Dominican
Republic, Ecuador, El Salvador, Guatemala, Honduras, Nicaragua, Panama, Paraguay, Peru, Puerto Rico,
Uruguay, and Venezuela. The goal of this track was to predict the polarity of the post, as in Track 1. A
more detailed description of the training and testing sets is presented in Section 3.2.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Corpus description</title>
        <p>The corpus for this shared task is composed of three subsets: Train (shared by both tracks), Track 1
Test and Track 2 Test. The distribution of the posts by country in the Train and Track 1 Test subsets is
shown in Figure 1a. The distribution of the posts by country in the Track 2 Test subset is shown in
Figure 1b. Finally, the label distribution by country of the Train subset is shown in Figure 2.</p>
        <p>Figure 1 panels: (a) Train and Track 1 Test subsets; (b) Track 2 Test subset.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Model Specifications</title>
        <p>3.3.1. Model one</p>
        <p>
          Our approach for this shared task was to fine-tune a BERT [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] model pretrained on a large Spanish
corpus, better known as BETO [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Because a single training corpus is shared by both tracks, only one model was trained.
It was trained for 25 epochs with a batch size of 64, a learning rate of 5 × 10<sup>−6</sup>
and an epsilon value of 1 × 10<sup>−8</sup>. The validation split comprised 20% of the training set,
and the random seeds of the environment were set to 42. The checkpoint with the best F1-score on the
validation split was selected and used to make
predictions over the testing sets of Track 1 and Track 2.
        </p>
        <p>3.3.2. Model two</p>
        <p>
          For Tracks 1 and 2, we used the BETO model [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], a Spanish pre-trained variant of BERT. The model
was fine-tuned using the training dataset provided. Afterward, the development dataset was split into
two equal parts: the first was used to monitor the fine-tuning process and select the best-performing
weights, while the second was used to evaluate the classification system.
        </p>
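        <p>The two data splits described in this subsection (a 20% validation split with seed 42 for Model one, and two equal halves of the development set for Model two) can be illustrated with a minimal sketch. The function name split_indices, the corpus sizes and the use of a plain random shuffle are assumptions made for illustration; the source does not state whether the splits were stratified, nor which seed was used for Model two.</p>
        <preformat>
```python
import random


def split_indices(n_items, holdout_fraction, seed=42):
    """Shuffle item indices with a fixed seed and split off a hold-out part."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_items * holdout_fraction)
    # First element: remaining indices; second element: hold-out indices.
    return indices[n_holdout:], indices[:n_holdout]


# Model one: 80% train / 20% validation over a hypothetical 1000-post corpus.
train_idx, val_idx = split_indices(1000, holdout_fraction=0.20)

# Model two: a hypothetical 200-post development set cut into two equal halves
# (one half for checkpoint selection, the other for evaluation).
select_idx, eval_idx = split_indices(200, holdout_fraction=0.50)

print(len(train_idx), len(val_idx), len(select_idx), len(eval_idx))
```
        </preformat>
        <p>Fixing the seed makes the partition reproducible, so repeated runs compare checkpoints on the same validation posts.</p>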
        <p>Once the model was fine-tuned, the semantic representations (embeddings) for each text were generated.
Instead of using the vector associated with the [CLS] token, a mean pooling strategy was applied over the
token representations produced by the last attention layers. This technique is often reported to be more robust for sentence-level classification
tasks, as it tends to capture the overall semantic meaning of the text better.</p>
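        <p>As a rough illustration of mean pooling, the sketch below averages the token vectors of one sentence while skipping padding positions. It uses toy two-dimensional vectors in place of real BETO hidden states and pools a single layer; the function name mean_pool and the data are hypothetical, and the exact layers pooled by the authors are not specified in the source.</p>
        <preformat>
```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    n_real = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            n_real += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if n_real == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / n_real for total in totals]


# Toy hidden states: three real tokens ([CLS], "hola", [SEP]) and one padding slot.
hidden_states = [
    [1.0, 0.0],
    [3.0, 2.0],
    [2.0, 4.0],
    [9.0, 9.0],  # padding position, must not affect the result
]
mask = [1, 1, 1, 0]

sentence_vector = mean_pool(hidden_states, mask)  # uses every real token
cls_vector = hidden_states[0]                     # [CLS]-only alternative
print(sentence_vector, cls_vector)
```
        </preformat>
        <p>Unlike the [CLS] vector, the pooled vector depends on every non-padding token, which is the property the paragraph above appeals to.</p>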
        <p>The generated vectors were stored in a vector database using OpenSearch [26]. For classification, the
test dataset was processed in the same way, and its embeddings were used to perform a semantic
similarity search using the k-Nearest Neighbors (kNN) algorithm, with cosine similarity as the
similarity metric. In addition to semantic search, a filtering step was applied based on country and
keyword (LGBT+ segment). This allowed the system to restrict the search to contextually similar
vectors, considering that idiomatic expressions and community references can vary significantly between
Spanish-speaking countries.</p>
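        <p>The retrieval step can be sketched in memory, without OpenSearch, as a filtered nearest-neighbour vote. Everything below is illustrative: the toy vectors, the dictionary layout, the value of k and the majority vote are assumptions, since the source does not state how neighbours were aggregated. The fallback to the whole index when the filter matches nothing is also an assumption; some such rule seems necessary in Track 2, where test countries do not occur in the training data.</p>
        <preformat>
```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, index, k=3):
    """Vote among the k most similar indexed posts that pass the filter."""
    candidates = [
        doc for doc in index
        if doc["country"] == query["country"] and doc["keyword"] == query["keyword"]
    ]
    if not candidates:
        candidates = index  # assumed fallback: search the whole index
    ranked = sorted(
        candidates,
        key=lambda doc: cosine(query["vector"], doc["vector"]),
        reverse=True,
    )
    votes = Counter(doc["label"] for doc in ranked[:k])
    return votes.most_common(1)[0][0]


index = [
    {"vector": [1.0, 0.1], "country": "MX", "keyword": "trans", "label": "positive"},
    {"vector": [0.9, 0.2], "country": "MX", "keyword": "trans", "label": "positive"},
    {"vector": [0.1, 1.0], "country": "MX", "keyword": "trans", "label": "negative"},
    {"vector": [1.0, 0.0], "country": "CL", "keyword": "trans", "label": "neutral"},
]
query = {"vector": [1.0, 0.0], "country": "MX", "keyword": "trans"}
prediction = knn_predict(query, index, k=3)
print(prediction)
```
        </preformat>
        <p>With the country filter active, the Chilean post is ignored even though it is the closest vector, which shows how filtering trades raw similarity for dialectal context.</p>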
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments and results</title>
      <p>Here we present the final ranking results obtained for the predictions made with the fine-tuned BETO
model. The checkpoint that performed best on the validation split, with an F1-score of 0.5381,
was used to make the final predictions. The results
obtained with this checkpoint for Tracks 1 and 2 are shown in Tables 1 and 2, respectively.</p>
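      <p>Selecting the best checkpoint by validation F1 can be sketched as follows. The source reports an "F1-score" without naming the averaging scheme, so the macro average used here is an assumption, as are the function name macro_f1 and the toy predictions.</p>
      <preformat>
```python
def macro_f1(gold, predicted, labels=("positive", "negative", "neutral")):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy validation labels and the predictions of two hypothetical epochs.
gold = ["positive", "negative", "neutral", "neutral"]
epoch_predictions = {
    1: ["neutral", "neutral", "neutral", "neutral"],
    2: ["positive", "neutral", "neutral", "neutral"],
}

val_scores = {epoch: macro_f1(gold, preds) for epoch, preds in epoch_predictions.items()}
best_epoch = max(val_scores, key=val_scores.get)
print(best_epoch, val_scores[best_epoch])
```
      </preformat>
      <p>Because every class contributes equally to the macro average, a checkpoint that always predicts the majority class scores poorly even when its accuracy looks acceptable.</p>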
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>The results obtained in both tracks suggest that, although the fine-tuned BETO model achieves
competitive validation scores, its generalisation to the test sets reveals a significant gap with respect to the best-ranked
systems (Track 1: 0.5296 vs. 0.4360 reported and Track 2: 0.5086 vs. 0.4388 obtained). This gap
could be explained mainly by the difference between training and test data. In Track 1, the test
dialects (Mexico, Colombia, Chile and Argentina) were part of the training corpus, which favoured the
adaptation of our models to the polarity of the messages. On the other hand, in Track 2, when testing
on unseen dialects (e.g. Bolivia, Cuba, Ecuador or Venezuela), performance decreased by more than 10
percentage points in F1. This tells us that the contextual representation of our models loses robustness
to idiomatic expressions and localisms not covered in training.</p>
      <p>The dialectal disparity highlights the importance of adapted pre-processing and strategies that
mitigate linguistic variability. In this respect, the second methodology, based on mean pooling of BETO embeddings
and kNN classification with country and keyword filtering, could help to retrieve semantically close
examples despite lexical differences. However, its effectiveness depends directly on the diversity and
quality of the reference corpus, so without a large repository of local expressions, it risks generating
false positives or negatives. Furthermore, the variety in the distribution of labels by country and the
presence of spelling variants and colloquialisms on Reddit (abbreviations, emojis, misspellings, irony)
call for normalisation techniques (e.g., spelling correction and hashtag-sensitive tokenization)
and perhaps data augmentation to improve the model’s ability to capture specific nuances
and tones of hate speech.</p>
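      <p>One possible shape for such a normalisation step is sketched below. It is not the preprocessing used in our submission: the rules (URL masking, splitting camel-case hashtags, collapsing elongated letters) and all names are illustrative assumptions about what lightweight normalisation for Reddit text could look like.</p>
      <preformat>
```python
import re

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
# Three or more repetitions of the same letter (digits are left untouched).
REPEAT_RE = re.compile(r"([^\W\d_])\1{2,}")


def split_hashtag(match):
    """Turn '#OrgulloGay' into 'Orgullo Gay' by splitting at case changes."""
    return re.sub(r"([a-záéíóúñ])([A-ZÁÉÍÓÚÑ])", r"\1 \2", match.group(1))


def normalise(text):
    text = URL_RE.sub("URL", text)              # mask links with a placeholder
    text = HASHTAG_RE.sub(split_hashtag, text)  # hashtag-sensitive splitting
    text = REPEAT_RE.sub(r"\1\1", text)         # "holaaaa" -> "holaa"
    return " ".join(text.split())               # collapse whitespace


example = normalise("Felizzzz   #OrgulloGay https://example.org/x")
print(example)
```
      </preformat>
      <p>Rules of this kind reduce surface variation before tokenization; whether they help polarity detection in this task would need to be verified experimentally.</p>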
      <p>For future research, it is essential to explore class balancing methods (oversampling of minority
instances or weighted loss) and to expand BETO’s vocabulary to include
idioms and localisms. Furthermore, the integration of explainability modules (e.g., integrated gradients
or LIME) would facilitate the identification of biases in polarity prediction and guide the construction
of specialised lexicons, as demonstrated by the use of H-DICT in sports contexts. Taken together, these
implementations would aim to strengthen the automatic detection of hate speech against the LGBTQ+
community in Latin America and the interpretation of pragmatic nuances in short social media texts.</p>
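      <p>The weighted-loss idea mentioned above can be sketched with inverse-frequency class weights, following the common "balanced" heuristic n_samples / (n_classes × class_count). The label counts, function names and probabilities below are hypothetical and do not reflect the actual corpus distribution.</p>
      <preformat>
```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def weighted_cross_entropy(probabilities, gold_label, weights):
    """Cross-entropy of one example, scaled by the weight of its gold class."""
    return -weights[gold_label] * math.log(probabilities[gold_label])


# Hypothetical imbalanced training labels.
train_labels = ["neutral"] * 6 + ["negative"] * 3 + ["positive"] * 1
weights = inverse_frequency_weights(train_labels)

# The same predicted probability costs more when the gold class is rare.
probs = {"neutral": 0.5, "negative": 0.25, "positive": 0.25}
loss_rare = weighted_cross_entropy({"positive": 0.5}, "positive", weights)
loss_common = weighted_cross_entropy(probs, "neutral", weights)
print(weights, loss_rare, loss_common)
```
      </preformat>
      <p>Errors on minority classes are penalised more heavily, which is one way to counter a skewed label distribution without duplicating training posts.</p>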
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors wish to thank the support of the Instituto Politécnico Nacional (COFAA, SIP-IPN, Research
grants SIP-20251352, SIP-), and the Mexican Government (SECIHTI, SNI).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Writefull’s model in order to: Grammar and
spelling check. Further, the authors used DeepL Translator in order to: Translate texts from Spanish to
English.</p>
      <p>the mexican spanish speaking lgbtq+ population, Procesamiento del Lenguaje Natural 71 (2023)
361–370.</p>
      <p>[19] B. R. Chakravarthi, R. Ponnusamy, S. Malliga, P. Buitelaar, M. Á. García-Cumbreras, S. M.
Jiménez-Zafra, J. A. García-Díaz, R. Valencia-García, N. Jindal, Overview of second shared task on
homophobia and transphobia detection in social media comments, in: Proceedings of the Third
Workshop on LT-EDI, 2023, pp. 1–10.</p>
      <p>[20] P. K. Kumaresan, R. Ponnusamy, R. Priyadharshini, P. Buitelaar, B. R. Chakravarthi, Homophobia
and transphobia detection for low-resourced languages in social media comments, Natural
Language Processing (Elsevier) Journal 5 (2023) 100041. doi:10.1016/j.nlp.2023.100041.</p>
      <p>[21] H. Gómez-Adorno, G. Bel-Enguix, G. Sierra, J. Vásquez, S. T. Andersen, S.-L. Ojeda-Trueba,
Overview of homo-mex at iberlef 2024: Hate speech detection towards the mexican spanish
speaking lgbt+ population, Procesamiento del Lenguaje Natural 73 (2024) 393–405.</p>
      <p>[22] S. T. Andersen, S.-L. Ojeda-Trueba, J. Vásquez, G. Bel-Enguix, The mexican gayze: A computational
analysis of the attitudes towards the lgbt+ population in mexico on social media across a decade,
in: Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH), Association for
Computational Linguistics, 2024, pp. 178–200.</p>
      <p>[23] J. McGiff, N. S. Nikolov, Bridging the gap in online hate speech detection: A comparative analysis
of bert and traditional models for homophobic content identification on twitter, arXiv preprint
arXiv:2405.09221 (2024).</p>
      <p>[24] F. Barbieri, J. Camacho-Collados, L. Neves, L. E. Anke, Tweeteval: Unified benchmark and
comparative evaluation for tweet classification, CoRR abs/2010.12421 (2020). URL:
https://arxiv.org/abs/2010.12421. arXiv:2010.12421.</p>
      <p>[25] G. Leoni Santos, V. Gaboardi dos Santos, C. Kearns, G. Sinclair, J. Black, M. Doidge, T. Fletcher,
D. Kilvington, K. Liston, P. T. Endo, T. Lynn, Detecting homophobic speech in soccer tweets
using large language models and explainable ai, in: Proc. of the 16th Int. Conf. on Advances in
Social Networks Analysis and Mining (ASONAM), Lecture Notes in Computer Science, vol. 15211,
Springer, 2024, pp. 489–504.</p>
      <p>[26] OpenSearch Project, Opensearch: Open source search and analytics suite, https://opensearch.org,
2021. Accessed: 2025-06-02.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <year>2019</year>
          . URL: https://arxiv.org/abs/1810.04805. arXiv:1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          , CoRR abs/1911.02116 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <article-title>Overview of the shared task on homophobia and transphobia detection in social media comments</article-title>
          ,
          <source>in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity, and Inclusion (LT-EDI)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Shashirekha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajiakodi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Á.</given-names>
            <surname>García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>García-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Valencia-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shetty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>García-Baena</surname>
          </string-name>
          ,
          <article-title>Overview of third shared task on homophobia and transphobia detection in social media comments</article-title>
          ,
          <source>in: Proceedings of the Fourth Workshop on LT-EDI (EACL)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cañete</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Chaperon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fuentes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <article-title>Spanish pre-trained BERT model and evaluation data</article-title>
          ,
          <source>in: PML4DC at ICLR 2020</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bel-Enguix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ojeda-Trueba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sierra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Barco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dunstan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Manrique</surname>
          </string-name>
          ,
          <article-title>Overview of HOMO-LAT at IberLEF 2025: Human-centric polarity detection in Online Messages Oriented to the Latin American-speaking LGBTQ+ populaTion</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>75</volume>
          (
          <year>2025</year>
          )
          -
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. Á.</given-names>
            <surname>González-Barba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <article-title>Overview of IberLEF 2025: Natural Language Processing Challenges for Spanish and other Iberian Languages</article-title>
          ,
          <source>in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2025), co-located with the 41st Conference of the Spanish Society for Natural Language Processing (SEPLN 2025)</source>
          , CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          , et al.,
          <article-title>Dataset for identification of homophobia and transphobia in multilingual YouTube comments</article-title>
          ,
          <source>arXiv preprint arXiv:2109.00227</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Convolutional Networks for Images, Speech and Time Series</article-title>
          , The MIT Press,
          <year>1995</year>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural Computation</source>
          <volume>9</volume>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza-del-Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Molina-González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          ,
          <article-title>Comparing pre-trained language models for Spanish hate speech detection</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>166</volume>
          (
          <year>2021</year>
          )
          114120. doi:10.1016/j.eswa.2020.114120
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hudhayri</surname>
          </string-name>
          ,
          <article-title>Linguistic harassment against Arab LGBTs on cyberspace</article-title>
          ,
          <source>International Journal of English Linguistics</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>58</fpage>
          -
          <lpage>75</lpage>
          . doi:10.5539/ijel.v11n4p58.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>Support-vector networks</article-title>
          ,
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <year>1995</year>
          )
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <article-title>Random decision forests</article-title>
          ,
          <source>in: Proceedings of 3rd international conference on document analysis and recognition</source>
          , volume
          <volume>1</volume>
          , IEEE,
          <year>1995</year>
          , pp.
          <fpage>278</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ç.</given-names>
            <surname>Gülçehre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Empirical evaluation of gated recurrent neural networks on sequence modeling</article-title>
          ,
          <source>CoRR abs/1412.3555</source>
          (
          <year>2014</year>
          ). URL: http://arxiv.org/abs/1412.3555. arXiv:1412.3555.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Karayiğit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Akdağlı</surname>
          </string-name>
          , İnan Aci,
          <article-title>Homophobic and hate speech detection using Multilingual-BERT model on Turkish social media</article-title>
          ,
          <source>Information Technology and Control</source>
          <volume>51</volume>
          (
          <year>2022</year>
          )
          <fpage>346</fpage>
          -
          <lpage>363</lpage>
          . doi:10.5755/j01.itc.51.2.29988.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Vásquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Andersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bel-Enguix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-L.</given-names>
            <surname>Ojeda-Trueba</surname>
          </string-name>
          ,
          <article-title>HOMO-MEX: A Mexican Spanish annotated corpus for LGBT+phobia detection on Twitter</article-title>
          ,
          <source>in: Proceedings of the 7th Workshop on Online Abuse and Harms (WOAH)</source>
          ,
          Association for Computational Linguistics
          ,
          <year>2023</year>
          , pp.
          <fpage>202</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bel-Enguix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gómez-Adorno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sierra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vásquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Andersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-L.</given-names>
            <surname>Ojeda-Trueba</surname>
          </string-name>
          ,
          <article-title>Overview of HOMO-MEX at IberLEF 2023: Hate speech detection in online messages directed towards</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>