<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>C. Díez-Fenoy);</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>DACKERS at MiSonGyny 2025: A Transformer Ensemble Approach for Misogyny Detection in Spanish</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Carlos Díez-Fenoy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ariel López-González</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jorge Daniel Valle-Díaz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science and Engineering Department, Universidad Carlos III de Madrid</institution>
          ,
          <addr-line>Madrid, 28911</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>This work presents a two-stage classification approach for detecting misogynistic content in Spanish song lyrics. In the first task, we addressed the binary classification of misogynistic vs. non-misogynistic lyrics. In the second, we further classified misogynistic content into four subcategories: Sexualization, Violence, Hate, and Not Related. We experimented with several transformer-based models, including BETO, RoBERTa, BERT, and LLaMA, as well as multiple ensemble configurations. In Task 1, the best performance was achieved by Transformer Ensemble 2, with an F1-Score of 0.828, followed by RoBERTa (0.812) and BERT (0.798). In Task 2, where the classification problem was more fine-grained, the best-performing model was LLaMA, with an F1-Score of 0.488, followed by Transformer Ensemble 2 (0.434) and BERT (0.415). These results highlight the robustness of transformer ensembles for detecting subtle forms of misogyny in song lyrics and demonstrate the challenges of fine-grained categorization in this domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Misogyny detection</kwd>
        <kwd>Spanish lyrics Classification</kwd>
        <kwd>Transformer Ensemble</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Misogynistic content in song lyrics has become a growing social concern, particularly in music genres
where degrading or violent references to women are normalized or even glamorized. In Spanish-speaking
contexts, this issue is of particular pertinence due to the global reach and influence of certain musical
styles, which are often consumed extensively by younger demographics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. From a societal perspective,
the normalization of misogyny through music has the potential to reinforce gender stereotypes and
contribute to broader patterns of symbolic violence and discrimination [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        From a computational perspective, the automatic detection of misogynistic speech in lyrics poses
several challenges [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. These include the presence of figurative language, slang, irony, and ambiguous
phrasing, all of which are common in creative domains such as music. Furthermore, there is a notable
scarcity of annotated datasets focused on song lyrics, especially in the Spanish language, and the
few available resources tend to be highly imbalanced, which hinders the development of robust and
generalisable models [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        In this study, we propose a two-stage classification pipeline for the detection of misogynistic content in
Spanish-language song lyrics [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the first stage, we perform binary classification to differentiate
between misogynistic and non-misogynistic content. In the second stage, the misogynistic
lyrics are further classified into four categories: Sexualization, Violence,
Hate, and Not Related [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In this study, we undertake a thorough evaluation of multiple pre-trained transformer models, namely
BERT, RoBERTa, and LLaMA, in addition to a classical Vector Space Model (VSM) approach. This
investigation involves a meticulous fine-tuning and evaluation process, ensuring a comprehensive
assessment of the models’ capabilities and limitations [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Furthermore, we have developed multiple
ensemble strategies that combine the predictions of individual transformer models to improve robustness.
      </p>
      <p>
        The experimental results demonstrate that transformer-based ensembles consistently outperform
individual models in terms of F1-Score, particularly in the context of binary classification tasks [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>
        The dataset under consideration consists of manually annotated Spanish-language song lyrics and
represents a novel, socially impactful resource in a low-resource language [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This study underscores
the promise of integrating conventional and deep learning methodologies, encompassing transformer
ensembles and established baselines such as VSM, to confront the intricate and culturally contingent
classification dilemmas prevalent in natural language processing [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref9">9, 10, 11, 12</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Task Description</title>
      <p>
        The MiSonGyny 2025 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] shared task focuses on detecting misogynistic content in Spanish song lyrics.
It is structured into two classification tasks of increasing complexity: binary detection of misogyny and
fine-grained classification of misogynistic speech types. Below, we summarize the objectives and label
definitions for each task, as well as the evaluation metrics used.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Task 1: Misogynistic Speech Detection</title>
        <p>The first task consists of a binary classification problem where the goal is to determine whether a
given phrase from a song lyric contains misogynistic speech.</p>
        <p>• Misogynist (M): The phrase expresses hate speech or contempt directed at women, or reinforces
harmful gender stereotypes that promote subordination, objectification, or marginalization of
women.
• Not Misogynist (NM): The phrase does not contain hate speech or contempt against women. It
may refer to women without reinforcing gender stereotypes or expressing negative attitudes.
Example:</p>
        <p>ID_Track1, "M" – this song is defined as misogynistic</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Task 2: Fine-grained Misogynistic Speech Detection</title>
        <p>The second task aims to classify the specific type of misogynistic content present in a phrase. This
is a multi-class classification task, where each misogynistic phrase is assigned to one of the following
categories:
• Sexualization (S): Phrases that describe or imply sexual acts, use sexual language, or make
sexual insinuations.
• Violence (V): Phrases that refer to physical or verbal aggression, threats, or violent behavior.
• Hate (H): Phrases that include offensive, hostile, or discriminatory language, targeting women
or groups through expressions of contempt or dehumanization.
• Not Related (NR): Phrases that do not belong to the above categories and contain no sexual,
violent, or hateful content.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Evaluation Metrics</title>
        <p>
          Both tasks are evaluated using macro-averaged metrics to address potential class imbalance and ensure
that each class contributes equally to the final scores, regardless of its frequency in the dataset [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>The following metrics are reported for each task:
• Macro F1-Score
• Macro Precision
• Macro Recall
Let $B(tp, fp, tn, fn)$ be any binary evaluation measure (e.g., precision, recall, or F1-score), where:
• $tp$: true positives
• $fp$: false positives
• $tn$: true negatives
• $fn$: false negatives</p>
        <p>For each class label $\lambda$ in a set of $q$ labels, the metric $B$ is computed using the binary evaluation for
that specific class (one-vs-rest). Then, the macro-averaged version of the metric is calculated as:

$$B_{\text{macro}} = \frac{1}{q} \sum_{\lambda=1}^{q} B(tp_{\lambda}, fp_{\lambda}, tn_{\lambda}, fn_{\lambda})$$</p>
        <p>This formulation ensures that all classes contribute equally, which is particularly important for
datasets where some classes may be underrepresented.</p>
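        <p>As an illustration of the macro-averaging described above, the following minimal Python sketch computes a macro F1-score by averaging one-vs-rest F1 values over all labels. The function names and the toy label sequences are hypothetical and are not taken from the shared-task evaluation script.</p>
        <preformat>
```python
from typing import List


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Binary F1 for one class; 0.0 when the class has no hits, misses, or false alarms."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Average the one-vs-rest F1 of every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Toy example with the four Task 2 labels (values are illustrative only).
y_true = ["NR", "NR", "S", "V", "H", "S"]
y_pred = ["NR", "S", "S", "V", "NR", "S"]
print(round(macro_f1(y_true, y_pred), 3))
```
        </preformat>
        <p>In this toy example the Hate class is never predicted correctly, so its F1 of zero lowers the macro average as much as an error on a frequent class would, which is the behaviour the macro formulation is meant to provide.</p>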
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Data Description</title>
        <p>
          The datasets provided for the MiSonGyny 2025 [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] shared task include annotated Spanish song lyrics,
separated by task. Each dataset consists of short textual segments (phrases or lines) labeled according
to the task definitions.
        </p>
        <p>For Task 1 (binary misogyny detection), the training set contains 2,104 instances, with a noticeable
class imbalance: 642 misogynistic (M) and 1,462 non-misogynistic (NM) examples.</p>
        <p>For Task 2 (fine-grained classification), the training set includes 1,168 instances labeled across four
categories. The distribution is as follows:
• Sexualization (S): 435 instances
• Violence (V): 129 instances
• Hate (H): 78 instances
• Not Related (NR): 526 instances</p>
        <p>These imbalances reinforce the need to evaluate models using macro-averaged metrics and highlight
the importance of robust classification methods, especially for the underrepresented categories.</p>
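        <p>The degree of imbalance in these counts can be summarized with an imbalance ratio, i.e., the majority-class count divided by the minority-class count. The short Python sketch below reproduces that arithmetic from the figures reported above; the helper name is hypothetical.</p>
        <preformat>
```python
def imbalance_ratio(majority: int, minority: int) -> float:
    """Ratio of the larger class count to the smaller one."""
    if minority == 0:
        raise ValueError("minority class count must be positive")
    return majority / minority


task1_ir = imbalance_ratio(1462, 642)  # NM vs. M in the Task 1 training set
task2_ir = imbalance_ratio(435, 78)    # S vs. H in the Task 2 training set
print(round(task1_ir, 2), round(task2_ir, 2))
```
        </preformat>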
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Models and Approaches</title>
      <p>For both subtasks, we explore a range of models suited for natural language processing (NLP) tasks.
These include transformer-based architectures as well as a traditional machine learning baseline:
• BERT [15, 16]: A widely used transformer model pre-trained on large corpora using a masked
language modeling objective. It provides strong baseline performance for sentence-level classification
tasks.
• RoBERTa [17, 16]: An optimized version of BERT trained with larger data and no next-sentence
prediction objective, offering improved results on various NLP benchmarks.
• BETO [18]: A Spanish version of BERT, pre-trained on a large Spanish corpus, making it more
suitable for tasks in this language.
• LLaMA [19]: A recent multilingual transformer-based model designed for efficiency and strong
generalization across languages. We use it to assess its ability to handle subtle linguistic
phenomena in Spanish lyrics.
• SVM (Support Vector Machine) [20, 21]: As a classical baseline, we include an SVM classifier
using TF-IDF representations. This provides a useful point of comparison for transformer-based
methods.
• Random Forest (RF) [22, 23, 24]: A tree-based ensemble method that builds multiple decision
trees and aggregates their predictions. It is included as a second traditional baseline due to its
robustness and interpretability in high-dimensional text classification tasks.</p>
      <p>These models allow us to compare traditional and modern approaches to the detection and
classification of misogynistic speech in Spanish song lyrics.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Task 1 Overview</title>
        <p>This section describes the methodology used for Task 1: binary classification of misogynistic vs.
non-misogynistic lyrics.</p>
        <p>The process for Task 1 (binary misogyny detection) consisted of the following main steps. Figure 1
illustrates the complete pipeline:
1. Initial data analysis: The original training set for Task 1 contained 2,104 examples, with a highly
imbalanced class distribution: 642 labeled as Misogynistic (M) and 1,462 as Not Misogynistic
(NM). The imbalance ratio (IR), defined as the majority-class count divided by the
minority-class count, is:</p>
        <p>
          $$\mathrm{IR} = \frac{\mathrm{NM}}{\mathrm{M}} = \frac{1462}{642} \approx 2.28 \quad (1)$$
2. Data balancing: To reduce bias in model training, we augmented the dataset using two external
resources with similar annotation schemes:
• Sexism in the Lyrics of the Most Listened to Songs in Spain, with 15,836 NM and 4,758 M
examples [
          <xref ref-type="bibr" rid="ref5">5, 25</xref>
          ].
• Corpus of Song Lyrics in Spanish Labeled for Gender-Based Violence Against Women, with 778
          NM and 222 M examples [
          <xref ref-type="bibr" rid="ref2">26, 2</xref>
          ].
        </p>
        <p>Using a sampling strategy, we balanced the final dataset to contain 6,000 NM and 5,622 M
instances.
3. Preprocessing and tokenization: All lyrics were tokenized using the default tokenizer of
each transformer model (BERT, RoBERTa, BETO, and LLaMA). No additional preprocessing (e.g.,
lemmatization or stopword removal) was applied, to preserve the original semantics and language
style of the lyrics.
4. Model training: Each model was trained independently using the same data split: 70% for
training, 20% for validation, and 10% for testing. We used 3 to 5 epochs per model depending on
the validation loss convergence. The following models were used:
• BERT
• RoBERTa
• BETO
• LLaMA
• Vector Space Model (VSM) as a classical baseline
5. Ensemble construction: Two ensemble models were created by combining the predictions from
BERT, RoBERTa, and BETO. A majority voting scheme was applied, where the final prediction
corresponds to the label chosen by at least two of the three models.</p>
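        <p>The majority voting rule in step 5 can be sketched as follows. This is a minimal illustration, assuming each model outputs one label per example in the same order; the function name and the toy predictions are hypothetical.</p>
        <preformat>
```python
from collections import Counter
from typing import List, Sequence


def majority_vote(*model_predictions: Sequence[str]) -> List[str]:
    """Combine per-example labels from several models by majority vote."""
    if len({len(p) for p in model_predictions}) != 1:
        raise ValueError("all models must predict the same number of examples")
    combined = []
    for votes in zip(*model_predictions):
        # With three models and two labels, at least two votes always agree.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Toy predictions standing in for BERT, RoBERTa, and BETO outputs.
bert = ["M", "NM", "M", "NM"]
roberta = ["M", "M", "NM", "NM"]
beto = ["NM", "M", "M", "NM"]
print(majority_vote(bert, roberta, beto))
```
        </preformat>
        <p>With three voters and a binary label set no tie can occur, so the rule reduces to the label chosen by at least two of the three models.</p>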
        <p>This multi-step pipeline allowed us to integrate data augmentation, robust modeling with pre-trained
transformers, and ensemble techniques to address the challenges of Task 1.</p>
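        <p>The balancing and splitting steps can be approximated with the sketch below. The text states only that a sampling strategy was used, so random undersampling of the NM pool is an assumption here, as are the function name, the seed, and the placeholder rows; the class counts are those reported above.</p>
        <preformat>
```python
import random
from typing import List, Tuple

Example = Tuple[str, str]  # (lyric text, label)


def balance_and_split(nm: List[Example], m: List[Example], nm_target: int,
                      seed: int = 13):
    """Undersample NM to nm_target, keep every M, then split 70/20/10."""
    rng = random.Random(seed)
    data = rng.sample(nm, nm_target) + list(m)  # assumed: random undersampling
    rng.shuffle(data)
    n_train = len(data) * 7 // 10
    n_val = len(data) * 2 // 10
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    return train, val, test


# Placeholder rows standing in for the pooled corpora (18,076 NM and 5,622 M).
nm_pool = [(f"nm lyric {i}", "NM") for i in range(1462 + 15836 + 778)]
m_pool = [(f"m lyric {i}", "M") for i in range(642 + 4758 + 222)]
train, val, test = balance_and_split(nm_pool, m_pool, nm_target=6000)
print(len(train), len(val), len(test))
```
        </preformat>
        <p>The sketch does not stratify the split by label; whether the original split was stratified is not stated, so that choice is left open here.</p>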
        <sec id="sec-4-1-1">
          <title>From Dataset Analysis to Model Evaluation</title>
          <p>Figure 1: Task 1 pipeline, from dataset analysis to model evaluation. Stages shown: data augmentation and
balancing (external datasets used to balance classes); model training (BERT, RoBERTa, BETO,
LLaMA, VSM); evaluation (Macro-F1, Macro-Precision, Macro-Recall).</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Task 2 Overview</title>
        <p>The process for Task 2 (fine-grained misogyny classification) consisted of the following main steps.
Figure 2 shows the complete pipeline:
1. Initial Dataset Analysis: The original dataset contained the following class distribution:
• Sexualization (S): 435 instances
• Violence (V): 129 instances
• Hate (H): 78 instances
• Not Related (NR): 526 instances
Excluding Not Related (NR), the most common class (S) had 435 examples, while the rarest (H) had 78. The resulting imbalance
ratio (IR) was:</p>
        <p>$$\mathrm{IR} = \frac{435}{78} \approx 5.58$$
2. Data Augmentation with Generative AI: To mitigate the imbalance, we used the GROK
generative AI tool to synthesize additional Spanish song lyrics for the minority classes (Violence
and Hate). This resulted in a more balanced dataset with the following class distribution:
• NR: 526 instances
• S: 435 instances
• V: 324 instances
• H: 273 instances
3. Preprocessing and Tokenization: As in Task 1, tokenization was performed using the respective
tokenizers of the transformer models. No additional text cleaning or normalization was applied.
4. Model Training: The following models were trained on the augmented dataset:
• BERT
• RoBERTa
• BETO
• LLaMA</p>
        <p>Figure 2: Task 2 pipeline. Stages shown: data augmentation with generative AI (balancing underrepresented
classes); model training (training diverse models); evaluation (Macro-F1, Macro-Precision, Macro-Recall).</p>
        <sec id="sec-4-2-1">
          <title>Enhancing Text Classification with AI</title>
          <p>We used a 70% training, 20% validation, and 10% testing split. Each model was trained for 3 to 5
epochs depending on convergence.
5. Ensemble Construction: Two ensembles were created using the predictions from BERT,
RoBERTa, and BETO. As in Task 1, majority voting was applied for the final prediction.
6. Evaluation: Performance was assessed using macro-averaged F1-score, precision, and recall,
following the evaluation guidelines of the MiSonGyny 2025 shared task.</p>
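          <p>With four labels, three voters can each choose a different class, a case that cannot arise in the binary task. The text does not say how such ties were resolved, so the Python sketch below assumes one simple option: fall back to the prediction of a designated model. The function name, the fallback rule, and the toy predictions are all hypothetical.</p>
          <preformat>
```python
from collections import Counter
from typing import List, Sequence


def vote_with_fallback(predictions: Sequence[Sequence[str]],
                       fallback_index: int = 0) -> List[str]:
    """Majority vote per example; on a tie, use the model at fallback_index."""
    combined = []
    for votes in zip(*predictions):
        ranked = Counter(votes).most_common()
        top_label, top_count = ranked[0]
        # A tie exists when the runner-up label has as many votes as the leader.
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        combined.append(votes[fallback_index] if tied else top_label)
    return combined


# Toy predictions standing in for BERT, RoBERTa, and BETO outputs.
bert = ["S", "V", "NR"]
roberta = ["S", "H", "H"]
beto = ["NR", "NR", "H"]
print(vote_with_fallback([bert, roberta, beto]))
```
          </preformat>
          <p>In the second toy example all three models disagree, so the output takes the label of the first model; other tie-breaking rules, such as preferring the model with the best validation score, would fit the same interface.</p>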
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments and Results</title>
      <p>This section presents the evaluation results obtained for both tasks using the macro-averaged metrics
required by the MiSonGyny 2025 shared task: F1-score, Precision, and Recall. We report test set
performance for each model, highlighting the benefits of transformer-based ensembles compared to
individual models and traditional baselines.</p>
      <sec id="sec-5-1">
        <title>5.1. Task 1: Binary Classification of Misogynistic Speech</title>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Task 2: Fine-Grained Classification of Misogynistic Content</title>
        <p>For Task 2, the best result was achieved by the LLaMA model (F1 = 0.488), followed by the ensemble
Transformer model (F1 = 0.434) and BERT (F1 = 0.415). Again, ensemble methods improved over most
base models. However, the overall F1 scores are lower than in Task 1, which reflects the increased
difficulty of the fine-grained classification problem. Random Forest and traditional models performed
poorly on this task, with an F1 score of only 0.261.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Results on the Official Competition Test Sets</title>
        <p>In addition to local test evaluation, the MiSonGyny 2025 organizers provided two official blind test sets
for external evaluation: 527 unlabeled instances for Task 1 and 393 for Task 2. Our team submitted a
total of seven prediction files for Task 1 and eight for Task 2, each corresponding to a different model or
ensemble configuration. The organizers computed macro-averaged F1-scores for each submission.</p>
        <sec id="sec-5-3-1">
          <title>5.3.1. Task 1 – Official Test Set Results</title>
          <p>The best result was obtained in Submission 6 (ID: 289232), with an F1-score of 0.8280. This submission
secured 3rd place overall in Task 1 of the competition.</p>
        </sec>
        <sec id="sec-5-3-2">
          <title>5.3.2. Task 2 – Official Test Set Results</title>
          <p>The best result was achieved in Submission 2 (ID: 281225), with an F1-score of 0.4883, which placed
our team in 6th position overall for Task 2.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>The detection of misogynistic speech in Spanish-language song lyrics is both a socially urgent and
technically challenging task. Misogynistic and violent lyrics contribute to the normalization of harmful
gender-based stereotypes and symbolic violence, particularly in musical genres with massive reach and
influence among young audiences.</p>
      <p>This study highlights the difficulty of developing robust classification systems in domains with limited
annotated data. One of the main challenges faced in both tasks was the scarcity of publicly available
labeled datasets, especially for fine-grained categories such as hate or violence. Moreover, the original
datasets provided for the MiSonGyny 2025 shared task were significantly imbalanced. To address this
issue, we adopted a dual strategy: leveraging external Spanish corpora with related annotations for
Task 1, and generating new examples using generative AI (GROK) for Task 2.</p>
      <p>Our experiments confirm the effectiveness of transformer-based models, especially when used in
ensemble configurations. In Task 1, a transformer ensemble reached an F1-score of 0.828 and secured
third place in the competition. In Task 2, which required multi-class classification, the best-performing
system achieved an F1-score of 0.488 and ranked sixth. These results validate the value of ensemble
learning in low-resource and high-variance classification problems such as misogyny detection in lyrics.</p>
      <p>Future work may focus on improving class-specific performance, leveraging multilingual models,
or incorporating contextual metadata such as genre or artist information to enhance classification
robustness.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>Generative AI tools were used in this work in two specific contexts:
• Visualization: Figures 1 and 2 were created using Napkin AI (https://app.napkin.ai/), a document
editing platform that transforms structured text into visual diagrams.
• Data Augmentation: For Task 2, we employed the generative platform GROK (https://grok.com)
to synthesize additional text samples for the underrepresented classes (Hate and Violence). This
was necessary to mitigate class imbalance and support model training.</p>
      <p>The use of generative AI was limited to these tasks and did not involve the generation of article
content, code, or evaluation results.</p>
      <p>[15] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers
for language understanding, in: Proceedings of the 2019 Conference of the North American
Chapter of the Association for Computational Linguistics (NAACL), 2019.</p>
      <p>[16] S. Chopra, P. Agarwal, J. Ahmed, S. S. Biswas, Ahmed, J. Obaid, RoBERTa and BERT:
Revolutionizing mental healthcare through natural language, SN Computer Science 5
(2024) 1–12. URL: https://link.springer.com/article/10.1007/s42979-024-03202-8. doi:10.1007/S42979-024-03202-8.</p>
      <p>[17] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov,
RoBERTa: A robustly optimized BERT pretraining approach, arXiv preprint arXiv:1907.11692 (2019).</p>
      <p>[18] J. Cañete, G. Chaperon, R. Fuentes, J. Pérez, D. Parra, Spanish pre-trained BERT model and evaluation
data, arXiv preprint arXiv:2002.02340 (2020).</p>
      <p>[19] I. Almubark, Exploring the impact of large language models on disease diagnosis, IEEE Access
(2025). doi:10.1109/ACCESS.2025.3527025.</p>
      <p>[20] B. E. Boser, I. M. Guyon, V. N. Vapnik, A training algorithm for optimal margin classifiers, in:
Proceedings of the Fifth Annual Workshop on Computational Learning Theory, ACM, 1992, pp.
144–152.</p>
      <p>[21] M. E. Hassan, M. Hussain, I. Maab, U. Habib, M. A. Khan, A. Masood, Detection of sarcasm in
Urdu tweets using deep learning and transformer based hybrid approaches, IEEE Access 12 (2024)
61542–61555. doi:10.1109/ACCESS.2024.3393856.</p>
      <p>[22] L. Breiman, Random forests, Machine Learning 45 (2001) 5–32.</p>
      <p>[23] W. Li, M. Zhang, A comparative study of classical and transformer-based models for social bias
detection, Journal of Computational Linguistics and Applications (2024).</p>
      <p>[24] M. H. Shohan, K. R. Ahmed, N. F. Kahar, N. Jahan, M. M. Hassan, R. B. Ahmad, N. Yaakob, O. B.
Lynn, N. Islam, Use of natural language processing for the detection of hate speech on social
media, Journal of Advanced Research in Applied Sciences and Engineering Technology 51 (2025)
86–96. doi:10.37934/araset.51.2.8696.</p>
      <p>[25] L. Casanovas-Buliart, C. Castillo, P. Alvarez-Cueva, Sexism in the lyrics of the most listened to songs
in Spain, 2023. URL: https://doi.org/10.5281/zenodo.8134122. doi:10.5281/zenodo.8134122.</p>
      <p>[26] R. Calbullanca Viluñir, A. Segura Navarrete, C. Vidal-Castro, C. Martínez-Araneda, Corpus of song
lyrics in Spanish labeled for gender-based violence against women, 2024. URL: https://doi.org/10.5281/zenodo.13370289. doi:10.5281/zenodo.13370289.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Aldana-Bobadilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Molina-Villegas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Montelongo-Padilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lopez-Arevalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. S.</given-names>
            <surname>Sordia</surname>
          </string-name>
          ,
          <article-title>A language model for misogyny detection in latin american spanish driven by multisource feature extraction and transformers</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>10467</fpage>
          . doi:10.3390/APP112110467.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Viluñir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Navarrete</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vidal-Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Martínez-Araneda</surname>
          </string-name>
          ,
          <article-title>Improving automatic detection of gender-based violence in spanish song lyrics using deep learning, data augmentation and undersampling</article-title>
          ,
          <source>Lecture Notes in Networks and Systems</source>
          <volume>1284</volume>
          (
          <year>2025</year>
          )
          <fpage>189</fpage>
          -
          <lpage>209</lpage>
          . URL: https://link.springer.com/chapter/10.1007/978-3-031-85363-0_12. doi:10.1007/978-3-031-85363-0_12.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Alcántara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Garcia-Vazquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Espinosa-Juarez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Calvo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Valdez-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Felipe-Riveron</surname>
          </string-name>
          ,
          <article-title>Overview of MiSonGyny at IberLEF 2025: Misogyny Speech Detection in Spanish Language Song Lyrics</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>75</volume>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. Á.</given-names>
            <surname>González-Barba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <article-title>Overview of IberLEF 2025: Natural Language Processing Challenges for Spanish and other Iberian Languages</article-title>
          , in:
          <source>Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2025), co-located with the 41st Conference of the Spanish Society for Natural Language Processing (SEPLN 2025)</source>
          , CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Casanovas-Buliart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Alvarez-Cueva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Castillo</surname>
          </string-name>
          ,
          <article-title>Evolution over 62 years: an analysis of sexism in the lyrics of the most-listened-to songs in spain</article-title>
          ,
          <source>Cogent Arts and Humanities</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>2436723</fpage>
          . URL: https://www.tandfonline.com/doi/pdf/10.1080/23311983.2024.2436723. doi:10.1080/23311983.2024.2436723.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Calderon-Suarez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Ortega-Mendoza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montes-Y-Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Toxqui-Quitl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Marquez-Vera</surname>
          </string-name>
          ,
          <article-title>Enhancing the detection of misogynistic content in social media by transferring knowledge from song phrases</article-title>
          ,
          <source>IEEE Access</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>13179</fpage>
          -
          <lpage>13190</lpage>
          . doi:10.1109/ACCESS.2023.3242965.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hashmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Yamin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Imran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Y.</given-names>
            <surname>Yayilgan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ullah</surname>
          </string-name>
          ,
          <article-title>Enhancing misogyny detection in bilingual texts using fastText and explainable AI</article-title>
          ,
          <source>Proceedings - 2024 International Conference on Engineering and Computing, ICECT 2024</source>
          (
          <year>2024</year>
          ). doi:10.1109/ICECT61618.2024.10581058.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hashmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Y.</given-names>
            <surname>Yayilgan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Yamin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ullah</surname>
          </string-name>
          ,
          <article-title>Enhancing misogyny detection in bilingual texts using explainable AI and multilingual fine-tuned transformers</article-title>
          ,
          <source>Complex and Intelligent Systems</source>
          <volume>11</volume>
          (
          <year>2025</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          . URL: https://link.springer.com/article/10.1007/s40747-024-01655-1. doi:10.1007/s40747-024-01655-1.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>Tackling Sexist Hate Speech: Cross-Lingual Detection and Classification on Social Media</article-title>
          ,
          <source>Ph.D. thesis</source>
          , Queen Mary University of London,
          <year>2025</year>
          . URL: https://qmro.qmul.ac.uk/xmlui/handle/123456789/98199.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Jindal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thavareesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajiakodi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Mistra: Misogyny detection through text-image fusion and representation analysis</article-title>
          ,
          <source>Natural Language Processing Journal</source>
          <volume>7</volume>
          (
          <year>2024</year>
          )
          100073. URL: https://www.sciencedirect.com/science/article/pii/S2949719124000219. doi:10.1016/j.nlp.2024.100073
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>García-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>García-Cumbreras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Valencia-García</surname>
          </string-name>
          ,
          <article-title>Evaluating feature combination strategies for hate-speech detection in Spanish using linguistic features and transformers</article-title>
          ,
          <source>Complex and Intelligent Systems</source>
          <volume>9</volume>
          (
          <year>2023</year>
          )
          <fpage>2893</fpage>
          -
          <lpage>2914</lpage>
          . URL: https://link.springer.com/article/10.1007/s40747-022-00693-x. doi:10.1007/s40747-022-00693-x.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F. B. M. P.</given-names>
            <surname>del Arco</surname>
          </string-name>
          ,
          <article-title>Detecting offensive language by integrating multiple linguistic phenomena</article-title>
          , supervised by M. L. T. M.-V. A. U.-L., volume
          <volume>52</volume>
          , Jaén: Universidad de Jaén,
          <year>2023</year>
          . URL: https://hdl.handle.net/10953/2400.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Codabench</surname>
          </string-name>
          ,
          <article-title>MiSonGyny 2025: Misogyny in Song Lyrics</article-title>
          , https://www.codabench.org/competitions/5914/,
          <year>2025</year>
          . Accessed: 2025-05-22.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Naidu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zuva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Sibanda</surname>
          </string-name>
          ,
          <article-title>A review of evaluation metrics in machine learning algorithms</article-title>
          ,
          <source>Lecture Notes in Networks and Systems</source>
          <volume>724</volume>
          (
          <year>2023</year>
          )
          <fpage>15</fpage>
          -
          <lpage>25</lpage>
          . URL: https://link.springer.com/chapter/10.1007/978-3-031-35314-7_2. doi:10.1007/978-3-031-35314-7_2.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>