<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge Expansion Guided by Justification for Improved Sexism Categorization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kapioma Villarreal-Haro</string-name>
          <email>kapioma.villarreal@cimat.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernando Sánchez-Vega</string-name>
          <email>fernando.sanchez@cimat.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrián Pastor López-Monroy</string-name>
          <email>pastor.lopez@cimat.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, Mathematics Research Center (CIMAT)</institution>
          ,
          <addr-line>Jalisco S/N Valenciana, 36023, Guanajuato, Guanajuato</addr-line>
          ,
          <country country="MX">México</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI)</institution>
          ,
          <addr-line>Av. Insurgentes Sur 1582, Col. Crédito Constructor, 03940, CDMX</addr-line>
          ,
          <country country="MX">México</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>We describe in this paper the participation of team CIMAT-GTO in Task 1 (hard label setting) of EXIST 2025, which focuses on identifying sexism in tweets, determining the source's intention, and categorizing the types of sexism expressed. We propose a hybrid methodology that combines generative large language models with fine-tuned transformer-based classifiers through knowledge expansion. Our approach utilizes a generative model to highlight contextually relevant elements for the task and to provide classification answers, and subsequently extracts justification texts that support the given predictions. We then conduct justification-guided knowledge expansion when fine-tuning a smaller transformer-based model for classification, aiming for the model to learn from the reasoning encoded in the generated texts. We evaluate both monotask and multitask fine-tuning strategies and implement ensemble methods to improve robustness. Our results demonstrate that knowledge expansion using justifications obtained from generative models enhances performance over baseline few-shot classification and fine-tuned models. The proposed systems prove to be competitive and achieve second place in all three textual tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Generative Large Language Models</kwd>
        <kwd>LLM Reasoning</kwd>
        <kwd>Knowledge Expansion</kwd>
        <kwd>Sexism Detection</kwd>
        <kwd>Sexism Categorization</kwd>
        <kwd>Social Media</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social media platforms have become a widely used medium for communication and information
consumption in recent years. These platforms, despite allowing for quick and easy information exchange
among users, lead to problems such as exposure to misinformation and biased content [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Among the
different social health problems that emerge on these platforms as a reflection of real-life problems,
sexism is one of the most concerning, as it is deeply embedded in societal norms and cultural
attitudes [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Both overtly hostile and subtle forms of sexism have been shown to be recurrent and to
negatively affect the psychological well-being of people in everyday interactions [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This phenomenon
carries over from the physical world to digital spaces, where it follows its own dynamics that frequently
amplify extreme viewpoints due to anonymity and online disinhibition effects [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To better shape
and understand the problem, we need to study how technology transforms social interactions and the
impact it has from different perspectives: technology can be viewed as a facilitator of gender-based
violence [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], but also as a tool to challenge it and raise awareness [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>However, current detection systems struggle with contextual understanding and nuanced
categorization. Understanding the presence and prevalence of this phenomenon, and its multifaceted
manifestations, underlines the necessity of developing systems capable of identifying and characterizing
this type of content while also managing large volumes of information.</p>
      <p>
        This paper describes CIMAT-GTO’s participation in Task 1 of EXIST (sEXism Identification in Social
neTworks) 2025 [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]: identifying sexism in tweets, determining the source’s intention, and categorizing
the types of sexism expressed. We present a hybrid approach that combines fine-tuning of transformer-based
models with knowledge expansion using contextualized reasoning produced by generative LLMs.
We use Gemini-1.5-Flash due to its general-purpose capabilities and relatively moderate parameter size.
This choice leaves room for further experimentation with larger or specialized reasoner models that
may provide higher-quality responses and justifications, potentially enhancing performance following
our proposed framework.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        In previous editions of EXIST, transformer-based architectures, such as BERT or RoBERTa, were the most
widely used and effective models in textual tasks. These models were typically fine-tuned with or without
prior pre-training, combined with techniques such as data augmentation, hierarchical classification,
annotator information injection, model cascades, and ensembles [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref9">9, 10, 11, 12</xref>
        ]. Meanwhile, during the
2024 edition of EXIST, generative models were not only being explored as zero or few-shot classifiers,
but also in hybrid approaches where their answers or knowledge were combined or leveraged by other
LLMs. Still, they were unable to surpass other types of models proposed, despite their impressive
performance [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. While fine-tuning is a classical framework for classification, where the input
relies only on the text itself, there is work that improves performance by injecting external knowledge
such as handcrafted textual features, knowledge graphs, or retrieval from outside sources that provide
an expanded range of information [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ]. Among several applications, extracting rationales to be processed by a transformer
alongside the tweet, using the expanded input for classification, has been shown to be helpful
for hate speech detection [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and sexism detection [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
demonstrating how outputs from generative models can be leveraged by smaller classifiers in schemes
of this type.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The data and the tasks addressed</title>
      <p>
        sEXism Identification in Social neTworks (EXIST) is a shared task that aims to foster sexism identification
and characterization in social media [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. The most recent edition includes textual and multimodal
challenges. In this work, we focus only on text data from tweets.
      </p>
      <sec id="sec-3-0">
        <title>3.1. Data</title>
        <p>The tweet dataset comprises 10,034 tweets in Spanish and English. The dataset is mostly balanced
across languages and is roughly split into a 70:10:20 proportion for training, development, and testing,
maintaining the proportions of each class.</p>
        <p>Six annotators labeled each tweet, and the organizers computed hard labels following probabilistic
thresholds. As we focus on predicting hard labels, we only considered tweets where such a categorization
was retained. In the case of Task 1.1, binary identification of sexism-related content, this reduced the set
to approximately 87% of the original dataset.</p>
      </sec>
      <sec id="sec-3-1">
        <title>3.2. The tasks</title>
        <p>The three textual tasks addressed in this work, in the hard label setting, are:</p>
        <p>• Task 1.1: Sexism identification. Binary classification to detect sexism-related content.</p>
        <p>• Task 1.2: Source intention. If sexism-related content is identified, classification of the source's
intention into three possible categories: direct, reported, and judgemental.</p>
        <p>• Task 1.3: Sexism categorization. If sexism-related content is identified, multi-label classification
into the categories ideological and inequality, stereotyping and dominance, objectification,
sexual violence, and misogyny and non-sexual violence.</p>
        <p>Tasks 1.2 and 1.3 derive from the classification of Task 1.1 and provide deeper insight into the
characterization of sexism-related content.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>We propose hybrid systems that utilize generative LLMs to obtain answers to the classification tasks (FS-Ctx),
and then extract justification texts that explain the given answers. We then use these texts as knowledge
expansion to fine-tune transformer-based classifiers (FT-KE-Mono and FT-KE-Multi). For each of these
fine-tuned variants, we train three models as an ensemble to improve robustness and performance
(Ens-FT-KE-Mono and Ens-FT-KE-Multi). This two-stage process is detailed in the following subsections.
An overall description of the individual systems is provided in Figure 1, while Table 1 summarizes the
models studied and their corresponding run numbers.</p>
      <sec id="sec-4-1">
        <title>4.1. Generation Stage</title>
        <p>
          The first step in our methodology involves prompting an LLM to provide contextual cues about the
tweet and to answer the three tasks within a single query. The set of contextual elements is obtained
by identifying recurrent patterns in the task-relevant information automatically discovered during
the generation stage of related work [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. This contextualization
allows the model to retrieve a structured list of relevant elements aligned with the task and improves
the accuracy of classification via generation. The prompt is also enriched with few-shot examples
and definitions for the possible answer categories, as these are popular strategies to improve performance
[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. At this stage, the model chosen is Gemini-1.5-Flash, as it is a general-purpose model and offers a
good trade-off between performance and efficiency.
        </p>
        <p>FS-Ctx (Few-Shot-Context) This is the generation-based method following the previously described
setting. Although the generation following this methodology outperforms other prompt-based methods
and some simple classifiers, it falls short compared to the best-performing classifier models.</p>
        <p>In the next stage, we explore how to obtain stronger models by leveraging knowledge encoded by the
generative model that is only indirectly present in its answers. In an intermediate step, we further prompt
our generative LLM to justify the answers produced in the previous step. We
expect these texts to be informative and to enhance model performance, as they are used to enrich the
input for the classifier models.</p>
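        <p>As an illustration, the two prompting steps described above can be sketched as plain prompt construction. The contextual elements, wording, and few-shot format below are hypothetical and do not reproduce the exact prompts used.</p>
        <preformat>
```python
# Hypothetical sketch of the two prompting steps; the contextual elements,
# wording, and few-shot format are illustrative assumptions.
CONTEXT_ELEMENTS = ["tone", "target of the message", "stereotypes invoked"]

def build_classification_prompt(tweet, few_shot_examples, definitions):
    """Single query: list contextual cues, then answer the three tasks."""
    shots = "\n".join(f"Tweet: {t}\nAnswer: {a}" for t, a in few_shot_examples)
    cues = ", ".join(CONTEXT_ELEMENTS)
    return (
        f"Category definitions:\n{definitions}\n\n"
        f"Examples:\n{shots}\n\n"
        f"Identify the following contextual elements ({cues}) in the tweet, "
        f"then answer Tasks 1.1, 1.2, and 1.3.\nTweet: {tweet}"
    )

def build_justification_prompt(tweet, answers):
    """Intermediate step: ask the model to justify its earlier answers."""
    return (
        f"Tweet: {tweet}\nYour answers: {answers}\n"
        "Briefly justify each answer, citing elements of the tweet."
    )
```
        </preformat>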
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Fine-Tuning Stage with Knowledge Expansion</title>
      <p>We fine-tune two different types of models using our proposed knowledge expansion strategy, which
concatenates the justifications with the original tweet and feeds the result as input to the model, leveraging not
only the classification preferences of the particular generative LLM from the previous step, but also
the reasoning in its justifications. We evaluate two different fine-tuning techniques. We choose an
XLM-RoBERTa trained on multilingual tweets [19] as the base model, to leverage information from both
languages in the corresponding domain.</p>
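        <p>A minimal sketch of the knowledge-expanded input is given below, assuming the justification is appended to the tweet with the model's sentence separator; the exact joining convention and ordering are assumptions.</p>
        <preformat>
```python
# Minimal sketch of the knowledge expansion input; using the XLM-RoBERTa
# sentence separator and tweet-first ordering is an assumption.
SEP = "</s>"

def expand_input(tweet: str, justification: str) -> str:
    """Concatenate the original tweet with its generated justification."""
    return f"{tweet} {SEP} {justification}"
```
        </preformat>
        <p>In practice, the two texts could equivalently be passed to the tokenizer as a sentence pair.</p>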
      </sec>
      <sec id="sec-4-3">
        <title>FT-KE-Mono (Fine-Tuned with Knowledge Expansion, Monotask)</title>
        <p>Individual fine-tuning of the RoBERTa-based model for each subtask of Task 1. Hence, we rely on three different models to provide a
complete response to all text tasks. A post-processing correction is applied to the labels predicted
for Tasks 1.2 and 1.3 to align them with a negative output of Task 1.1 and avoid contradictions.</p>
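        <p>The correction step can be sketched as follows, assuming "NO" marks a negative Task 1.1 prediction and "-" the empty label for Tasks 1.2 and 1.3; the concrete label strings are assumptions.</p>
        <preformat>
```python
def align_predictions(task11, task12, task13):
    """Overwrite Tasks 1.2/1.3 with the empty label when Task 1.1 is
    negative, so the three outputs never contradict each other.
    The "NO" / "-" label strings are assumptions."""
    if task11 == "NO":
        return task11, "-", ["-"]
    return task11, task12, task13
```
        </preformat>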
      </sec>
      <sec id="sec-4-4">
        <title>FT-KE-Multi (Fine-Tuned with Knowledge Expansion, Multitask)</title>
        <p>Multitask learning to predict labels for the three tasks at the same time, using a single multilingual RoBERTa-based model. We train
by optimizing a combined loss function obtained from the losses for each task.</p>
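        <p>The combined objective can be sketched as below, assuming cross-entropy for the single-label Tasks 1.1 and 1.2, binary cross-entropy for the multi-label Task 1.3, and equal weights; the authors' exact loss combination is not specified here.</p>
        <preformat>
```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy from raw logits for a single-label task."""
    z = max(logits)  # stabilized log-sum-exp
    log_sum = z + math.log(sum(math.exp(l - z) for l in logits))
    return log_sum - logits[target]

def binary_cross_entropy(logits, targets):
    """Mean sigmoid BCE from raw logits for a multi-label task."""
    total = 0.0
    for l, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-l))
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(logits)

def multitask_loss(out11, y11, out12, y12, out13, y13, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three per-task losses (equal weights assumed)."""
    w1, w2, w3 = weights
    return (w1 * cross_entropy(out11, y11)
            + w2 * cross_entropy(out12, y12)
            + w3 * binary_cross_entropy(out13, y13))
```
        </preformat>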
        <p>To further enhance the performance of our models, we generate three distinct sets of justifications
and train models for each set. Ensembles are produced using the output scores for each task to develop
more robust labels in the final submission of the run, yielding the other systems submitted for evaluation
(Ens-FT-KE-Mono and Ens-FT-KE-Multi).</p>
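        <p>The ensembling step can be sketched as score averaging over the three trained models, taking the top class for the single-label tasks and thresholded classes for the multi-label task; the exact combination rule is an assumption.</p>
        <preformat>
```python
def ensemble_scores(score_lists):
    """Average per-class scores from several models."""
    n = len(score_lists)
    return [sum(s) / n for s in zip(*score_lists)]

def predict_single_label(score_lists, classes):
    """Argmax over averaged scores (Tasks 1.1 and 1.2)."""
    avg = ensemble_scores(score_lists)
    return classes[max(range(len(avg)), key=avg.__getitem__)]

def predict_multi_label(score_lists, classes, threshold=0.5):
    """Thresholded averaged scores (Task 1.3); 0.5 cutoff is an assumption."""
    avg = ensemble_scores(score_lists)
    return [c for c, s in zip(classes, avg) if s >= threshold]
```
        </preformat>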
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>The following section presents the results obtained during the development phase and the final results
achieved on the official leaderboard.</p>
      <sec id="sec-5-1">
        <title>5.1. Results on dev</title>
        <p>We conduct a preliminary evaluation on the development set to validate our proposal and estimate
the impact of each component within the proposed methods. Table 2 shows the performance of the
generation-based classification and individual fine-tuned models. We observe that the purely
generative approach does not match the overall performance of the fine-tuned baseline, but still reaches a
similar performance in task 1.2. The baseline is outperformed by systems with knowledge expansion,
indicating that these systems are indeed learning from the outputs of the generative models. Multitask
learning yields similar results to training individual systems, but has the advantage of requiring fewer
computational resources, and shared representation benefits task 1.3 slightly.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Official leaderboard</title>
        <p>Results of the selected systems are presented in Tables 3 and 4. As expected, the fine-tuned submitted
models outperform their generative counterparts, with the most significant improvement observed in
task 1.3. Ens-FT-KE-Multi outperforms Ens-FT-KE-Mono in tasks 1.1 and 1.3, suggesting that training all
tasks might provide insights across tasks that benefit the individual scores, and has the advantage of
requiring fewer computational resources. Task 1.2 achieves its best performance with Ens-FT-KE-Mono,
which may indicate that shared representation is not adding new information to the model.</p>
        <p>Results segmented by language are presented in Table 4. The ranking of our systems is
consistent across languages, but the metrics differ, with stronger results in Spanish. This
suggests that further experimentation should be conducted in monolingual settings, possibly using
techniques such as translation to leverage more data and information.</p>
        <p>Our best-performing method achieves second place in all three text-based tasks, showing that
combining generative reasoning with fine-tuned classifiers is a promising direction. While the metric
performance is encouraging, the results indicate room for improvement and further experimentation.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This work presents a hybrid approach for sexism identification and characterization in social media,
which combines the reasoning capabilities of generative LLMs with fine-tuned transformer-based
classification models in a knowledge expansion process.</p>
      <p>Key findings include: (1) generative models alone, while competitive, do not surpass fine-tuned
classifiers; (2) fine-tuning transformer-based models with knowledge expansion using justifications
from generative models improves performance across all tasks; (3) multitask learning offers
computational efficiency while maintaining competitive results and benefiting task 1.3. Our systems
achieve competitive results and attain high rankings on the EXIST 2025 leaderboard. The systems'
performance and internal biases remain to be explored, and future research includes examining other
LLMs as components of the method.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Ethical Concerns</title>
      <p>It is essential to acknowledge that the models developed focus on predicting hard labels, overlooking
the granularity of sociodemographic groups’ perspectives. In particular, when dealing with violent
expressions, prioritizing the voices of victims and vulnerable groups rather than giving equal weight
to the views of those who cause harm, helps to surface silenced experiences and attain responsible
representation [20]. We also note that LLMs can reproduce various internal biases that might be
misleading if not carefully monitored.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>Villarreal-Haro acknowledges the Secretaría de Ciencia, Humanidades, Tecnología e Innovación
(SECIHTI) for its support provided through the program Becas Nacionales Para Estudios de Posgrados (CVU
1309535). Sánchez-Vega acknowledges SECIHTI for its support through the program “Investigadoras
e Investigadores por México” (Project ID.11989, No.1311). The authors thank SECIHTI for the
computing resources provided through the CIMAT Bajío Supercomputing Laboratory (#300832), and Google
Cloud Platform for the free cloud credits under its trial program, used to access the Gemini API for
experimentation in this project.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Grammarly to perform a grammar and spelling
check. After using this tool, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.</p>
      <p>[18] The Prompt Report: A systematic survey of prompting techniques, arXiv preprint arXiv:2406.06608
(2024).</p>
      <p>[19] F. Barbieri, L. Espinosa Anke, J. Camacho-Collados, XLM-T: Multilingual language models in
Twitter for sentiment analysis and beyond, in: Proceedings of the 13th Language Resources and
Evaluation Conference (LREC 2022), European Language Resources Association, Marseille, France,
2022, pp. 258–266.</p>
      <p>[20] J. Butler, Precarious Life: The Powers of Mourning and Violence, Verso, 2004.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kitchens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Johnson</surname>
          </string-name>
          , P. Gray,
          <article-title>Understanding echo chambers and filter bubbles: The impact of social media on diversification and partisan shifts in news consumption</article-title>
          ,
          <source>MIS Quarterly 44</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Rhode</surname>
          </string-name>
          ,
          <article-title>The subtle side of sexism</article-title>
          ,
          <source>Colum. J. Gender &amp; L.</source>
          <volume>16</volume>
          (
          <year>2007</year>
          )
          <fpage>613</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Swim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. L.</given-names>
            <surname>Hyers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. L.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Ferguson</surname>
          </string-name>
          ,
          <article-title>Everyday sexism: Evidence for its incidence, nature, and psychological impact from three daily diary studies</article-title>
          ,
          <source>Journal of Social issues 57</source>
          (
          <year>2001</year>
          )
          <fpage>31</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Perpetuating online sexism offline: Anonymity, interactivity, and the effects of sexist hashtags on social media</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>52</volume>
          (
          <year>2015</year>
          )
          <fpage>436</fpage>
          -
          <lpage>442</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dunn</surname>
          </string-name>
          ,
          <article-title>Technology-facilitated gender-based violence: An overview</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L. I.</given-names>
            <surname>Molnar</surname>
          </string-name>
          , “
          <article-title>I didn’t have the language”: Young people learning to challenge gender-based violence through consumption of social media</article-title>
          ,
          <source>Youth</source>
          <volume>2</volume>
          (
          <year>2022</year>
          )
          <fpage>318</fpage>
          -
          <lpage>338</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. C. de Albornoz</surname>
            , I. Arcos,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Amigó</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Morante</surname>
          </string-name>
          , Overview of EXIST 2025:
          <article-title>Learning with disagreement for sexism identification and characterization in tweets, memes, and TikTok videos</article-title>
          , in: J.
          <string-name>
            <surname>C. de Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Piroi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality</source>
          , Multimodality, and Interaction.,
          <source>Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF</source>
          <year>2025</year>
          ),
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. C. de Albornoz</surname>
            , I. Arcos,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Amigó</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Morante</surname>
          </string-name>
          , Overview of EXIST 2025:
          <article-title>Learning with disagreement for sexism identification and characterization in tweets, memes, and TikTok videos (extended overview)</article-title>
          ,
          <source>in: CLEF 2025 Working Notes</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rodríguez-Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de Albornoz</surname>
          </string-name>
          , L. Plaza,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Comet</surname>
          </string-name>
          , T. Donoso, Overview of EXIST 2021:
          <article-title>sexism identification in social networks</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          (
          <year>2021</year>
          )
          <fpage>195</fpage>
          -
          <lpage>207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rodríguez-Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mendieta-Aragón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Marco-Remón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Makeienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , Overview of EXIST 2022:
          <article-title>sexism identification in social networks</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>69</volume>
          (
          <year>2022</year>
          )
          <fpage>229</fpage>
          -
          <lpage>240</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Overview of EXIST 2023 - Learning with disagreement for sexism identification and characterization (extended overview)</article-title>
          ,
          <source>CLEF (Working Notes)</source>
          (
          <year>2023</year>
          )
          <fpage>813</fpage>
          -
          <lpage>854</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de-Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maeso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <article-title>Overview of EXIST 2024 - Learning with Disagreement for Sexism Identification and Characterization in Social Networks and Memes (Extended Overview)</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          , et al.,
          <article-title>Plug-and-play knowledge injection for pre-trained language models</article-title>
          ,
          <source>arXiv preprint arXiv:2305.17691</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Injecting domain-specific knowledge into large language models: a comprehensive survey</article-title>
          ,
          <source>arXiv preprint arXiv:2502.10708</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nirmal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Towards interpretable hate speech detection using large language model-extracted rationales</article-title>
          , in:
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Talat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza-del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Röttger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mostafazadeh Davani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Calabrese</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024)</source>
          , Association for Computational Linguistics, Mexico City, Mexico,
          <year>2024</year>
          , pp.
          <fpage>223</fpage>
          -
          <lpage>233</lpage>
          . URL: https://aclanthology.org/2024.woah-1.17/. doi:10.18653/v1/2024.woah-1.17.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Villarreal-Haro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sánchez-Vega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rosales-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <article-title>Stacked reflective reasoning in large neural language models</article-title>
          ,
          <source>Working Notes of CLEF</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Villarreal-Haro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Segura-Gómez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tavarez-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sánchez-Vega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rosales-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <article-title>Leveraging reasoning of auto-revealed insights via knowledge injection and evolutionary prompting for sexism analysis</article-title>
          ,
          <source>Working Notes of CLEF</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulhof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ilie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Balepur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kahadze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Si</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulhof</surname>
          </string-name>
          , et al.,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>