<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Classification of Gender Stereotypes in Social Media Post</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gersome Shimi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jerin Mahibha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Durairaj Thenmozhi</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Madras Christian College</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Meenakshi Sundararajan Engineering College</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Sri Sivasubramaniya Nadar College of Engineering</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Every day, millions of pieces of information are shared on the internet through social media. The content of social media posts reflects a person's wishes, emotional expressions, ambitions, passions, and achievements. Hurtful messages, such as sexist content, may be embedded in these posts. Whether intentional or unintentional, such content may disturb the mental well-being of the recipient, so the automatic identification of sexist language and terms in social media posts requires immediate attention. EXIST (sEXism Identification in Social Media Network) 2024, a shared task, has addressed this issue. The shared task covers binary classification (Task 1), multiclass classification (Task 2), and multilabel classification (Task 3). We contributed a Language Agnostic BERT Sentence Embeddings (LaBSE) based MultiLayer Perceptron (MLP) classifier, an eXtreme Gradient Boosting (XGBoost) classifier, and an ensemble Convolutional Neural Network (CNN) model for Task 1, and LaBSE with an MLP classifier and an XGBoost classifier for Task 2.</p>
      </abstract>
      <kwd-group>
        <kwd>ensemble CNN</kwd>
        <kwd>LaBSE</kwd>
        <kwd>MLP</kwd>
        <kwd>XGBoost</kwd>
        <kwd>Classifier</kwd>
        <kwd>sexism</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social media platforms have become a basic means of communication in the modern world [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. They are
an effective tool for posting content from diverse fields such as sports, politics, religion, race, and culture.
According to DataReportal global media statistics, the world spends approximately 12 billion hours, and
a person actively spends an average of 2 hours and 20 minutes, daily on social media. Shared posts may
contain information that leaves emotional scars, misguides people, or disrupts harmony among social
media users [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The dissemination of offensive and discriminatory material targeting women through
social media platforms has increased rapidly and has emerged as a significant concern. This affects
the well-being of women and their freedom of expression [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. All around the world, many women have
suffered and reported abuse, discrimination, and other sexist experiences in real life. Social networks
contribute substantially to the transmission of sexism and other disrespectful and
hateful behaviours. Detecting sexist behaviours and discourses on social media platforms, generating alerts, and
computing their frequency is considered an important and challenging task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Discriminatory
information about women, which is unethical, is common in such posts. It is challenging to locate sexist
content such as dominance, misogyny, and inequality, which can appear in diverse forms [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Social networks
are considered the main platforms for social complaint, activism, etc., where movements
like #MeToo, #8M, or #Time’sUp have spread rapidly [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        EXIST 2024 aims to capture sexism in a broad sense, from explicit misogyny to subtler expressions
that involve implicit sexist behaviors. The shared task EXIST 2024 was part of CLEF 2024 and is based on
English and Spanish comments. The shared task was intended to identify different categories of sexist content
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The task comprised five subtasks, Task 1 to Task 5, of which our team participated
in two, namely Task 1 and Task 2.
      </p>
      <p>Task 1 - Sexism Identification</p>
      <p>
        The first task is a binary classification problem: the system has to decide whether or not a given tweet contains or describes sexist
expressions or behaviors.
      </p>
      <p>Task 2 - Source Intention</p>
      <p>This task aims to categorize sexist messages according to the intention of the author into one of
the following categories:
(i) Direct sexist message,
(ii) Reported sexist message, and
(iii) Judgemental sexist message.</p>
      <p>The second task is a multiclass classification problem in which the system needs to identify the intention
behind the tweet. The possible intentions are directly addressing sexism, reporting sexist situations
involving women, and judging or condemning sexism.</p>
      <p>Several models, including an MLP classifier with Language Agnostic Sentence Embeddings, XGBoost,
and an ensemble CNN, were used to implement Task 1 and Task 2. The results
of all of these were submitted for ranking. For both tasks, the training and evaluation of the
proposed models were carried out using the corresponding datasets provided by the EXIST 2024 task
organizers. The models were then tested on the test dataset provided for the shared task, on
which the task was evaluated.</p>
      <p>This paper is organized as follows: Section 2 explains the related work, Section 3 describes the dataset,
the methodology used is described in Section 4, the results and discussions are provided in Section 5
and Section 6 provides the Conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        A machine learning model based on a bidirectional LSTM architecture was used for the classification of
sexist and non-sexist tweets by [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The model effectively captured contextual information and
achieved an F1 score of 0.6355. As part of IberLEF 2022, a language-agnostic model and a multilingual
BERT classification model were used to identify sexist and non-sexist text in English and Spanish.
The language-agnostic model performed better, with an F1 score of 0.753 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] applied transfer learning from a pre-trained multilingual DeBERTa (mDeBERTa) model and
its zero-shot classification variant. Majority voting was used to combine the methods, with which
mDeBERTa achieved an accuracy of 76.09% and 66.26% for Task 1 and Task 2, respectively. Different
transformer models such as BERT, DistilBERT, and RoBERTa were used to implement the three
tasks of SemEval 2023. The BERT model achieved macro F1 scores of 0.8073, 0.5876, and
0.3729 for Task A, Task B, and Task C, respectively [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The Seoul metropolitan civil complaint dataset in the Korean language
was classified using random forest and XGBoost, and the results showed that the XGBoost classifier
outperformed the random forest classifier [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. For crime prediction, after applying TF-IDF (Term
Frequency-Inverse Document Frequency), the machine learning models XGBoost, KNN (K-Nearest Neighbor), Naïve
Bayes, SVM (Support Vector Machine), and GBDT (Gradient Boost Decision Tree) were implemented,
and XGBoost outperformed the other machine learning algorithms with 0.923, 0.916, and 0.919 for
precision, recall, and F1 score, respectively [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        BLSTM-C, a hybrid model of BLSTM and a Convolutional Neural Network, performed well on a
Chinese-language dataset for text classification. BLSTM-C was built with two layers of LSTM
and one layer of CNN and obtained an accuracy of 0.962 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>Few research works have been carried out on sexism identification, although related text classification tasks have
been explored. Continuous research is being carried out in related fields such as identifying
insulting comments, hate speech, toxic comments, and intent classification, which can be used as a base
for identifying comments representing sexism in social media text. It can also be observed that
tweets and their contents have an inconsistent structure, so data preprocessing helps to improve the
accuracy of the trained model.
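      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Sample instances from the dataset</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Tweet</th>
              <th>Language</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>@ultimonomada_ Si comicsgate se parece en algo a gamergate pues muy bien por el acoso. Y si se está haciendo un sabotaje porque hay personajes que no os gustan entonces gracias por darme la razón. Sois unos lloricas ofendidos.</td>
              <td>Spanish</td>
            </tr>
            <tr>
              <td>[...] work for women who get assaulted at home or work. Also would give the government the ability to track anyone for any reason.</td>
              <td>English</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>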
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>The datasets used to implement Task 1 and Task 2 of EXIST 2024 were the training, evaluation, and
test datasets provided by the organizers of the shared task. All the datasets for the shared
task EXIST 2024 were given in JSON format, from which the features required for
implementing Task 1 and Task 2 were selected. These include id_EXIST, tweet, annotators, and
labels_task1 for Task 1, and id_EXIST, tweet, annotators, and labels_task2 for Task 2. Other features such as
gender_annotators, age_annotators, ethnicities, study_levels_annotators, and countries were considered
unimportant and were eliminated. Table 1 shows sample instances from the dataset in
both languages, English and Spanish. Twitter is the source of all the instances in the dataset.</p>
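      <p>The field selection described above can be sketched as follows. This is only an illustration: the layout of the JSON file (an object keyed by instance id) and all sample values are assumptions, and select_fields and TASK_FIELDS are hypothetical names; only the field names are taken from the text.</p>
      <preformat>
```python
import json

# Hypothetical miniature of a task file: a JSON object keyed by instance id.
# Field names follow the text; the layout and all values are invented.
RAW = """
{
  "100001": {
    "id_EXIST": "100001",
    "tweet": "example tweet text",
    "annotators": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "gender_annotators": ["F", "F", "F", "M", "M", "M"],
    "age_annotators": ["18-22", "23-45", "46+", "18-22", "23-45", "46+"],
    "labels_task1": ["YES", "YES", "NO", "YES", "NO", "YES"],
    "labels_task2": ["DIRECT", "REPORTED", "-", "DIRECT", "-", "DIRECT"]
  }
}
"""

# Fields kept for each task, as listed in the text.
TASK_FIELDS = {
    "task1": ["id_EXIST", "tweet", "annotators", "labels_task1"],
    "task2": ["id_EXIST", "tweet", "annotators", "labels_task2"],
}


def select_fields(raw_json, task):
    """Keep only the fields used for the given task and drop the rest."""
    records = json.loads(raw_json)
    wanted = TASK_FIELDS[task]
    return [{name: rec[name] for name in wanted} for rec in records.values()]


task1_rows = select_fields(RAW, "task1")
print(sorted(task1_rows[0]))
```
      </preformat>
      <p>Keeping the per-task field lists in one dictionary means that the same function serves both Task 1 and Task 2.</p>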
      <p>The data distribution in the training, evaluation, and test datasets is shown in Tables 2 and 3.
The training dataset for Task 1 and Task 2 had 6920 instances, of which 3260 were in English and
3660 were in Spanish. For Task 1, there were 3553 instances in the Sexist category
and 3367 instances in the Non Sexist category. For Task 2, there were
3141, 1298, and 1035 instances in the categories Direct, Reported, and Judgemental, respectively. The test
dataset had 12456 instances, of which 5868 were in English and 6588 were in Spanish.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>The proposed system uses an XGBoost Classifier, an LSTM-CNN Classifier, and an MLP classifier with Language
Agnostic embeddings for Task 1, a binary classification problem that separates Sexist from Non Sexist
comments. Task 2, a multiclass classification problem with three class labels (Direct sexist
message, Reported sexist message, and Judgemental sexist message), was implemented using an XGBoost Classifier
and an MLP classifier with Language Agnostic embeddings. The proposed architecture of the
system is shown in Figure 1.</p>
      <sec id="sec-4-1">
        <title>4.1. Preprocessing</title>
        <p>The dataset instances, provided in JSON format, were read and cleaned using preprocessing techniques. Preprocessing
removes information from the texts that is not used during the
classification process. Stop words, symbols, and special characters are removed, and
root words are extracted using stemming and lemmatization algorithms before the
dataset is fed to the model.</p>
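        <p>A minimal sketch of the cleaning step is given below, assuming a tiny hand-written stop-word list and regular expressions for symbols. The list STOP_WORDS, the function clean_tweet, and the removal of links and user mentions are assumptions made for illustration; stemming and lemmatization are omitted here and would normally come from an NLP library.</p>
        <preformat>
```python
import re

# Tiny illustrative stop-word list (an assumption, not the list used in the
# paper); a real run would use full English and Spanish lists.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and",
              "el", "la", "de", "y", "que"}


def clean_tweet(text):
    """Lowercase the text and drop links, mentions, symbols and stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)           # remove user mentions
    text = re.sub(r"[^\w\s]", " ", text)        # remove symbols and punctuation
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_tweet("@user The report is HERE: https://example.org #news!!"))
```
        </preformat>
        <p>Because the character class \w is Unicode-aware in Python 3, accented Spanish letters survive the symbol removal.</p>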
        <p>The class label associated with each tweet was not provided directly. Instead, the labels are
provided by six different annotators, as Hard Labels and Soft Labels. We chose Hard Labels for our
implementation. As part of preprocessing, majority voting was applied to the provided
annotations to decide the class label associated with each tweet. This was done for both Task 1 and Task
2.</p>
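        <p>The majority-voting step can be sketched as below. How ties among the six annotators were resolved is not stated in the text, so returning None for a tie is an assumption, and majority_label is a hypothetical helper name.</p>
        <preformat>
```python
from collections import Counter


def majority_label(annotations):
    """Return the label chosen by most annotators, or None on a tie."""
    counts = Counter(annotations).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no single majority label (handling is an assumption)
    return counts[0][0]


print(majority_label(["YES", "YES", "NO", "YES", "NO", "YES"]))
print(majority_label(["YES", "YES", "YES", "NO", "NO", "NO"]))
```
        </preformat>
        <p>With six annotators a three-to-three split is possible, so a pipeline built on this sketch has to decide whether to drop or otherwise resolve such instances.</p>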
      </sec>
      <sec id="sec-4-2">
        <title>4.2. MLP classifier with Language Agnostic Embeddings</title>
        <p>The proposed system used an MLP classifier that takes custom-generated embeddings as
input. A language-agnostic sentence transformer was used to generate text embeddings. As the language-agnostic
sentence transformer is multilingual and supports both English and Spanish,
the same model was used to generate the embeddings for all the given tweets. Similarly, the LASER encoder
pipeline was used to generate LASER embeddings for all the tweets. The two sets of embeddings were
concatenated to form the final embeddings on which the MLP classifier was trained. The
hyperparameters of the MLP classifier were set as follows: random state 42, maximum
iterations 300, relu activation function, alpha 0.05, learning
rate adaptive, and solver adam. The working of this model is represented in Figure 2.</p>
        <p>When evaluated on the evaluation dataset, the proposed model achieved an accuracy and a Macro
F1 Score of 0.77.</p>
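        <p>The concatenation and training steps can be sketched with scikit-learn as follows. Random arrays stand in for the LaBSE and LASER embeddings, and their sizes are arbitrary placeholders, so this only illustrates the data flow and the listed hyperparameters; it is not the authors' code and does not reproduce the reported scores.</p>
        <preformat>
```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Random stand-ins for the two sentence embeddings of 60 tweets; the sizes
# 8 and 6 are arbitrary placeholders, not the real embedding dimensions.
labse_like = rng.normal(size=(60, 8))
laser_like = rng.normal(size=(60, 6))
labels = rng.integers(0, 2, size=60)  # toy binary labels

# Concatenate the two embeddings of each tweet into one feature vector.
features = np.hstack([labse_like, laser_like])

# Hyperparameter values as listed in the text; everything else is left at
# the scikit-learn default.
clf = MLPClassifier(
    random_state=42,
    max_iter=300,
    activation="relu",
    alpha=0.05,
    learning_rate="adaptive",
    solver="adam",
)
clf.fit(features, labels)
predictions = clf.predict(features)
print(features.shape, predictions.shape)
```
        </preformat>
        <p>In scikit-learn, the learning_rate schedule is documented as being used only with the sgd solver, so with adam the adaptive setting is accepted but has no effect.</p>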
      </sec>
      <sec id="sec-4-3">
        <title>4.3. XGBoost Classifier</title>
        <p>
          XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible,
and portable. It implements machine learning algorithms under the Gradient Boosting framework.
XGBoost provides parallel tree boosting to solve many data science problems in a fast and accurate way.
The system uses an XGBoost Classifier that takes the output of the TF-IDF (Term Frequency-Inverse
Document Frequency) model as input. The preprocessed text is fed to the TF-IDF model to find the term frequency
and inverse document frequency. The TF-IDF algorithm [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] works on the frequency of the occurrence of
the word in the document. The importance of a word is determined by the number of times a word
appears in a document and is inversely proportional to the number of times it appears in the entire
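document set. Term Frequency is calculated by the formula
          <disp-formula><tex-math>tf_{i,j} \equiv \frac{n_{i,j}}{\sum_{k} n_{k,j}}</tex-math></disp-formula>
where <inline-formula><tex-math>n_{i,j}</tex-math></inline-formula> denotes the occurrence of <inline-formula><tex-math>t_i</tex-math></inline-formula> in document <inline-formula><tex-math>d_j</tex-math></inline-formula> and
<inline-formula><tex-math>\sum_{k} n_{k,j}</tex-math></inline-formula> denotes the sum of all entries in document <inline-formula><tex-math>d_j</tex-math></inline-formula>. Inverse Document Frequency is calculated by the formula
          <disp-formula><tex-math>idf_{i} \equiv \log_{2} \frac{|D|}{|\{ j : t_i \in d_j \}| + 1}</tex-math></disp-formula>
where <inline-formula><tex-math>|D|</tex-math></inline-formula> is the number of documents in the document set. TF-IDF of the word <inline-formula><tex-math>t_i</tex-math></inline-formula> is calculated by the formula
          <disp-formula><tex-math>tf\text{-}idf_{i,j} = tf_{i,j} \times idf_{i}</tex-math></disp-formula>
        </p>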
        <p>The XGBoost Classifier model is configured with the hyperparameters learning_rate, max_depth,
n_estimators, use_label_encoder, and eval_metric set to 0.7, 10, 80, False, and rmse, respectively.</p>
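        <p>The TF-IDF weighting used in this section can be worked through on a toy corpus as below. The sketch assumes the reading in which the inverse document frequency uses a base-2 logarithm and adds 1 to the document count in the denominator; the corpus and the helper names tf, idf, and tf_idf are invented for illustration.</p>
        <preformat>
```python
import math


def tf(term, doc):
    """Occurrences of term in doc divided by the number of tokens in doc."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """log2 of (number of documents / (documents containing term + 1))."""
    containing = sum(1 for doc in docs if term in doc)
    return math.log2(len(docs) / (containing + 1))


def tf_idf(term, doc, docs):
    """Product of the term frequency and the inverse document frequency."""
    return tf(term, doc) * idf(term, docs)


# Toy corpus of four tokenised documents (invented for illustration).
docs = [
    ["cat", "sat", "mat"],
    ["cat", "ate", "fish"],
    ["fish", "swim", "fast"],
    ["fast", "cars", "race"],
]
print(tf_idf("sat", docs[0], docs))
```
        </preformat>
        <p>Here the word sat occurs once among three tokens and in one of four documents, so its weight is (1/3) times log2(4/2), that is, one third.</p>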
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Ensemble CNN Classifier</title>
        <p>
          The CNN model is one of the baseline models in Natural Language Processing and can be used to classify
sentences and text. It processes data sequences, which enables it to evaluate the perspective of a
given sentence and classify it based on predefined labels [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. The ensemble CNN model is used for
Task 1 of EXIST 2024. After sequence padding, the data is fed to the LSTM
and CNN models, with the hyperparameters optimizer and loss set to Adam and binary_crossentropy,
respectively. The LSTM model is built from one Embedding layer, one LSTM layer, and two dense
layers. The CNN model is built from one Embedding layer, a Conv1D layer, a GlobalMaxPooling1D
layer, and two dense layers with the activation functions relu and sigmoid, respectively. The ensemble model
is trained with epochs=10 and batch_size=32. When evaluated on the evaluation dataset of Task 1, the
ensemble CNN achieved an accuracy and a Macro F1 Score of 0.56.
        </p>
        <p>The performance metrics obtained by the different models on the evaluation
dataset are shown in Table 4 and Table 5.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussions</title>
      <p>The metrics considered for the evaluation of Task 1 were ICM-Hard, ICM-Hard Norm, and F1_YES. The
metrics considered for the evaluation of Task 2 were ICM-Hard, ICM-Hard Norm, and Macro F1. The values of
these performance metrics for the different models submitted are shown in Table 6 and Table 7.</p>
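      <p>As an illustration of the two F1-based metrics, the sketch below computes the F1 of a single class (taken here to correspond to F1_YES for the positive class, which is an assumption about the official scorer) and the macro F1 on a toy set of labels. ICM-Hard is not covered, and the gold and predicted labels are invented.</p>
      <preformat>
```python
def f1_for_class(gold, pred, label):
    """F1 of one class from true positives, false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores over the gold classes."""
    classes = sorted(set(gold))
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)


# Toy gold and predicted labels (invented for illustration).
gold = ["YES", "YES", "NO", "NO", "YES", "NO"]
pred = ["YES", "NO", "NO", "YES", "YES", "NO"]
print(f1_for_class(gold, pred, "YES"))
print(macro_f1(gold, pred))
```
      </preformat>
      <p>In this toy case each class has two true positives, one false positive, and one false negative, so both the per-class F1 and the macro F1 equal two thirds.</p>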
      <p>When tested on the test dataset, the MLP classifier with language agnostic embeddings
achieved an ICM-Hard value of 0.3220, an ICM-Hard Norm value of 0.6623, and an F1_YES value of 0.7044
for Task 1. When the same model was applied to Task 2, it achieved a value of -2.0626 for ICM-Hard, 0.2115 for
ICM-Hard Norm, and 0.1200 for Macro F1. The MLP classifier with Language
Agnostic Embeddings outperformed the other models for Task 1.</p>
      <p>When the XGBoost Classifier model was tested on the test dataset, it achieved an
ICM-Hard value of 0.2905, an ICM-Hard Norm value of 0.6460, and an F1_YES value of 0.6946 for Task 1. For
Task 2, the same model achieved an ICM-Hard value of -0.8873, an ICM-Hard Norm value of 0.2115, and a
Macro F1 value of 0.3148. The XGBoost Classifier outperformed the other models for Task 2.</p>
      <p>When Task 1 was implemented using the ensemble CNN model, it achieved an ICM-Hard value of -0.3410,
an ICM-Hard Norm value of 0.3286, and an F1_YES value of 0.4922.</p>
      <p>The MLP classifier with language agnostic embeddings achieved an F1_YES score of 0.7044, on
which Task 1 was evaluated, and we were ranked 48th on the leaderboard. For Task 2, the XGBoost Classifier achieved a Macro F1
score of 0.32, with which we were ranked 37th on the leaderboard.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Sexism detection has become an active research area, as it is linked with different applications such as
sentiment analysis, opinion mining, and offensive and hate speech detection. With this in mind, CLEF
2024 introduced the sexism detection task EXIST 2024. As required by the shared
task, the proposed system implemented the MLP classifier with Language Agnostic
Embeddings, the XGBoost Classifier, and the ensemble CNN classification model for Task 1, and the MLP classifier
with Language Agnostic Embeddings and the XGBoost Classifier for Task 2. The MLP classifier
with Language Agnostic Embeddings performed best for Task 1 compared to the other models, with an
F1 score of 0.70. For Task 2, the XGBoost Classifier performed best, with an F1 score of 0.32. Hybrid
approaches that combine different deep learning models can also facilitate efficient
detection of sexism in text. Often, sexism is not in the text itself but can be
detected from intonation or facial expression, which makes multimodal sexism detection
a promising research area as well.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Briandana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Doktoralina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. N. W.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <article-title>Da'wah communication and social media: The interpretation of millennials in southeast asia</article-title>
          ,
          <source>International Journal of Economics and Business Administration</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>216</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Shimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mahibha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Thenmozhi</surname>
          </string-name>
          ,
          <article-title>Sexism identification in social media using deep learning models</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de-Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Overview of EXIST 2023 - Learning with disagreement for sexism identification and characterization</article-title>
          ,
          <source>in: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>316</fpage>
          -
          <lpage>342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. L.</given-names>
            <surname>Shalin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <article-title>Defining and detecting toxicity on social media: context and knowledge are key</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>490</volume>
          (
          <year>2022</year>
          )
          <fpage>312</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Felmlee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Inara Rodis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Sexist slurs: Reinforcing feminine stereotypes online</article-title>
          ,
          <source>Sex Roles</source>
          <volume>83</volume>
          (
          <year>2020</year>
          )
          <fpage>16</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mahibha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Swaathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jeevitha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Martina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Thenmozhi</surname>
          </string-name>
          ,
          <article-title>Brainstormers_msec at SemEval-2023 Task 10: Detection of sexism related comments in social media using deep learning</article-title>
          ,
          <source>in: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1114</fpage>
          -
          <lpage>1120</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de-Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maeso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <article-title>Overview of EXIST 2024 - Learning with Disagreement for Sexism Identification and Characterization in Social Networks and Memes</article-title>
          , in:
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de-Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maeso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <article-title>Overview of EXIST 2024 - Learning with Disagreement for Sexism Identification and Characterization in Social Networks and Memes (Extended Overview)</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Sexism identification in social networks</article-title>
          ,
          <source>Working Notes of CLEF</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Ta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B. S.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Najjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Gelbukh</surname>
          </string-name>
          ,
          <article-title>Transfer learning from multilingual deberta for sexism identification</article-title>
          , in:
          <source>IberLEF@SEPLN</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.-E.</given-names>
            <surname>Ha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-C.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Korean text classification using randomforest and xgboost focusing on seoul metropolitan civil complaint data</article-title>
          ,
          <source>The Journal of Bigdata</source>
          <volume>2</volume>
          (
          <year>2017</year>
          )
          <fpage>95</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <article-title>The text classification of theft crime based on tf-idf and xgboost model</article-title>
          ,
          <source>in: 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1241</fpage>
          -
          <lpage>1246</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ICAICA50127.2020.9182555</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Chinese text classification model based on deep learning</article-title>
          ,
          <source>Future Internet</source>
          <volume>10</volume>
          (
          <year>2018</year>
          )
          <fpage>113</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sujatha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nimala</surname>
          </string-name>
          ,
          <article-title>Classification of conversational sentences using an ensemble pre-trained language model with the fine-tuned parameter</article-title>
          ,
          <source>Computers, Materials &amp; Continua</source>
          <volume>78</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>