<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title></journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>in Mexican Memes*</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ramón Zatarain Cabada</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María Lucía Barrón Estrada</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Víctor Manuel Bátiz Beltrán</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aldair González Robles</string-name>
          <email>aldair.gr@culiacan.tecnm.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Néstor Leyva López</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tecnológico Nacional de México: Instituto Tecnológico de Culiacán, Juan de Dios Bátiz 310 Pte</institution>
          ,
          <addr-line>Guadalupe, 80220 Culiacán Rosales, Sinaloa</addr-line>
          ,
          <country country="MX">México</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>This article presents our work on detecting inappropriate content, hate speech, or neither in Mexican memes for the DIMEMEX competition, part of IberLEF 2025. Memes have become a prevalent medium for conveying ideas or messages across a wide range of social networks used worldwide, and the automatic detection of inappropriate content and hate speech has become a subject of significant interest for the scientific community. We propose an approach that uses paraphrasing to augment the data and Transformer-based models to classify the messages within memes. Our final proposal is a BETO-based model, which obtained an F1-score of 0.52 and ranked fourth in the final phase of task 1. Although the results are encouraging, the task is quite complex: all of the evaluated metrics fell below 0.60.</p>
      </abstract>
      <kwd-group>
        <kwd>hate speech</kwd>
        <kwd>inappropriate content</kwd>
        <kwd>data augmentation</kwd>
        <kwd>transformers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        This paper describes the approach used and the results obtained by the ITC team in the
DIMEMEX 2025 competition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which was organized as part of IberLEF 2025 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The competition
comprised three tasks with the objective of identifying the presence of hate speech, inappropriate
content, or neither of them in Mexican memes.
      </p>
      <p>The significance of social networks in modern life is indisputable; they have
evolved into an immediate and universal medium of communication that carries
information as text messages, voice, images, or combinations of these.
This research focuses on memes, which are combinations of images and text that
seek to express an idea or message. Despite their seemingly innocuous nature, memes can
contain hate speech or inappropriate messages.</p>
      <p>This paper presents the approach employed by the ITC team to address Task 1, which
consists of classifying the message contained in Mexican memes into one of the following
categories: inappropriate content, hate speech, or neither of them. The following sections
describe the methodology and present the results.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Recognizing hate speech in memes requires understanding both the context in
which they are used and their visual content. The work presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] addresses script-language identification and hate speech detection
using machine learning (ML) and deep learning (DL), exploring approaches
such as Support Vector Machines (SVM) and Random Forest. Different labeled datasets from the
CHiPSAL 2025 challenge [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] were used for language classification and hate speech recognition in South Asian
languages. The hybrid model combining a Convolutional Neural Network (CNN)
with a BiLSTM achieved the best result (F1-score of 0.9941) in script identification, while
MuRIL-BERT obtained an F1-score of 0.6832 in hate speech detection. The authors conclude that
these approaches outperform traditional ones, and they identify limitations
such as data imbalance and the need for more refined models for this specific task. Similarly, the
work described in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] addresses sentiment analysis of memes on social networks, using
textual and visual information. A bimodal model was proposed to take full advantage
of meme data to improve classification. The approach extracts visual features with the
VGG-19 model and analyzes text with a Gated Recurrent Unit (GRU) network; both were
integrated into a CNN model to classify sentiment, and various text and image models were tested to
evaluate classification performance. The tests used the Memotion dataset, which is
composed of 6,992 memes labeled as positive, negative, or neutral. The
bimodal model (GRU + VGG-19 with CNN) achieved an accuracy of 60% and an F1-score of 0.3904.
By comparison, the SVM analysis reached an accuracy of up to 88%, and the image-only analysis
reached 58.56%. The paper concludes that DL models improve the interpretation of
sentiment in memes and reports the same limitation as the previous work:
unbalanced data make it difficult to classify emotions in memes. Similarly, the research in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
addressed the classification of hateful memes using ML techniques. The authors
proposed two approaches: image-to-text conversion and text-to-image-to-vector conversion. The
dataset of the Facebook Hateful Memes Challenge [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] competition was used in
combination with a new dataset of memes randomly extracted from the internet. Models such as
Long Short-Term Memory (LSTM), SBERT, Xception, and CNN were used, focusing on multimodal
strategies to analyze the relationship between image and text, with image segmentation
methods and advanced models for feature extraction and classification. The tests showed
that what the authors call “generic models”, without specific knowledge about hate speech, can learn its
definition. They conclude that multimodal models improve accuracy in detecting offensive
content in memes, but unbalanced data and subjectivity in the definition of “hate speech” remain
important open challenges.
      </p>
      <p>
        Other works address the classification of specific types of hate speech (racism, sexism, classism, etc.),
as is the case of the work in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In this
paper, the authors proposed a system for detecting racism on social networks using ML and DL
techniques. An LSTM-based model was developed to analyze and classify racist
content, with the objective of reducing detection time and improving accuracy in the identification
of discriminatory speech. BERT and a lemmatizer were used to extract features, and a web
application built with Flask lets users enter content, which the system
analyzes and classifies as racist or non-racist. The proposed LSTM model
was more accurate than the other networks tested (such as BERT, GNN, and RNN), and the authors identify data
imbalance in the dataset as one of the major challenges for future work. On a similar topic, Nguyen
et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] propose the detection of sentiment and hate speech on social networks using multimodal
ML models that integrate text and image to improve classification, focusing on posts related to
racism and discrimination based on sexual orientation. Approximately 56 million tweets and
slightly more than 3 million posts on Meta platforms were collected. To process this data, models such
as BERT for text and VGG-16 for images were used, along with multimodal models such as
CLIP and VisualBERT. They report that CLIP obtained the best accuracy, reaching up to 96% in
hate speech detection, while VisualBERT excelled in sentiment analysis. The authors emphasize that
multimodal models outperform unimodal models, noting that combining text
and image provides greater context and accuracy. Like the other authors, they report that data
imbalance is a limiting factor in improving detection. Focusing only on misogyny,
the authors of [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] propose a multimodal approach to detect misogynistic memes,
using XLM-RoBERTa to extract textual features and Vision Transformer for visual features. The
textual and visual features were concatenated to build a multimodal
representation, which was tested with ML algorithms such as KNN and SVM and DL algorithms
such as LSTM and GRU. They report that the GRU model achieves an
F1-score of 0.88, indicating that it generalizes best in language comprehension. They conclude that
multimodal models outperform unimodal models, especially in the classification of memes with
discriminatory content, and they again highlight the problem of data imbalance.
      </p>
      <p>
        In a similar vein, but using Large Language Models (LLMs) to detect hate speech, we find
works such as the one in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which evaluates the performance of the LLM GPT-2 for
detecting hate speech. The authors explored two approaches: fine-tuning the model
on a specific hate speech dataset and applying few-shot learning (FSL) techniques,
seeking to leverage the language capabilities of GPT-2. Fine-tuned GPT-2
obtained the best performance, with an F1-score of 0.6164 on test data, although the gap to the
validation F1-score of 0.7364 suggests overfitting. Within
the FSL approach, the zero-shot setting was the most effective, achieving an F1-score of 0.58.
They conclude that providing more labeled data to the model is more beneficial than providing only
instructions and examples for GPT-2 to learn hate speech features. A work with a different
approach to LLMs is presented in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which describes a framework for evaluating hate speech
detectors against LLM-generated content. The study seeks to understand the effectiveness of
existing models on such content and to reveal the potential of LLM-driven hate
campaigns. The authors built a dataset of 7,838 hate speech samples generated by
6 different LLMs for 34 identity groups and evaluated 8 hate
speech detectors on it. They found that the detectors showed a
decrease in performance when confronted with content generated by newer versions of LLMs.
They also show that LLMs have significant potential to drive hate campaigns, representing a
new threat and underlining the need for more robust and adaptive detectors. Finally, the work
in [13] proposes the detection of offensive memes for online content
moderation, especially in a culturally diverse context. The authors propose a pipeline that
integrates multimodal Large Language Models, also known as vision-language models (VLMs), and fine-tuning techniques with a dataset
annotated by GPT-4V, seeking to handle low-resource languages and local social biases. The
dataset consisted of 112,000 memes, and the pipeline includes Optical Character Recognition (OCR) to
extract text, translation for low-resource languages, and a VLM with 7 billion parameters for final
classification. The proposed solution achieved an accuracy of 80.62%, demonstrating the
effectiveness of VLMs fine-tuned on the created dataset. The authors conclude that their pipeline
can significantly help human moderators, especially in specific cultural contexts.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Task description and dataset</title>
      <sec id="sec-3-1">
        <p>The competition, explained in more detail in [<xref ref-type="bibr" rid="ref1">1</xref>], was divided into three subtasks:</p>
        <list list-type="bullet">
          <list-item><p>Subtask 1: detection of hate speech, inappropriate content, and neither of them.</p></list-item>
          <list-item><p>Subtask 2: finer-grained detection of hate speech.</p></list-item>
          <list-item><p>Subtask 3: same categories as subtask 1, but participants are restricted to using
Large Language Models (LLMs) exclusively to detect the specified categories.</p></list-item>
        </list>
        <p>This work focuses on subtask 1, a 3-way classification in which each meme
must be assigned to exactly one of the following classes: hate speech, inappropriate
content, or neither of them. To evaluate the proposed solutions, the
competition organizers established that the macro-averages of precision, recall, and F1-score would
be reported, with the macro-averaged F1-score as the primary evaluation metric for
each subtask. The challenge site on the Codalab platform [14] was used to submit
proposals and evaluate them.</p>
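        <p>As an illustration of the evaluation metrics, the following minimal sketch computes macro-averaged precision, recall, and F1-score for a 3-way classification. It assumes the common convention in which macro F1 is the unweighted mean of the per-class F1 values; the function name, the toy labels, and the predictions are hypothetical and are not taken from the competition scorer.</p>

```python
from collections import Counter

# Class names follow the paper; everything else is illustrative.
LABELS = ("hate speech", "inappropriate content", "neither")


def macro_scores(y_true, y_pred, labels=LABELS):
    """Return (macro precision, macro recall, macro F1) over `labels`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy gold labels and predictions (hypothetical, not competition data).
gold = ["hate speech", "hate speech", "inappropriate content", "neither"]
pred = ["hate speech", "neither", "inappropriate content", "neither"]
macro_p, macro_r, macro_f1 = macro_scores(gold, pred)
print(f"P={macro_p:.4f} R={macro_r:.4f} F1={macro_f1:.4f}")
```

        <p>Because every class contributes equally to the average, a weak score on a minority class lowers the macro value as much as a weak score on the majority class.</p>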
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset description</title>
        <p>The competition organizers provided a training dataset composed of 2,263 records. Each
record has three fields: “MEME-ID”, the unique
identifier or file name of the image; “text”, the text present in the meme; and
“description”, a textual description of the meme's visual content. In addition, a link to download
the associated images was provided. Table 1 presents an example of this dataset.</p>
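        <p>Table 1. Example record from the training dataset (“description” field, in the original Spanish): “La imagen es un meme que presenta
un primer plano de un hombre con piel oscura y cabello corto. Tiene una
expresión facial que combina una sonrisa contenta con un guiño, lo que
sugiere un tono ligero o cómico. El fondo es de un color marrón suave,
posiblemente una pared o cortina. En la parte superior de la imagen, hay un
texto en letras mayúsculas que dice "NO ESTÁ MAL". El estilo del meme
es característico de aquellos que utilizan expresiones faciales para
transmitir humor o ironía.” (English: The image is a meme showing a close-up of a man with
dark skin and short hair. His facial expression combines a contented smile with a wink, which
suggests a light or comic tone. The background is a soft brown, possibly a wall or curtain. At the
top of the image, text in capital letters reads "NO ESTÁ MAL" ("NOT BAD"). The style of the meme
is typical of those that use facial expressions to convey humor or irony.)</p>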
        <p>In addition, a file containing the labels for each record was provided. The labels
are one-hot encoded, indicating the presence or absence of each class. The positions follow
a fixed order from left to right:
hate speech, inappropriate content, and neither. For example, a record with the
label “1, 0, 0” is classified as hate speech.</p>
      </sec>
      <sec>
        <title>4.2. Data analysis</title>
        <p>The dataset was analyzed in the early stages to verify the consistency between the
data provided and the original images. The content was reviewed for external elements (such
as links or watermarks) that could affect the integrity of the set.</p>
        <p>For this purpose, a random sample was taken, and the extracted text was compared with the
text present in each image. The results confirmed that the text in the dataset corresponded to that
of the image, excluding irrelevant elements. In addition, the recorded description accurately
reflected the meaning of the meme (see Figure 1).</p>
      </sec>
      <sec>
        <title>4.3. Implemented workflow</title>
        <p>Figure 2 shows the workflow, from the preprocessing of the original dataset to the generation
of the result files uploaded to the challenge platform.</p>
        <sec id="sec-4-1-1">
          <p>The following sections detail this workflow.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.3.1. Preprocessing and features merging</title>
        <p>To simplify the data structure and avoid keeping features and labels in separate structures,
the training set was consolidated, starting with the merging of its features into a single column. The
corpus in the train_data file was first converted from JSON format to a dataframe using Pandas. Before
the features were merged, “MEME-ID” was removed, since it provides no relevant information for training. The
merged text consists of the “description”, followed by the phrase “there is a text in the image that contains the
following”, and finally the content of “text”. The result was a corpus with a single column
of unlabeled data.</p>
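        <p>The merging step can be sketched as follows. This is a minimal illustration in plain Python rather than Pandas; the field names and the joining phrase come from the text above, while the single-space separator, the file name in “MEME-ID”, and the shortened description are assumptions made only for this example.</p>

```python
import json

# Joining phrase quoted in the paper; the space separators are an assumption.
JOIN_PHRASE = "there is a text in the image that contains the following"


def merge_features(record):
    """Build one training string: description + joining phrase + meme text.

    "MEME-ID" is deliberately ignored, since it carries no training signal.
    """
    return f'{record["description"]} {JOIN_PHRASE} {record["text"]}'


# Hypothetical one-record stand-in for the train_data JSON file.
raw_json = (
    '[{"MEME-ID": "meme_0001.jpg", '
    '"text": "NO ESTÁ MAL", '
    '"description": "Un hombre sonríe y guiña."}]'
)
records = json.loads(raw_json)
corpus = [merge_features(record) for record in records]
print(corpus[0])
```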
      </sec>
      <sec id="sec-4-3">
        <title>4.3.2. Dataset concatenation and labeling</title>
        <p>For labeling, the training label file for the 3 subtasks was processed, and its content was transformed into a
single column called “labels”. The one-hot representation was
reduced to a single numeric value: 0 for hate speech, 1 for inappropriate content, and 2 for
neither.</p>
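        <p>A minimal sketch of this conversion is shown below. It assumes each label arrives as a sequence of three integers in the order hate speech, inappropriate content, neither; the function name and the validation step are illustrative additions.</p>

```python
# Class order as described in the paper (left to right in the one-hot label).
CLASS_NAMES = ("hate speech", "inappropriate content", "neither")


def one_hot_to_class(one_hot):
    """Map a one-hot label such as [1, 0, 0] to its class index (here 0)."""
    values = list(one_hot)
    # Exactly one position must be 1 and all others 0.
    if len(values) != len(CLASS_NAMES) or sorted(values) != [0, 0, 1]:
        raise ValueError(f"not a valid one-hot label: {values}")
    return values.index(1)


labels = [one_hot_to_class(row) for row in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
print(labels)
print([CLASS_NAMES[index] for index in labels])
```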
        <p>This column was added to the unlabeled corpus, preserving the correspondence
between records and labels, which produced the labeled corpus. Finally, it was divided into two sets:
training (80% of the total, 1,810 records) and test (20%, 453 records).</p>
      </sec>
      <sec>
        <title>4.3.3. Class splitting</title>
        <p>The original corpus was separated into three subsets according to the labels (hate speech,
inappropriate content, and neither), yielding three independent corpora, each containing
only records from a single category.</p>
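        <p>The 80/20 division and the separation by class can be sketched as follows. The shuffling, the fixed seed, and the synthetic records are assumptions for illustration; the text does not state how the split was randomized or how many records each class contained.</p>

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split off `test_fraction` as test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_class(records):
    """Group (text, label) pairs into one list per class index 0, 1, 2."""
    groups = {0: [], 1: [], 2: []}
    for text, label in records:
        groups[label].append((text, label))
    return groups


# Synthetic stand-in with the same size as the training dataset (2,263).
records = [(f"meme {i}", i % 3) for i in range(2263)]
train, test = train_test_split(records)
by_class = split_by_class(records)
print(len(train), len(test))
print({label: len(items) for label, items in by_class.items()})
```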
      </sec>
      <sec id="sec-4-4">
        <title>4.3.4. Corpus augmentation and class balancing</title>
        <p>We used generative AI to expand and balance the corpora so that each category had the
same number of records. To this end, key parameters were set, such as the amount of data per corpus,
the number of beams, penalties, and temperature. The input and
output paths, the Humarin paraphrase model [15], and the language (Spanish) were also defined.
Although several hyperparameter configurations were tested, the best results were obtained
with those proposed by the creators of the Humarin model [15]. Finally, the generated corpora
were stored in an array, and a single corpus with balanced classes was
built from them.</p>
        <p>The process was applied to each corpus individually. First, the number of elements was counted,
and the difference from the desired total was calculated. Then, the minimum number of
paraphrases to generate for each item was determined and recorded in the “times_repeat”
feature.</p>
        <p>Subsequently, the remainder needed to reach the desired total was covered by
randomly selecting some elements and increasing their “times_repeat” value. This ensured a
proper balance between classes. Finally, an array was created that combined the original corpus
with the AI-generated data.</p>
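        <p>One way to compute the “times_repeat” values is sketched below: the deficit is divided evenly among the items, and the remainder is assigned to randomly chosen items. The function name, the seed, and the class size of 400 are hypothetical; only the target of 1,500 records per class comes from the paper.</p>

```python
import random


def plan_repetitions(n_items, target_total, seed=0):
    """Return one paraphrase count per item so the class reaches target_total."""
    if n_items == 0 or n_items > target_total:
        raise ValueError("need at least one item and no more than the target")
    deficit = target_total - n_items
    # Minimum count shared by every item, plus a remainder still to place.
    base, remainder = divmod(deficit, n_items)
    times_repeat = [base] * n_items
    # Randomly chosen items each receive one extra paraphrase.
    for index in random.Random(seed).sample(range(n_items), remainder):
        times_repeat[index] += 1
    return times_repeat


# Hypothetical class with 400 original records, balanced up to 1,500.
plan = plan_repetitions(n_items=400, target_total=1500)
print(sum(plan), min(plan), max(plan))
```

        <p>In this example every item receives at least two paraphrases and 300 randomly chosen items receive a third, so the class grows by exactly 1,100 records.</p>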
        <p>Each element of the corpus was fed into the Transformer-based text generation model
Humarin [15]. This model allows us to specify the number of similar texts to generate; it
paraphrases the input text and returns an array with the desired number of reformulated texts.
Each new item was labeled according to its original text and stored as a new row in the
AI-generated corpus.</p>
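        <p>The augmentation loop can be sketched as follows. To keep the example self-contained, the paraphrase model is replaced by a stand-in function that only appends a version suffix; in the described workflow this role is played by the Humarin model [15]. All function names and the record layout are hypothetical.</p>

```python
def augment_class(texts, times_repeat, label, paraphrase_fn):
    """Return the original records plus paraphrases, all tagged with `label`.

    `paraphrase_fn(text, n)` must return a list of n reformulated texts.
    """
    originals = [{"text": t, "label": label, "source": t} for t in texts]
    generated = []
    for text, count in zip(texts, times_repeat):
        if count == 0:
            continue
        for new_text in paraphrase_fn(text, count):
            # Each paraphrase inherits the label of its source text.
            generated.append({"text": new_text, "label": label, "source": text})
    return originals + generated


def fake_paraphraser(text, n):
    """Stand-in for the real paraphrase model: returns n marked copies."""
    return [f"{text} (v{i + 1})" for i in range(n)]


augmented = augment_class(["a", "b"], [2, 1], label=0, paraphrase_fn=fake_paraphraser)
for row in augmented:
    print(row)
```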
        <p>All items in the base corpus were added to the new corpus, which was stored in an array
together with the sets generated for each label. These steps were repeated for each
corpus until each of the three categories reached 1,500 records, resulting in a
final corpus of 4,500 records.</p>
      </sec>
      <sec>
        <title>4.3.5. Data cleaning</title>
        <p>This process prepares the textual data of the augmented corpus for training. Cleaning and
normalization were applied to the “text” column: stopwords were removed, and the
words were lemmatized to reduce them to their base form. Finally, punctuation marks were removed, and the text
was tokenized. The result was stored in a new corpus with the processed data.</p>
      </sec>
      <sec>
        <title>4.3.6. Classification models training</title>
        <p>Transformer-based classification models were used, including BERT, BETO, and RoBERTa. The
same hyperparameters were used for all of them (epochs = 5, train batch size = 8, optimizer =
“AdamW”, and learning rate = 0.0001). The corpus was divided into 80% for training and
20% for validation. Each training run stored the model in a folder for later use and
generated accuracy, F1-score, recall, and precision metrics, which were stored in a “txt” file.
Training was repeated 10 times with randomized runs to capture variability and improve the
results. The models were trained on two versions of the augmented dataset, one
without cleaning and one with cleaning, using the same process for both.</p>
      </sec>
      <sec>
        <title>4.3.7. Best model selection</title>
        <p>For each Transformer-based model (BERT, BETO, and RoBERTa), accuracy, F1-score, recall, and
precision were recorded. The models were then evaluated on the test corpus, and the
results were compared with those obtained in training. The average of the four metrics was calculated for each
model, and the one with the best value was selected. The remaining models were discarded.</p>
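        <p>The selection rule described above (average the four metrics and keep the model with the highest average) can be sketched as follows. The model names and metric values are placeholders invented for this example; they are not results from the paper.</p>

```python
def select_best(results):
    """Average each model's metrics and return (best model name, averages)."""
    averages = {
        name: sum(metrics.values()) / len(metrics)
        for name, metrics in results.items()
    }
    best = max(averages, key=averages.get)
    return best, averages


# Placeholder values for three hypothetical candidates (not reported results).
results = {
    "model_a": {"accuracy": 0.50, "f1": 0.48, "recall": 0.49, "precision": 0.51},
    "model_b": {"accuracy": 0.55, "f1": 0.52, "recall": 0.53, "precision": 0.54},
    "model_c": {"accuracy": 0.51, "f1": 0.50, "recall": 0.52, "precision": 0.49},
}
best, averages = select_best(results)
print(best, round(averages[best], 3))
```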
        <p>After completing the process with both datasets (corpus with and without cleaning), the
BETO-based model trained without cleaning achieved the best results.</p>
      </sec>
      <sec>
        <title>4.3.8. Prediction of validation and test dataset labels</title>
        <p>Finally, the validation dataset (development phase) and the test dataset (final phase) were
preprocessed for labeling. Each item of the provided datasets was processed using the selected
BETO-based model, and the predicted labels were stored in a “.csv” file for subsequent
submission to the challenge platform.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>The results obtained in the two phases of the competition for subtask 1 are shown below.
Table 2 presents the results of the development phase, which used the
unlabeled dataset provided. The team's submissions are collectively referred to as
ITC. In this phase, the proposed solution ranked sixth, with an F1-score of 0.48. The
first six positions fall within a narrow range: 0.11 in F1-score (0.48 to 0.59),
0.10 in precision (0.49 to 0.59),
and 0.12 in recall (0.47 to 0.59).</p>
      <p>[Table residue: only the User/Team column survives, listing Ryuan, HoracioJarquin, hugojair,
ITC (Vickbat), dmoctezuma, and csuazob; the metric columns are not recoverable.]</p>
      <p>Table 3 presents the results of the final phase. In this stage, the proposed solution
ranked fourth on the three metrics evaluated. The first four positions
were the only ones that attained values above 0.50, with a mere 0.06 separating the first and fourth
positions. The fact that all teams obtained values below 0.6 in all metrics indicates the
complexity of the task.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This article presents the methodology and results of the ITC team's participation in the DIMEMEX
competition as part of IberLEF 2025. Our efforts concentrated on subtask 1, which
consists of classifying memes in Mexican Spanish into three categories: hate speech,
inappropriate content, and neither of these. The most effective approach was
an incremental augmentation of the elements within each class, using generative AI to
balance the corpus at 1,500 elements per class, followed by a Transformer-based classifier.
Specifically, the BETO-based model gave the best result.</p>
      <p>Our proposal placed fourth in the final
phase of subtask 1. The overall results nevertheless demonstrate the
complexity of the task: although they are encouraging, the classification
techniques must be refined further to improve the performance of the models.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>We want to express our gratitude to SECIHTI and the Tecnológico Nacional de México: Instituto
Tecnológico de Culiacán for supporting our team to participate in the DIMEMEX@IberLEF 2025
challenge.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools during the preparation of this work.</p>
      <p>[13] C. Yuxuan, W. Jiayang, A. C. L. Chuen, B. S. Guanrong, B. S. Jen, S. C. Z. Shen, Detecting
Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language
Models, 2025. doi:10.48550/arXiv.2502.18101.</p>
      <p>[14] DIMEMEX, Challenge Website, 2025. URL: https://codalab.lisn.upsaclay.fr/competitions/22012.</p>
      <p>[15] V. Vorobev, M. Kuznetsov, A paraphrasing model based on ChatGPT paraphrases, 2023. URL:
https://huggingface.co/humarin/chatgpt_paraphraser_on_T5_base.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Jarquín-Vásquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tlelo-Coyotecatl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.I.</given-names>
            <surname>Hernández-Farías</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.J.</given-names>
            <surname>Escalante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Villaseñor-Pineda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montes-y-Gómez</surname>
          </string-name>
          ,
          <article-title>DIMEMEX@IberLEF2025: Detection of Inappropriate Memes from Mexico</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>75</volume>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>González-Barba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <source>Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2025), co-located with the 41st Conference of the Spanish Society for Natural Language Processing (SEPLN 2025)</source>
          , CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Refaj Hossan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sakib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alam Miah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jawad</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Moshiul</given-names>
            <surname>Hoque</surname>
          </string-name>
          ,
          <article-title>CUET_Big_O@NLU of Devanagari Script Languages 2025: Identifying Script Language and Detecting Hate Speech Using Deep Learning and Transformer Model</article-title>
          ,
          <year>2025</year>
          . URL: https://github.com/vikaskumarjha9/hindi_.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sarveswaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thapa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaidya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. K.</given-names>
            <surname>Bal</surname>
          </string-name>
          ,
          <source>Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)</source>
          ,
          <year>2025</year>
          . URL: https://sites.google.com/view/chipsal/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Velmala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajiakodi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Pannerselvam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sivagnanam</surname>
          </string-name>
          ,
          <article-title>Multimodal Sentiment Analysis of Online Memes: Integrating Text and Image Features for Enhanced Classification</article-title>
          .
          <source>Procedia Computer Science</source>
          ,
          <volume>258</volume>
          , (
          <year>2025</year>
          )
          <fpage>355</fpage>
          -
          <lpage>364</lpage>
          . doi: 10.1016/j.procs.2025.04.272.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Badour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>Hateful Memes Classification using Machine Learning</article-title>
          .
          <source>2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings</source>
          ,
          <year>2021</year>
          . doi: 10.1109/SSCI50451.2021.9659896.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Firooz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ringshia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Testuggine</surname>
          </string-name>
          ,
          <article-title>The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes</article-title>
          ,
          <year>2020</year>
          . doi: 10.48550/arXiv.2005.04790.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Sukanya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Aniketh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Abhiman</given-names>
            <surname>Sathwik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Sridhar</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. Hemanth</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Racism detection using deep learning techniques</article-title>
          .
          <source>E3S Web of Conferences</source>
          ,
          <volume>391</volume>
          ,
          <year>2023</year>
          . doi: 10.1051/e3sconf/202339101052.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T. T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Seelman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S. P.</given-names>
            <surname>Mullaputi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Dennard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Alibilli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Merchant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Criss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hswen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. C.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>Decoding Digital Discourse Through Multimodal Text and Image Machine Learning Models to Classify Sentiment and Detect Hate Speech in Race- and Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, and Asexual Community-Related Posts on Social Media: Quantitative Study</article-title>
          .
          <source>Journal of Medical Internet Research</source>
          ,
          <volume>27</volume>
          (
          <issue>1</issue>
          ) (
          <year>2025</year>
          ). doi: 10.2196/72822.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chauhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>MNLP@DravidianLangTech 2025: Transformer-based Multimodal Framework for Misogyny Meme Detection</article-title>
          ,
          <year>2025</year>
          . URL: https://github.com/stopwords-iso/.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Choudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <article-title>Hate Speech Detection: Leveraging LLM-GPT2 with Fine-Tuning and Multi-Shot Techniques</article-title>
          .
          <source>Procedia Computer Science</source>
          ,
          <volume>258</volume>
          , (
          <year>2025</year>
          )
          <fpage>2817</fpage>
          -
          <lpage>2825</lpage>
          . doi: 10.1016/j.procs.2025.04.542.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>X.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Backes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zannettou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns</article-title>
          ,
          <year>2025</year>
          . doi: 10.48550/arXiv.2501.16750.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>