<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s13278-024-01393-9</article-id>
      <title-group>
        <article-title>A Transformer-based approach to Multimodal Hateful Meme Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sharad Prakash</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Upkar Kumar Kedia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kirti Kumari</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kushagra Prakash</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Trishant Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amity University</institution>
          ,
          <addr-line>Ranchi, Jharkhand</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indian Institute of Information Technology</institution>
          ,
          <addr-line>Ranchi, Jharkhand</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <volume>1</volume>
      <fpage>1</fpage>
      <lpage>2</lpage>
      <abstract>
        <p>In today's digital era, social media has become a common platform where people freely share their views, thoughts, ideas, and opinions. This openness has also increased the dissemination of hate speech, offensive content, and derogatory remarks on such platforms. Memes are one form of such hateful content: they can be generated quickly and shared rapidly across online social media. The spread of hateful content has a detrimental impact on people and on society at large. Detecting hateful memes is challenging because of their multimodal nature. This paper presents our transformer-based approach to classifying multimodal memes in four Indian languages, i.e. Hindi, Gujarati, Bangla, and Bodo, for the Hate Speech and Offensive Content Identification (HASOC) 2025 shared task challenge. We participated in all four shared subtasks, and our team, KK_NLP_AI_IIIT_Ranchi, ranked 3rd, 5th, 6th, and 9th in Hindi, Gujarati, Bangla, and Bodo, with macro F1-scores of 0.59597, 0.61409, 0.60595, and 0.57184 respectively. We also discuss and compare the efficacy of the various models and architectures that we employed for the multi-task classification presented in the challenge.</p>
      </abstract>
      <kwd-group>
        <kwd>Multimodal meme classification</kwd>
        <kwd>HASOC 2025</kwd>
        <kwd>Hateful meme</kwd>
        <kwd>BERT</kwd>
        <kwd>CLIP</kwd>
        <kwd>IndicBERT</kwd>
        <kwd>Feature Fusion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Due to the widespread use of social media platforms, multimodal content is growing exponentially,
making human moderation of such information no longer feasible. In recent years, memes,
which consist of an image with embedded text, have become a very popular type of multimodal content on
social networks. Memes have a significant impact on user communication styles, social culture, and
content consumption patterns. Although memes are often humorous in nature, they have also contributed to the
spread of online harassment and abuse. Such hateful memes are propagated on social networks in
multiple languages, including English.</p>
      <p>In South Asian countries, languages such as Hindi, Gujarati, Bangla, and other low-resource languages
are prevalent. The detection of hateful memes in such languages presents unique complexities. This
study aims to address these challenges by developing a model tailored to the linguistic and cultural
contexts of four languages, i.e. Hindi, Gujarati, Bangla, and Bodo, by employing Natural Language
Processing (NLP) techniques.</p>
      <p>
        The Hate Speech and Offensive Content Identification (HASOC) shared task provides a platform
for advancing research in identifying offensive and harmful content online [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The seventh edition,
HASOC 2025, focuses on multimodal memes, which combine an image with embedded text [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to convey
nuanced or implicit messages. This year’s challenge is organized into four main subtasks:
      </p>
      <list list-type="order">
        <list-item>
          <p>Sentiment detection – classifying memes as positive, negative, or neutral.</p>
        </list-item>
        <list-item>
          <p>Sarcasm detection – identifying whether content is sarcastic or straightforward.</p>
        </list-item>
        <list-item>
          <p>Vulgarity detection – distinguishing vulgar from non-vulgar memes.</p>
        </list-item>
        <list-item>
          <p>Abuse detection – detecting abusive versus non-abusive content.</p>
        </list-item>
      </list>
      <p>
        In this paper, we present our solution for HASOC 2025, which leverages CLIP embeddings for multimodal
visual-text [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] features and BERT-based embeddings for textual content [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We employ a simple
concatenation-based fusion strategy to combine these representations for classification across all
subtasks. Our architecture achieved competitive results, ranking 3rd in Hindi, 5th in Gujarati, 6th in
Bangla, and 9th in Bodo on the public test dataset shared with us. In addition, we present
a comparison between multilingual and monolingual feature extraction, providing insights into their
relative effectiveness for meme analysis.
      </p>
      <p>An example meme from each of the four language datasets, i.e., Hindi, Gujarati, Bangla, and
Bodo, is shown in Figures 1a, 1b, 1c, and 1d respectively. A significant number of the images
contained text in both English and the Indic language, which further complicated the training procedure.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>Since meme classification is a multimodal task involving both visual and textual information, it has
attracted considerable attention in recent years. Previous research has explored a variety of strategies
to address the challenges of capturing the complex and context-dependent relationships between text
and images, which are crucial for deciphering the intended meaning of memes.</p>
      <p>
        In parallel, the HASOC (Hate Speech and Offensive Content Identification) shared tasks at FIRE (Forum
for Information Retrieval and Evaluation) have established themselves as an important benchmark
series for evaluating methods in hate speech detection across multiple languages. Starting with HASOC
2019 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which covered Indo-European languages, followed by HASOC 2020, which covered languages such
as Tamil, Malayalam, Hindi, English, and German [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the track has expanded to include more languages
and new subtracks. HASOC 2021 introduced Indo-Aryan languages and conversational hate speech
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], while HASOC 2022 further extended coverage of English and Indo-Aryan languages [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The 2023
edition incorporated low-resource languages such as Assamese, Bengali, Bodo, Gujarati, and Sinhala [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
A Bengali meme dataset was created to establish an effective benchmark for classifying abusive Bangla
memes using multimodal models [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Most recently, HASOC 2024 focused on hate speech detection in
English and Bengali [12]. Ghosh et al. [13] explored a novel three-module pipeline, SafeSpeech, for
Hate Speech Classification, Hate Intensity Identification, and Hate Intensity Mitigation using publicly
available datasets in five Indic languages (Hindi, Marathi, Tamil, Telugu, and Bengali). Recently, Ghosh
et al. [14] developed machine learning and deep learning models for detecting hate
speech in four low-resource language datasets (Assamese, Bengali, Bodo, and English), and analyzed
their performance across standard evaluation metrics. Their findings highlighted both the challenges and the
progress in detecting hate speech in low-resource, multilingual settings. Furthermore, Vijay et
al. [15] developed a hate speech detection framework that employs the TF-IDF technique
for feature extraction and a lexicon-based hierarchical approach to address the linguistic and
cultural intricacies of Hinglish (a mix of Hindi and English) and Bangla.
      </p>
      <p>Our work builds on this line of research in the HASOC track at FIRE 2025 on abusive meme identification
[16], [17], extending the scope to multimodal meme classification and advancing hate speech detection
in multilingual and multimodal settings.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset Description</title>
      <p>The HASOC 2025 shared task released training and test datasets for four languages: Hindi, Gujarati,
Bangla, and Bodo (see Table 1). The distribution of training data across the four languages was
highly unbalanced, ranging from just 378 memes for Bodo to 2693 memes for Bangla. Due to this
disparity in dataset size, we adopted language-specific training strategies to better account
for the varying levels of data availability.</p>
      <p>The datasets included an additional column named ’Target’, which was omitted from the
sample submission and which we therefore removed from our training data. The Bodo training dataset also had a
typographical error in the ’Vulgar’ column, which we corrected by coercing it to the ’Non-Vulgar’ class.</p>
      <p>We also observed that for Bodo, the images generally did not add much semantic context to
the actual meaning of the meme, and even worsened the results when used together with the Optical Character
Recognition (OCR) text for training. In contrast, the Bangla results improved significantly when
the images were coupled with the OCR text during training.</p>
      <p>No separate validation dataset was provided, so we used a 90:10 train:validation split of
our training set.</p>
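      <p>The 90:10 split described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used for the submissions: the helper name train_val_split, the fixed seed, and the plain shuffled (non-stratified) split are all hypothetical, and the list of integer IDs merely stands in for the 378 Bodo training memes.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def train_val_split(rows, val_fraction=0.1, seed=42):
    """Shuffle rows reproducibly and hold out val_fraction of them for validation."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A private Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical stand-in for the 378 Bodo training memes (IDs only).
bodo_ids = list(range(378))
train_ids, val_ids = train_val_split(bodo_ids)
print(len(train_ids), len(val_ids))  # 340 38
```
]]></preformat>
      <p>With only 378 examples, the validation set holds 38 memes, so validation scores for Bodo can be expected to fluctuate noticeably from one split to another.</p>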
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>This section gives an overview of the feature extraction techniques used for the text-only models and for the models that combine Contrastive
Language-Image Pre-Training (CLIP) [18] with text, along with the methods used to train these
models.</p>
      <sec id="sec-4-1">
        <title>4.1. Text-only Models</title>
        <p>This section deals with our initial attempts at the classification problem with text-only, low-resource
models that used only the OCR text for feature extraction.</p>
        <sec id="sec-4-1-1">
          <title>4.1.1. Multilingual Backbone (IndicBERT)</title>
          <p>We first experimented with a multilingual backbone by fine-tuning the IndicBERT model, trained on
a large corpus of Indian languages by AI4Bharat. This purely text-based architecture, as in Figure 2,
surpassed the baseline by a large margin.</p>
          <p>For the Bodo, Hindi, and Gujarati datasets, this setup achieved macro F1-scores greater than 0.55.
However, the Bangla dataset, which had more data points than the other three languages combined,
performed poorly, with the macro F1-score dropping well below the baseline of 0.53. We attribute this
to the presence of a large number of images in the Bangla dataset that provided crucial semantic context
to the memes, which was not captured by a text-only architecture.</p>
          <p>This multilingual, text-only architecture achieved test macro F1-scores of 0.56541 for Hindi,
0.56775 for Gujarati, 0.54691 for Bodo, and 0.45601 for Bangla.</p>
          <p>Figure 2: Text only model architecture</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Monolingual Backbones</title>
          <p>To improve the performance of the models and to ensure that the poor results on Bangla were
not due to limitations of the IndicBERT backbone, we also experimented with monolingual BERT-based
models specific to each language [19]. We used bengali-bert for Bangla, hindi-bert-v2 for Hindi,
gujarati-bert for Gujarati, and pretrained-bodo-legal-bert for Bodo.</p>
          <p>These monolingual models did not yield any significant improvement for Bangla, Hindi, or
Gujarati, with each staying close to the results previously achieved by IndicBERT. The Bodo
model, however, worsened, potentially because of the poor training corpus of the backbone model
used.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. CLIP with Text models</title>
        <p>
          To use the image features alongside the text features extracted by the BERT-based
backbone, we incorporated the CLIP Vision model [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] as shown in Figure 3. For simplicity, we
fused the two feature sets by simple concatenation; a later investigation using
cross-attention-based feature fusion only worsened the results.
        </p>
        <p>Because the CLIP model was trained primarily on English, it failed to correlate and extract joint
image-text features for the Indic languages. Cross-attention-based feature fusion was therefore omitted from
the architecture, and only simple concatenation-based feature fusion was used in all the subsequent
models.</p>
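        <p>Concatenation-based fusion followed by one classification head per subtask can be sketched in plain Python as below. This is a schematic illustration, not the trained architecture: the feature dimensions (768 for text, 512 for image), the LinearHead class, the random initial weights, and the constant feature vectors are hypothetical placeholders. The class counts follow the subtask definitions (three sentiment classes, two classes for each of the other subtasks).</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def concat_fusion(text_features, image_features):
    """Fuse two feature vectors by placing them end to end."""
    return list(text_features) + list(image_features)


class LinearHead:
    """A single linear layer mapping fused features to class logits."""

    def __init__(self, in_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(num_classes)]
        self.bias = [0.0] * num_classes

    def __call__(self, features):
        if len(features) != len(self.weights[0]):
            raise ValueError("feature size does not match the head input size")
        return [sum(w * x for w, x in zip(row, features)) + b
                for row, b in zip(self.weights, self.bias)]


# Assumed (hypothetical) embedding sizes for the text and image encoders.
TEXT_DIM, IMAGE_DIM = 768, 512
text_vec = [0.01] * TEXT_DIM     # placeholder for a BERT-style text embedding
image_vec = [0.02] * IMAGE_DIM   # placeholder for a CLIP vision embedding

fused = concat_fusion(text_vec, image_vec)

# One head per subtask, all reading the same fused vector.
num_classes = {"sentiment": 3, "sarcasm": 2, "vulgar": 2, "abuse": 2}
heads = {name: LinearHead(len(fused), n, seed=i)
         for i, (name, n) in enumerate(num_classes.items())}
logits = {name: head(fused) for name, head in heads.items()}

print(len(fused))                                   # 1280
print({name: len(v) for name, v in logits.items()})
```
]]></preformat>
        <p>Concatenation leaves each modality's features untouched and lets the heads learn how to weight them, which avoids relying on cross-modal alignment that the encoders may not provide for Indic text.</p>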
        <p>As with the text-only models, we experimented with both multilingual and monolingual
backbones for text feature extraction. For Bangla, the multilingual IndicBERT-based backbone coupled
with the CLIP vision model improved the test macro F1-score significantly, from 0.45 to 0.58.
The Gujarati model also improved, from 0.56 to 0.59, with this multimodal
model. However, Hindi did not benefit from IndicBERT and CLIP, with results staying close to those of the
text-only models, and Bodo worsened when the CLIP features were used. This poor performance for
Bodo remained consistent across both monolingual and multilingual text backbones, so the image
features were dropped for Bodo in the final submission. One potential reason is the
very low number of data points for Bodo (378), where the majority of the images did not provide any
semantic context to the meme. The final submission for Bodo was made using the IndicBERT-v2 model
trained with the text-only architecture, with a test macro F1-score of 0.57184.</p>
        <p>The monolingual backbones did improve the results on Hindi, Gujarati and Bangla, surpassing the
macro F1 threshold of 0.60 for both Bangla and Gujarati, and achieving a macro F1-score of 0.59597 for
Hindi. The final scores for the submission were achieved using the CLIP and monolingual models for
Bangla, Gujarati and Hindi, with test macro F1-scores of 0.60595, 0.61409, and 0.59597 respectively.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Training</title>
        <p>All models were trained on Kaggle servers using a P100 GPU. During training, we observed that the
training curves were quite erratic, even with the Adam optimizer. To mitigate this and ensure
stable convergence, we used weight decay and a warm-up-based
learning rate scheduler. Early stopping was also used to prevent overfitting and to select
the best-performing models for final prediction.</p>
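        <p>The two stabilization techniques named above, a warm-up learning rate schedule and early stopping, can be sketched as follows. The exact schedule shape, patience, and step counts used in training are not reported here, so the linear warm-up followed by linear decay, the patience of 3, and the validation scores in the example are all hypothetical. Weight decay is applied inside the optimizer and is not shown.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def warmup_linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warm-up to base_lr, then linear decay to zero (assumed shape)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


class EarlyStopping:
    """Stop once the validation score has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_score = None
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, score):
        """Record one epoch; return True when training should stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score, self.best_epoch = score, epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation macro F1-scores, one per epoch.
val_f1 = [0.48, 0.53, 0.57, 0.56, 0.55, 0.56, 0.54]
stopper = EarlyStopping(patience=3)
for epoch, score in enumerate(val_f1):
    if stopper.step(epoch, score):
        break

print(stopper.best_epoch, stopper.best_score, epoch)  # 2 0.57 5
```
]]></preformat>
        <p>In this example, training stops after epoch 5 and the checkpoint from epoch 2 would be kept for prediction, since the three later epochs never exceeded its validation score.</p>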
        <p>Two distinct training strategies were employed based on language-specific performance:</p>
        <list list-type="order">
          <list-item>
            <p>Cumulative Training: In this method, all classification heads were trained simultaneously,
sharing a common feature pool from the backbone. This approach proved most effective for the
Bodo model, which benefited from the shared feature representation. When trained separately,
the Bodo model’s loss and macro F1 scores oscillated and failed to converge.</p>
          </list-item>
          <list-item>
            <p>Sequential Training: Here, each classification head was trained separately while the other
layers were frozen. This sequential process yielded better results for Hindi, Gujarati, and Bangla.</p>
          </list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussion</title>
      <p>This section presents the results and discussion of our multimodal hateful meme classification approach,
analyzing performance of our model across the four shared languages, i.e. Hindi, Gujarati, Bangla, and
Bodo. The macro F1 scores of our model for the four languages compared with the first-ranked models
and our ranking in the participation are mentioned in Table 2. Plots for Loss and macro F1-score are
illustrated in Figure 4 and Figure 5 respectively.</p>
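      <p>For reference, the macro F1-score reported throughout is the unweighted mean of the per-class F1-scores, so minority classes count as much as majority ones. The sketch below computes it for a single subtask; the function name and the toy labels are hypothetical, and how the organizers aggregate scores across subtasks is not covered.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example for the abuse subtask (labels are illustrative only).
y_true = ["abusive", "abusive", "non-abusive",
          "non-abusive", "non-abusive", "non-abusive"]
y_pred = ["abusive", "non-abusive", "non-abusive",
          "non-abusive", "non-abusive", "abusive"]

print(macro_f1(y_true, y_pred))  # 0.625
```
]]></preformat>
      <p>Here the abusive class scores 0.5 and the non-abusive class 0.75, giving a macro average of 0.625 even though four of six predictions are correct.</p>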
      <p>The Gujarati model converged quite well, as can be seen from the macro F1-score and loss plots
(Figures 4 and 5). However, the Bodo, Bangla, and Hindi models required careful scheduling of learning
rates to achieve convergence. Despite these efforts, the Bodo model did not converge properly; even
with learning rates as low as 5e-6, the loss values continued to oscillate during training. The Hindi, Bangla,
and Bodo models all overfit quite quickly, in contrast to Gujarati, which achieved the highest test macro F1
score across all the languages.</p>
      <p>Comparing the training and validation macro F1-scores in Figure 5, the Bodo model
showed a much larger gap between the two than the models for the other
languages. The Bodo model quickly overfit the small training corpus, which explains its poor
result on the validation set.</p>
      <p>We also observed that the "Vulgar" and "Abuse" classification heads converged the fastest, generally
achieving validation macro F1-scores above 0.70 across all languages. The "Sentiment" and "Sarcasm" heads
showed a similar pairing: both converged slowly, and their validation macro
F1-scores stayed close to each other.</p>
      <p>Our models ranked 3rd out of 18 submissions for Hindi, 5th out of 15 submissions for Gujarati, 6th out
of 17 submissions for Bangla, and 9th out of 15 submissions for Bodo.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
      <p>In this work, we presented a comprehensive approach to multi-label offensive meme classification for
four Indic languages. Our findings underscore the critical importance of multimodality, where fusing
visual features from CLIP with monolingual BERT backbones yielded significant performance gains for
Hindi, Gujarati, and Bangla. Conversely, for the low-resource Bodo language, a text-only model proved
superior, highlighting that the optimal architecture is highly dependent on the dataset’s characteristics.
Our models achieved competitive ranks in the HASOC 2025 challenge, validating our methodology.</p>
      <p>Based on the challenges that we observed regarding multimodal feature fusion and low-resource
languages, we propose moving beyond simple feature concatenation by integrating inherently multilingual
vision-language models like mclip [20], which could provide better-aligned representations and enable
more sophisticated fusion mechanisms. This would be especially crucial for improving performance
on low-resource languages like Bodo, where techniques such as few-shot learning and advanced data
augmentation could also be explored. Furthermore, to capture a richer semantic context from memes,
future models could incorporate visual text features beyond simple OCR, leveraging layout-aware
architectures like LayoutLM [21] to understand the stylistic nuances of text within the image.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Acknowledgments</title>
      <p>As participants in the HASOC 2025 Challenge, we fully comply with the competition rules as outlined
by the organizers on the challenge website. Our methods used only the training and test datasets
provided in the official release to report the results of the challenge.</p>
      <p>Declaration on Generative AI</p>
      <p>During the preparation of this work, the authors used Grammarly for grammar and spelling
checks. After using this tool, the authors reviewed and edited the content as needed and take
full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Firooz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-E.</given-names>
            <surname>Mazaré</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Galuba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Musat</surname>
          </string-name>
          , et al.,
          <article-title>The hateful memes challenge: Detecting hate speech in multimodal memes</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>2626</fpage>
          -
          <lpage>2639</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hossain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sharif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Hoque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Preum</surname>
          </string-name>
          ,
          <article-title>Deciphering hate: Identifying hateful memes and their targets</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2403.10829. arXiv:2403.10829.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shiwakoti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Memeclip: Leveraging clip representations for multimodal meme classification</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2409.14703. arXiv:2409.14703.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Koroteev</surname>
          </string-name>
          ,
          <article-title>Bert: a review of applications in natural language processing and understanding</article-title>
          ,
          <source>arXiv preprint arXiv:2103.11943</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mandlia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages</article-title>
          ,
          <source>in: Proceedings of the 11th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          , FIRE '19,
          Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , p.
          <fpage>14</fpage>
          -
          <lpage>17</lpage>
          . URL: https://doi.org/10.1145/3368567.3368584. doi:10.1145/3368567.3368584.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. Kumar</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2020: Hate speech and offensive language identification in tamil, malayalam, hindi, english and german</article-title>
          ,
          <source>in: Proceedings of the 12th annual meeting of the forum for information retrieval evaluation</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in english and indo-aryan languages and conversational hate speech</article-title>
          ,
          <source>in: Proceedings of the 13th annual meeting of the forum for information retrieval evaluation</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>North</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Premasiri</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc subtrack at fire 2022: Hate speech and offensive content identification in english and indo-aryan languages</article-title>
          ,
          <source>in: Proceedings of the 14th annual meeting of the forum for information retrieval evaluation</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>4</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Dmonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc subtracks at fire 2023: Hate speech and offensive content identification in assamese, bengali, bodo, gujarati and sinhala</article-title>
          ,
          <source>in: Proceedings of the 15th annual meeting of the forum for information retrieval evaluation</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>BanglaAbuseMeme: A dataset for Bengali abusive meme classification</article-title>
          , in:
          <string-name>
            <given-names>H.</given-names>
            <surname>Bouamor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bali</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Singapore,
          <year>2023</year>
          , pp.
          <fpage>15498</fpage>
          -
          <lpage>15512</lpage>
          . URL: https://aclanthology.org/2023.emnlp-main.959/. doi:10.18653/v1/2023.emnlp-main.959.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>