<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Zero-Shot and Multitask Learning Synergy for Robust Hate Speech Detection across English and Bangla</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kavya G</string-name>
          <email>kavyamujk@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asha Hegde</string-name>
          <email>hegdekasha@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sonith D</string-name>
          <email>sonithksd@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Subrahmanya</string-name>
          <email>subrahmanyapoojary789@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>H L Shashirekha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Mangalore University</institution>
          ,
          <addr-line>Mangalore, Karnataka</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Hate speech is textual content that disparages or targets someone because of their race, religion, or gender, whereas offensive content includes any material that may cause discomfort or insult but is not necessarily hateful. Detecting such content on social media is crucial to prevent harm, ensure a safe online environment, and uphold community standards. The ever-evolving and diversified nature of online language, the need for context-aware analysis, and the delicate balance between moderation and free speech are the main obstacles to identifying Hate Speech and Offensive Content (HASOC). In this direction, "HASOC - Hate Speech and Offensive Content Detection" - a shared task organized at the Forum for Information Retrieval Evaluation (FIRE) 2024 - invites the research community to address the challenges of detecting HASOC on social media in English and Bangla. The task consists of two subtasks: Task 1 - Binary Classification in English, focused on HASOC identification in Hinglish, and Task 2 - Code-mixed Classification in Bangla. To explore strategies for HASOC detection on social media platforms, in this paper we - team MUCS - propose Zero_CS_KW+LD, a Zero-Shot Learning (ZSL) approach for Task 1, while the challenges of Task 2 are addressed by implementing Multi-Task Learning (MTL) using a Transfer Learning (TL) approach with two transformer models (Bidirectional Encoder Representations from Transformers (BERT) and the distilled version of multilingual BERT (DistilMBERT)). Among the submitted models, the Zero_CS_KW+LD model obtained a macro F1 score of 0.5653, securing 7th rank in Task 1, and the proposed MTL model using DistilMBERT obtained macro F1 scores of 0.6761 and 0.3975, securing 4th and 1st ranks for the Offensive_gold and Target_gold tasks, respectively, in Task 2.</p>
      </abstract>
      <kwd-group>
<kwd>Hate and Offensive content</kwd>
        <kwd>Zero-Shot Learning</kwd>
        <kwd>Cosine Similarity</kwd>
        <kwd>Multi-task Learning</kwd>
        <kwd>Label Description</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        People's interactions and communication have drastically changed as a result of social media platforms
like Facebook and Twitter, which allow users to freely express their opinions about anything. Some
users exploit this unprecedented level of openness and the anonymity afforded by social media
platforms to spread harmful content in the form of hate speech and offensive and abusive language
[
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. HASOC targeting people or groups based on characteristics like race, gender, religion, or
nationality is among the most troubling forms of content. While hate speech refers to statements that
encourage violence or prejudice against the targeted groups [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], offensive content includes hurtful or
disparaging remarks that foster a hostile atmosphere without necessarily being violent. The spread of HASOC
on social media may harm people's mental health, leading to depression or, in severe cases, suicidal
thoughts [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. Hence, to maintain a secure social media environment, it is essential to identify and handle such
content. Detecting HASOC manually on social media platforms presents several challenges due to the
vast volume of user-generated content, necessitating efficient and accurate automated systems.
      </p>
      <p>India is a diverse multilingual country, with each state having its own official language, and most Indians, particularly those actively engaged in social media, are comfortable using at least two languages, including English. As social media platforms do not impose any standards on the language users employ for communication, most user-generated content is code-mixed, blending sentences, words, or sub-words from more than one language and more than one script, of which one language is prominently English [7]. This linguistic complexity adds a layer of challenges for conventional models to accurately identify HASOC in user-generated text. Further, many existing HASOC detection systems are designed primarily for major languages like English and Spanish, leaving gaps in support for less-represented languages like Bangla, Punjabi, Telugu, Tamil, Malayalam, and Urdu. Being under-resourced, these languages lack annotated datasets and computational tools. This highlights the need for resources and robust computational tools to identify HASOC on social media.</p>
      <p>"HASOC 2024"1, a shared task organized at FIRE2 2024, invites researchers to develop models to address the challenges of detecting HASOC in code-mixed Hinglish and Bangla text. The shared task consists of two subtasks: i) Task 1 - Binary Classification in English: a coarse-grained binary classification task, offered for Hinglish, to classify tweets into one of two classes: hate and offensive (HOF) and non-hate and offensive (NOT), and ii) Task 2 - Code-mixed Classification in Bangla: this task uses the Offensive Language Identification Dataset's (OLID's)3 taxonomy to classify code-mixed Bangla tweets in romanized script into one of two categories in level A: Offensive (O) - any form of unacceptable language or targeted offense, or Non-offensive (N) - no offensive language. Additionally, in levels B and C, which are integrated, the given tweet has to be classified as targeted at an Individual (I), a Group (G), or Untargeted (U). In this paper, we describe the proposed learning models to address the challenges of the shared task. As no task-specific training data was provided by the organizers and participants were allowed to use any external resources and datasets to train models for Task 1, this task is modeled as a ZSL problem. ZSL is a Machine Learning (ML) approach that leverages general knowledge and the relationships between known and unknown categories, enabling models to classify unlabeled samples into categories/classes they have not been explicitly trained on [8]. Task 2 involves classifying the given Bangla tweet at two levels: level A and levels B and C (integrated as one level), and the training set for both levels remains the same. Hence, this task is addressed as two sub-tasks with the same training data and modeled as an MTL problem, allowing the model to learn from multiple related tasks simultaneously. MTL is an ML approach where a single model is trained to perform multiple tasks at once, leveraging shared information across tasks to improve learning efficiency and performance. This approach allows the model to generalize better by learning common representations and features that benefit the two sub-tasks involved [9]. ZSL and MTL techniques allow for more robust and adaptable systems to address the diverse and evolving nature of harmful content across different languages and contexts.</p>
      <p>The rest of the paper is organized as follows: Section 2 reviews recent literature on HASOC detection, Section 3 describes the proposed models, and Section 4 presents the experiments and results. The paper concludes with future work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Identifying HASOC in social media involves recognizing content that threatens individuals based on race, disability, or socioeconomic status. This task is challenging due to the complexity of such content, which can manifest in many forms, including indirect phrases and suggestive language, and can differ significantly across cultural contexts. Researchers have explored various approaches, including ZSL and MTL, to identify HASOC. Some notable works are described below:</p>
      <sec id="sec-2-1">
        <title>2.1. Zero-Shot Learning</title>
        <p>Goldzycher and Schneider [10] explored Natural Language Inference (NLI) models for zero-shot text classification by combining multiple hypotheses to improve NLI-based zero-shot hate speech detection in English. They fine-tuned BERT-based hate speech detection models on the HateCheck and ETHOS datasets and obtained accuracies of 79.4% and 69.6%, respectively. Kowsher et al. [11] proposed zero-shot multilingual NLI models (Cross-lingual Language Model - Robustly optimized BERT approach (XLM-RoBERTa), Multilingual Decoding-enhanced BERT with attention v3 (mDeBERTa-v3), Multilingual BERT (mBERT), Multilingual Text-to-Text Transfer Transformer (MT5), and mDistilBERT) trained with Siamese network-based contrastive training on multilingual data (English, Hindi, Spanish, etc.) to achieve universal zero-shot NLI, effectively capturing meaningful semantic relationships. Among their multilingual zero-shot models, the NLI XLM-RoBERTa model outperformed the others with a macro F1 score of 0.83. Kumar and Albuquerque [12] implemented a cross-lingual XLM-RoBERTa classification model by fine-tuning on an English Twitter sentiment analysis dataset and subsequently used zero-shot TL to evaluate the model on two Hindi sentence-level sentiment analysis datasets (IITP-Movie and IITP-Product reviews). Their proposed model achieved an accuracy of 60.93 on both datasets. Yadav et al. [13] proposed zero-shot classification using the Bidirectional Auto-Regressive Transformers (BART) large model, and one-shot and few-shot prompting using Generative Pre-trained Transformer-3 (GPT-3), for hate speech detection in code-mixed Hinglish. Among their proposed models, the zero-shot BART model exhibited a comparably good macro F1 score of 0.5245.
1https://hasocfire.github.io/hasoc/2024/
2http://fire.irsi.res.in/fire/2024/home
3https://github.com/LanguageTechnologyLab/TB-OLID</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Multi-Task Learning</title>
        <p>By jointly modeling hate and offensive content detection with related concepts like sentiment and emotion analysis, researchers have achieved significant performance improvements. Dai et al. [14] presented BERT-based MTL models to tackle offensive language detection in English, demonstrating improvements in handling the task through a hierarchical model architecture. A novel MTL formulation for identifying four types of hate speech - religion, race, disability, and sexual orientation - through a fuzzy ensemble approach, proposed by Liu et al. [15], utilizes single-labeled data for semi-supervised multi-label learning. Two new metrics - detection rate and irrelevance rate - were also proposed to measure the performance of such learning tasks more effectively. The authors' experimental study on identifying the four types of hate speech demonstrated that the fuzzy ensemble approach (based on the mixed fuzzy rule formation algorithm) significantly outperformed popular probabilistic methods (Support Vector Machines (SVM) and multiple Deep Neural Networks (DNNs)), with an overall detection rate of 0.93. Kapil and Ekbal [16] presented Convolutional Neural Network (CNN) based MTL models (Random word vectors-CNN, Word-CNN, Char-CNN, Hybrid-CNN, CNN-Word-Attention, Word-CNN-Fully Shared MTL, Soft Sharing CNN-Word-MTL) for hate speech detection on an English Twitter dataset. Their Word-CNN-Fully Shared MTL model obtained a comparably good macro F1 score of 0.8737. Mishra et al. [17] trained an MTL model with separate task heads using back-translation and multilingual approaches for HASOC identification in Indo-European languages. The authors compared the performance of their models with those of the participants in the HASOC 2019 task and showed improvements, with their proposed MTL models obtaining macro F1 scores of 0.765, 0.814, and 0.612 for the English, Hindi, and German datasets, respectively.</p>
        <p>The above studies highlight a range of techniques for hate speech detection, with some approaches
yielding lower macro F1 scores, indicating that there is still potential for further improvements. Further,
the creativity of users in generating code-mixed content underscores the need for continued research
and development to further refine these techniques.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Task 1 shared task is modeled as ZSL and Task 2 as MTL and the proposed methodologies are described
in the following subsections:</p>
      <sec id="sec-3-1">
        <title>3.1. Task 1 - Binary Classification in English: Zero-Shot Learning Approach</title>
        <p>ZSL is the task of predicting a class label that was not seen by the model during training. This method, which leverages a pre-trained language model, can be thought of as an instance of TL, which generally refers to using a model trained for one task in a different application than the one it was originally trained for. This is particularly useful when labeled data is scarce or unavailable. As there is no specific training data for Task 1, the proposed ZSL models are built on label descriptions for 'Hate and Offensive' (HOF) and 'Non-hate and Offensive' (NOT), the predefined classes in Task 1. Label descriptions are semantic representations of the categories used in classification tasks to help models understand the meaning of each label. In ZSL, label descriptions are beneficial as they allow the model to infer and categorize new, unseen data based on the predefined labels, even without explicit examples during training [18]. Providing well-defined label descriptions helps the model generalize better and make accurate predictions on novel data by leveraging its understanding of the semantic differences between the classes.</p>
        <p>In this study, we used the dataset of Subtask 2: Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) of the HASOC 2021 shared task [19] to create label descriptions for the classes of Task 1. The Subtask 2 dataset, consisting of Hindi-English (Hinglish) code-mixed text samples categorized into two classes - 'Hate and Offensive' (HOF) and 'Non-hate and Offensive' (NOT) - aligns perfectly with the predefined labels of Task 1. Building a ZSL model typically involves creating a label description for each class in the given task and obtaining semantic representations for the label descriptions. Semantic representations can be obtained through pre-trained models like GloVe or Word2Vec, or transformer models like BERT or RoBERTa.</p>
        <p>To facilitate ZSL, effective pre-processing of the data is essential. Pre-processing is the process of cleaning and transforming raw data into a format suitable for analysis. In this study, emojis are converted to their textual descriptions, and emoji characters that fall outside the specific Unicode ranges are removed. Further, user mentions and URLs are eliminated, in addition to removing retweet indicators, newline characters, and excessive whitespace. Following these pre-processing steps, the text is refined to a uniform format suitable for subsequent analysis.</p>
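These cleaning steps can be sketched with standard regular expressions. This is a minimal illustration only: the emoji-to-text conversion step would additionally require a library (e.g. the `emoji` package) that is assumed rather than shown, and the `preprocess` helper below is hypothetical, not our actual implementation.

```python
import re

def preprocess(tweet):
    """Clean a Hinglish tweet along the lines described above (sketch)."""
    text = re.sub(r"^RT\s+", "", tweet)                # retweet indicator
    text = re.sub(r"@\w+", "", text)                   # user mentions
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # URLs
    text = re.sub(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", "", text)  # leftover emoji chars
    text = re.sub(r"\s+", " ", text)                   # newlines / excess whitespace
    return text.strip()

print(preprocess("RT @user Yeh news dekho https://t.co/xyz \n bahut bura hai"))
```

Applied to the example tweet, the retweet marker, mention, URL, and stray whitespace are all stripped, leaving a uniform single-line text.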
        <p>The Subtask 2 dataset undergoes pre-processing and is randomly split into: i) a Train set consisting of 100 samples per class, used to create the label descriptions, and ii) a Validation set consisting of 250 samples per class, used to evaluate the model. The crux of ZSL lies in creating the label descriptions for the classes, and the three proposed methods are given below:
• Zero_CS_LD - To create label descriptions for each class, the samples belonging to the class are grouped into ten clusters using the k-means algorithm, and the samples belonging to each cluster are merged to get ten label descriptions. The HingBERT-Mixed4 model is used to represent these label descriptions, and the mean of these representations is computed to get the label embedding for the class. The HingBERT-Mixed model, available on Hugging Face, has been trained on a bilingual corpus, making it well-suited for extracting meaningful features from Hinglish text [20].
• Zero_CS_KW+LD - The steps involved in creating the label description for each class are as follows:
– All samples belonging to the class are merged, and Hindi and English stop words are removed
– Keywords are extracted automatically based on term frequency
– A list of ten keywords describing the class, as shown in Table 1, is manually curated
– The automatically extracted and manually curated keywords are merged to form the label description
– The HingBERT-Mixed model is used to obtain the semantic representation of the label description
Keywords ensure the inclusion of essential terms related to each label, enriching the text's semantic representation. Further, they enhance the model's ability to generalize and accurately classify new data based on the provided definitions.
• Zero_NLI_KW+LD - The procedure used to obtain the label descriptions and their representations is the same as in Zero_CS_KW+LD; however, it differs in how the label is predicted for the given input.
4https://huggingface.co/l3cube-pune/hing-mbert-mixed</p>
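The term-frequency keyword extraction step above can be sketched as follows. This is a toy illustration with made-up samples and a tiny stop-word set; `extract_keywords` is a hypothetical helper, not the actual implementation.

```python
from collections import Counter
import re

def extract_keywords(samples, stop_words, k=10):
    """Rank candidate keywords for a class by raw term frequency,
    after lower-casing and stop-word removal (as in Zero_CS_KW+LD)."""
    counts = Counter()
    for text in samples:
        for tok in re.findall(r"[a-z']+", text.lower()):
            if tok not in stop_words:
                counts[tok] += 1
    return [word for word, _ in counts.most_common(k)]

hof_samples = [
    "these people are the worst, total idiots",
    "idiots like them ruin everything",
]
print(extract_keywords(hof_samples, {"these", "are", "the", "like", "them"}, k=3))
```

In practice the automatically ranked list is then merged with the manually curated keywords of Table 1 before embedding with HingBERT-Mixed.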
        <p>The given Test set is preprocessed as mentioned earlier, and the HingBERT-Mixed model is used to obtain the semantic representation of the preprocessed Test set. During inference with the Zero_CS_LD and Zero_CS_KW+LD models, the cosine similarity between the semantic representation of the test instance and that of each class's label description is computed, and the name of the class with the highest similarity score is assigned to the test instance. This effectively allows the model to classify the instance into categories it has never been explicitly trained on. Further, this approach enables the model to generalize from seen classes to unseen ones by leveraging the shared semantic space of the embeddings [18]. For inference with the Zero_NLI_KW+LD model, a sentence transformer based on distilbert-base-multilingual-cased5 - a variant of DistilBERT fine-tuned for NLI in a multilingual setting - is used to classify the test instances. The test instance is given as input to the fine-tuned NLI model, which outputs scores for hypotheses based on the class labels "HOF" and "NOT", enabling the selection of the label with the highest score as the predicted classification. Thus, by leveraging the label description for each class, the model's ability to classify unseen instances is enhanced through effective use of the semantic relationships between the classes and the unseen instances.</p>
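The cosine-similarity inference step used by Zero_CS_LD and Zero_CS_KW+LD can be illustrated with a minimal sketch. Toy 3-dimensional vectors stand in for the HingBERT-Mixed embeddings, and `zero_shot_predict` is a hypothetical helper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def zero_shot_predict(test_vec, label_embeddings):
    """Assign the label whose description embedding is most similar
    to the test-instance embedding (the Zero_CS_* inference step)."""
    return max(label_embeddings, key=lambda lbl: cosine(test_vec, label_embeddings[lbl]))

# toy embeddings standing in for HingBERT-Mixed representations
label_embeddings = {"HOF": [0.9, 0.1, 0.2], "NOT": [0.1, 0.8, 0.3]}
print(zero_shot_predict([0.8, 0.2, 0.1], label_embeddings))
```

The test vector here lies closer to the "HOF" label embedding, so that class name is assigned, exactly as no labeled "HOF" example was ever seen during training.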
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Task 2 - Code-mixed Classification in Bangla: Multi-task Learning Approach</title>
        <p>This task focuses on identifying offensive comments in code-mixed Bangla tweets at the first level and then classifying the target of the offensive comment as 'Individual', 'Group', or 'Untargeted' at the next level. Classification of the given tweet at two levels is addressed as an MTL problem, with a single model performing both tasks simultaneously [21]. MTL is an ML technique where a model is trained to perform multiple related tasks simultaneously by sharing certain network layers and parameters. This shared learning approach enhances model performance, particularly when individual tasks lack sufficient training data. The MTL architecture leverages shared lower-level features across tasks while learning higher-level features specific to each task. This enables the model to generalize better and efficiently utilize limited data for related tasks. By learning both tasks in a unified framework, the model benefits from shared representations that enhance its ability to detect offensive content and accurately classify the target type. Figure 1 depicts the framework of the proposed MTL model, and the steps involved in implementing it are given below.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Pre-processing</title>
          <p>The given dataset contains noisy and unstructured text in the form of irregularities and inconsistencies in the words, frequent use of user mentions, hashtags, and informal language. As user mentions, hashtags, URLs, digits, and punctuation do not contribute to the classification task, they are removed from the given dataset. Further, the English stopwords available in the Natural Language Toolkit (NLTK)6 are used as references to remove English stopwords from the given dataset, retaining only the content words. This step helps to improve the model's ability to learn from the data by reducing noise and capturing relevant linguistic patterns in code-mixed Bangla text.</p>
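A minimal sketch of these cleaning steps follows; a tiny in-line stop-word set stands in for the full NLTK list, and `preprocess_bn` is a hypothetical helper rather than the actual pipeline.

```python
import re
import string

# a few English stop words stand in for the full NLTK list (assumption)
STOP_WORDS = {"the", "is", "a", "and", "of", "to"}

def preprocess_bn(tweet):
    """Strip mentions, hashtags, URLs, digits, and punctuation from a
    romanized code-mixed Bangla tweet, then drop English stop words."""
    text = re.sub(r"@\w+|#\w+|https?://\S+", " ", tweet)
    text = re.sub(r"\d+", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(preprocess_bn("@user tumi is the worst #hate123 https://x.co/1"))
```

Only the content words survive, which is what the shared BERT/DistilMBERT encoder is then fed.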
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Text Representation</title>
          <p>Text representation directly impacts the model's performance on tasks like classification [22], sentiment analysis [23, 24], and language understanding [25]. Effective text representation methods, such as word embeddings or transformer-based models, aim to capture the contextual and semantic nuances of the text. By converting text into meaningful feature vectors, these representations provide valuable information that helps learning models understand and generalize patterns in the given data. This work makes use of the BERT and DistilMBERT models for text representation, described below:
• BERT7 - utilizes 'BertTokenizer' and 'TFBertModel' from Hugging Face's Transformers library for tokenization and for loading the pre-trained BERT language model for English text, respectively. 'BertTokenizer' is trained on a vast corpus of English text and uses WordPiece tokenization to split words into sub-word units. The 'TFBertModel' class loads the pre-trained BERT model, which captures contextual information from both the left and right sides of a given text input, allowing the model to represent each word in its full sentence context.
• DistilMBERT8 - is a compact and faster mBERT variant, created using knowledge distillation and trained on Wikipedia content in 104 languages, including Bangla. It is designed to be more efficient, offering roughly twice the speed of mBERT-base while maintaining strong multilingual performance. This makes DistilMBERT an effective choice for resource-constrained environments.
5https://huggingface.co/pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1
6https://www.nltk.org/</p>
          <p>Although the BERT model is pre-trained on English text, it effectively captures relevant features to train the learning models, as the given dataset contains romanized Bangla text with many content words in English.</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Model Building</title>
          <p>The MTL approach is particularly effective when labels are hierarchical and designed to be inclusive from top to bottom [26]. This hierarchical structure allows the model to leverage shared information across labels, improving learning efficiency and performance. As illustrated in Figure 1, the proposed MTL architecture obtains contextualized representations through a shared model and then divides into two distinct modules for the individual subtasks. The architecture employs BERT/DistilMBERT to generate contextualized embeddings from the input sequence, which are subsequently processed by separate Recurrent Neural Network (RNN) modules equipped with Long Short-Term Memory (LSTM) cells for each subtask. Each module uses these embeddings to produce probability distributions over its respective target labels. The overall loss L is computed as L = ∑ᵢ₌₁ᴺ wᵢLᵢ, where N denotes the number of labels, Lᵢ the loss for subtask i, and wᵢ the loss weight for subtask i. This design allows for effective learning across multiple tasks by leveraging shared feature vectors. The hyperparameters and their values used to train the MTL classifiers are shown in Table 2.
7google-bert/bert-base-uncased
8distilbert/distilbert-base-multilingual-cased</p>
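The weighted combination of per-subtask losses can be illustrated numerically. The probability distributions and unit weights below are toy values, and `multitask_loss` is a hypothetical helper mirroring the weighted-sum objective, not our training code.

```python
import math

def cross_entropy(probs, gold):
    """Negative log-likelihood of the gold label index."""
    return -math.log(probs[gold])

def multitask_loss(task_outputs, weights):
    """Overall loss: the weighted sum of each subtask's loss,
    L = sum_i w_i * L_i over the subtasks."""
    return sum(w * cross_entropy(probs, gold)
               for w, (probs, gold) in zip(weights, task_outputs))

# subtask A (level A): P(N)=0.3, P(O)=0.7, gold = O (index 1)
# subtask B (levels B/C): P(I)=0.5, P(G)=0.3, P(U)=0.2, gold = I (index 0)
loss = multitask_loss([([0.3, 0.7], 1), ([0.5, 0.3, 0.2], 0)], weights=[1.0, 1.0])
print(round(loss, 4))
```

With equal weights the shared encoder receives gradient signal from both heads at once; adjusting the weights shifts the balance between the Offensive_gold and Target_gold objectives.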
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments and Results</title>
      <p>Various learning models were implemented with ZSL for Task 1 and MTL for Task 2. The Test set for Task 1 provided by the organizers consists of 888 samples, and the performances of the proposed models on the Validation set (taken from the Subtask 2 dataset) and the Test set are shown in Tables 3 and 4, respectively. Among all the submitted models, the Zero_CS_KW+LD model (using keywords for the label descriptions and cosine similarity for comparison) outperformed the other models with a macro F1 score of 0.5653, securing 7th rank.</p>
      <p>The shared task organizers provided 4,000, 1,000, and 500 code-mixed Bangla text9 samples for Training, Development, and Testing, respectively, for Task 2, and the class-wise distribution of the dataset is shown in Figure 2 ((a) Offensive_gold, (b) Target_gold). Several experiments were conducted using different combinations of features and classifiers, and the models that demonstrated the best performances on the Development sets were subsequently applied to predict labels for the Test sets. The performances of the models submitted by the participants for Task 2 were evaluated by the organizers in terms of macro F1 scores, and the performances of our proposed MTL models on the Development and Test sets are shown in Table 5. The proposed model using DistilMBERT obtained macro F1 scores of 0.6761 for the Offensive_gold task, securing 4th rank, and 0.3975 for the Target_gold task, securing 1st rank, in Task 2. Figures 3 and 4 compare the macro F1 scores of all the participating teams in Task 1 and Task 2, respectively.
9https://github.com/LanguageTechnologyLab/TB-OLID</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>In this paper, we - team MUCS - describe the models submitted to "HASOC - Hate Speech and Offensive Content Detection", a shared task at FIRE 2024, to distinguish between the categories of HASOC in English and Bangla comments. The shared task consists of two subtasks, Task 1 and Task 2. As no training data is given for Task 1, this task is addressed as ZSL with label descriptions, using the following models: i) Zero_CS_LD - label descriptions for the class labels are obtained from the HASOC 2021 Subtask 2 dataset and represented using the HingBERT-Mixed model, and class labels are assigned to the test samples based on the cosine similarity between the semantic representations of the test sample and the label descriptions of the predefined classes; ii) Zero_CS_KW+LD - label descriptions are obtained through keywords extracted from the HASOC 2021 Subtask 2 dataset together with manually curated keywords, represented using the HingBERT-Mixed model, with class labels assigned to the test samples by the same cosine similarity procedure; and iii) Zero_NLI_KW+LD - using the same procedure as Zero_CS_KW+LD for the label descriptions and their representations, but classifying with a sentence transformer based on the distilbert-base-multilingual-cased NLI model, which generates scores for hypotheses based on the class labels 'HOF' and 'NOT' and selects the label with the highest score as the predicted classification. Among all the submitted models, the Zero_CS_KW+LD model obtained a macro F1 score of 0.5653, securing 7th rank in Task 1. The challenges of Task 2 are addressed by implementing MTL models utilizing a TL approach with two transformer models (BERT and DistilMBERT) to identify HASOC in romanized code-mixed Bangla text. The proposed MTL model using DistilMBERT obtained macro F1 scores of 0.6761 for the Offensive_gold task, securing 4th rank, and 0.3975 for the Target_gold task, securing 1st rank, outperforming the other model. Diverse label description methods will be investigated further to enhance the performance of zero-shot learning models for detecting HASOC in code-mixed Hindi text. Additionally, various loss functions will be explored for MTL models to identify HASOC in Bangla text and other languages.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT for grammar and spelling checking. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the publication's content.</p>
      <p>[7] V. Pathak, M. Joshi, P. Joshi, M. Mundada, T. Joshi, Kbcnmujal@HASOC-Dravidian-CodeMix-FIRE2020: Using Machine Learning for Detection of Hate Speech and Offensive Code-mixed Social Media Text, 2021.
[8] P. K. Pushp, M. M. Srivastava, Train Once, Test Anywhere: Zero-shot Learning for Text Classification, in: arXiv preprint arXiv:1712.05972, 2017.</p>
      <p>[9] R. Caruana, Multitask Learning, in: Machine Learning, volume 28, Springer, 1997, pp. 41–75.
[10] J. Goldzycher, G. Schneider, Hypothesis Engineering for Zero-shot Hate Speech Detection, in: arXiv preprint arXiv:2210.00910, 2022.
[11] M. Kowsher, M. S. I. Sobuj, N. J. Prottasha, M. S. Arefin, Y. Morimoto, Contrastive Learning for Universal Zero-Shot NLI with Cross-Lingual Sentence Embeddings, in: Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL), 2023, pp. 239–252.
[12] A. Kumar, V. H. C. Albuquerque, Sentiment Analysis using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language, in: Transactions on Asian and Low-Resource Language Information Processing, volume 20, ACM New York, NY, 2021, pp. 1–13.
[13] S. Yadav, A. Kaushik, K. McDaid, Leveraging Weakly Annotated Data for Hate Speech Detection in Code-Mixed Hinglish: A Feasibility-Driven Transfer Learning Approach with Large Language Models, in: arXiv preprint arXiv:2403.02121, 2024.
[14] W. Dai, T. Yu, Z. Liu, P. Fung, Kungfupanda at SemEval-2020 Task 12: BERT-based Multi-task Learning for Offensive Language Detection, in: arXiv preprint arXiv:2004.13432, 2020.
[15] H. Liu, P. Burnap, W. Alorainy, M. L. Williams, Fuzzy Multi-task Learning for Hate Speech Type Identification, in: The World Wide Web Conference, 2019, pp. 3006–3012.
[16] P. Kapil, A. Ekbal, Leveraging Multi-domain, Heterogeneous Data using Deep Multitask Learning for Hate Speech Detection, in: arXiv preprint arXiv:2103.12412, 2021.
[17] S. Mishra, S. Prasad, S. Mishra, Exploring Multi-task Multi-lingual Learning of Transformer Models for Hate Speech and Offensive Speech Identification in Social Media, in: SN Computer Science, volume 2, Springer, 2021, pp. 1–19.
[18] Y. Xian, C. H. Lampert, B. Schiele, Z. Akata, Zero-shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 41, IEEE, 2018, pp. 2251–2265.
[19] T. Mandl, S. Modha, G. K. Shahi, H. Madhu, S. Satapara, P. Majumder, J. Schäfer, T. Ranasinghe, M. Zampieri, D. Nandini, et al., Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages, in: arXiv preprint arXiv:2112.09301, 2021.
[20] R. Nayak, R. Joshi, L3Cube-HingCorpus and HingBERT: A Code Mixed Hindi-English Dataset and BERT Language Models, in: Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 7–12. URL: https://aclanthology.org/2022.wildre-1.2.
[21] Y. Zhang, Q. Yang, A Survey on Multi-task Learning, in: IEEE Transactions on Knowledge and Data Engineering, volume 34, IEEE, 2021, pp. 5586–5609.
[22] A. Hegde, F. Balouchzahi, K. G, H. L. Shashirekha, Trigger Detection in Social Media Text, in: CLEF 2023 – Conference and Labs of the Evaluation Forum, 18–21 September 2023, Thessaloniki, Greece, 2023.
[23] B. Prathvi, K. Manavi, K. Subrahmanyapoojary, A. Hegde, G. Kavya, H. Shashirekha, MUCS@DravidianLangTech-2024: A Grid Search Approach to Explore Sentiment Analysis in Code-mixed Tamil and Tulu, in: Proceedings of the Fourth Workshop on Speech, Vision, and Language
Technologies for Dravidian Languages, 2024, pp. 257–261.
[24] S. Coelho, A. Hegde, P. Lamani, G. Kavya, H. L. Shashirekha, MUCSD@ DravidianLangTech2023:
Predicting Sentiment in Social Media Text using Machine Learning Techniques, in: Proceedings of
the Third Workshop on Speech and Language Technologies for Dravidian Languages, 2023, pp.
282–287.
[25] K. Girish, A. Hegde, F. Balouchzahi, H. L. Shashirekha, Profiling Cryptocurrency Influencers with</p>
      <p>Sentence Transformers., in: CLEF (Working Notes), 2023, pp. 2599–2607.
[26] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, N. Farra, R. Kumar, SemEval-2019 task 6:
Identifying and Categorizing Ofensive Language in Social Media (ofenseval), in: arXiv preprint
arXiv:1903.08983, 2019.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Raihan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jaki</surname>
          </string-name>
          , T. Mandl,
          <article-title>Overview of the HASOC Track at FIRE 2024: Hate-Speech Identification in English and Bengali</article-title>
          , in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , D. Ganguly (Eds.),
          <source>Forum for Information Retrieval Evaluation (Working Notes) (FIRE 2024), December 9-13, Gandhinagar, India</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Raihan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jaki</surname>
          </string-name>
          , T. Mandl,
          <article-title>Overview of the HASOC Track at FIRE 2024: Hate-Speech Identification in English and Bengali</article-title>
          ,
          in:
          <source>FIRE '24: Proceedings of the 16th Annual Meeting of the Forum for Information Retrieval Evaluation, December 9-13, Gandhinagar, India</source>
          , Association for Computing Machinery (ACM), New York, NY, USA,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed Languages</article-title>
          ,
          <source>in: FIRE (Working Notes)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>20</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech</article-title>
          ,
          <source>in: Proceedings of the 13th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Dikshitha Vani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <article-title>Hate Speech and Offensive Content Identification in Multiple Languages using Machine Learning Algorithms</article-title>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Anusha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Shashirekha</surname>
          </string-name>
          ,
          <article-title>An Ensemble Model for Hate Speech and Offensive Content Identification in Indo-European Languages</article-title>
          ,
          <source>in: FIRE (Working Notes)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>253</fpage>
          -
          <lpage>259</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>