<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>OffenSwitch: Decoding Toxicity in Dravidian Code-Mixing with Transformers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Krishna Tewari</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Supriya Chanda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>K Abhinay Paul</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bennett University</institution>
          ,
          <addr-line>Greater Noida</addr-line>
          ,
          <country country="IN">INDIA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indian Institute of Technology (BHU)</institution>
          ,
          <addr-line>Varanasi</addr-line>
          ,
          <country country="IN">INDIA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
<p>Offensive language detection is a vital task in natural language processing, especially given the rise of multilingual code-mixed text on social media platforms. This paper describes our participation in the shared task on offensive language identification in Dravidian code-mixed languages: Tamil-English, Malayalam-English, Kannada-English, and Tulu-English, organized as part of FIRE 2025. The task addresses the unique challenges posed by code-switching, morphological richness, and the use of non-native scripts endemic to Dravidian languages. Participants are required to classify social media comments into categories including offensive, not-offensive, targeted offensive, and untargeted offensive (group / individual). The organizers provide a gold-standard dataset curated from YouTube comments on diverse topics and encourage the development of models capable of robust offensive language detection in these low-resource, multilingual settings. Our team, IReL@IIT-BHU, participated using fine-tuned XLM-RoBERTa models with and without early stopping. Across the four languages, our systems consistently ranked within the top six, achieving 4th place in Kannada and Tamil, 5th in Malayalam, and 6th in Tulu, thereby establishing multilingual transformers as a strong baseline for Dravidian code-mixed offensive language identification.</p>
      </abstract>
      <kwd-group>
        <kwd>Code-Mixed</kwd>
        <kwd>Offensive</kwd>
        <kwd>Hate</kwd>
        <kwd>NLP</kwd>
        <kwd>Dravidian</kwd>
        <kwd>Tamil</kwd>
        <kwd>Kannada</kwd>
        <kwd>Tulu</kwd>
        <kwd>Malayalam</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Offensive language detection has become a critical task in Natural Language Processing (NLP),
particularly in the era of rapid digital communication, where user-generated content proliferates on social
media platforms. Online spaces such as YouTube, Facebook, and Twitter host millions of daily
discussions that range from entertainment to politics and social issues. While these platforms foster open
dialogue, they also provide fertile ground for the spread of offensive content, ranging from abusive
language and hate speech to cyberbullying and targeted harassment. Left unchecked, such behavior can
escalate, harming individuals, polarizing societies, and undermining the safety of online communities.</p>
      <p>The growing concern over offensive language has motivated the development of automatic
detection systems. However, this task is far from trivial due to several challenges. Offensive expressions
can be subtle, context-dependent, and highly variable across communities. Furthermore, multilingual
societies introduce an additional layer of complexity: code-mixing, a phenomenon in which speakers
switch between two or more languages within the same sentence or discourse. Code-mixed data is
widespread in multilingual regions such as South India, where languages like Tamil, Malayalam,
Kannada, and Tulu are often mixed with English in informal communication. Detecting offensive language
in such data is challenging for traditional NLP models, which are predominantly trained on
monolingual corpora and fail to capture the dynamics of code-switching across linguistic levels (lexical,
morphological, and syntactic). The issue is compounded when users write these languages in non-native
scripts, further complicating text normalization and understanding.</p>
      <p>
        To address these challenges, the FIRE 2025 shared task focuses on offensive language identification
in code-mixed Dravidian languages: Tamil-English, Malayalam-English, Kannada-English, and
Tulu-English [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Participants are provided with a gold-standard dataset curated from YouTube comments
spanning news, entertainment, and socio-political discussions. The task requires classifying each
instance into categories such as offensive, not-offensive, targeted offensive, and untargeted offensive
(group / individual). The intended applications are wide-ranging, including assisting social media
platforms in content moderation, supporting law enforcement agencies in identifying online abuse, and
enabling brands to monitor consumer sentiment while maintaining civil discourse. Beyond
immediate use cases, the task aims to foster the development of robust multilingual and code-mixed NLP
models that generalize effectively across languages and cultures.
      </p>
      <p>The rest of this paper is organized as follows: Section 2 discusses related work; Section 3 describes
the dataset; Section 4 presents our proposed methodology; Section 5 reports results and analysis; and
Section 6 concludes with key findings and future directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Initial efforts in offensive language detection focused on monolingual English texts, employing
machine learning techniques on tweets and forum discussions to model hate and offensive content
[
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. As social media environments have become increasingly multilingual, researchers have
recognized the unique linguistic challenges presented by code-mixed text. The HASOC shared task and
subsequent FIRE challenges demonstrated that systems designed for monolingual data often
underperform on Hindi-English and Bengali-English code-mixed datasets, underscoring the need for methods
tailored to linguistic mixing and complex orthographic patterns [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
      <p>
        For Hindi-English code-mixing, Bohra et al. developed a dedicated dataset for hate speech detection
and experimented with methods including CRFs, SVMs, and neural models, highlighting the necessity
of code-mixed resources and hybrid features [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Mathur et al. extended this line of work by focusing on
offensive tweet classification with both deep learning and attention-based approaches, observing that
classical and neural architectures each capture different facets of Hinglish data [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Transformer-based
and multilingual models such as BERT, mBERT, and XLM-R have since demonstrated state-of-the-art
results in Hindi-English code-switching scenarios, with studies confirming the critical advantages of
transfer learning and contextual embeddings [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Barman et al. studied language identification as a precursor for downstream classification, finding
that code-mixed Bengali-English presents further segmentation and tokenization hurdles, which can
be partially addressed through subword models and character-level representations [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. More recently,
Chakravarthi et al. investigated multilingual embeddings and transfer learning, finding consistent
improvements in both sentiment analysis and hate speech detection in low-resource Indo-Aryan and
Dravidian code-mixed settings [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>Within the Dravidian language context, Chakravarthi et al. developed new benchmarks for
Tamil-English, Malayalam-English, and Kannada-English code-mixed data, as part of initiatives such as
DravidianCodeMix and FIRE shared tasks [11, 12, 13]. These works established protocols for offensive
content annotation and showcased the potential of deep learning, including CNNs and MPNet, for
handling morphological and script variation in these under-resourced languages [14].</p>
      <p>Chanda and Pal’s early system at FIRE 2020 addressed sentiment analysis for code-mixed social
media text using both classical and neural models, identifying data sparsity and script diversity as
core bottlenecks [15]. Building upon this, Chanda et al. demonstrated that fine-tuning pre-trained
transformer models yielded superior results for hate speech detection in Indo-Aryan and code-mixed
contexts, motivating wider adoption of multilingual architectures [16]. Their later work leveraged
multilingual embeddings for fine-grained conversational hate speech detection, successfully
distinguishing nuanced and context-dependent offenses in Dravidian social streams [17]. The group also
tackled related tasks such as sarcasm detection in code-mixed Tamil-English and Malayalam-English,
and highlighted preprocessing innovations such as effective stopword removal [18, 19, 20]. Furthermore,
their deep learning-based approaches for hate speech detection in Tulu-English and other low-resource
Dravidian pairs expanded both the data and methodological frontiers for safer digital spaces [21, 22].</p>
      <p>The FIRE 2025 shared task extends these advances by releasing curated, gold-standard datasets for
Dravidian code-mixed offensive language detection and by promoting benchmark evaluation of neural
models for Tulu, Tamil, Malayalam, and Kannada paired with English [23, 24]. These benchmark efforts
include advanced methods based on MPNet and CNNs [14], and motivate continued research in robust
code-mixed NLP for South Asian languages.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>The organizers of the shared task on Offensive Language Identification in Dravidian Code-Mixed
Languages at FIRE 2025 provided annotated datasets for four language pairs: Tulu-English,
Kannada-English, Malayalam-English, and Tamil-English. For each language, the data was divided into training,
development, and testing splits. Each instance in the dataset consists of a code-mixed social media
sentence and a corresponding offensive language label.</p>
      <p>For the Tulu-English dataset, four categories were defined. The detailed statistics for this dataset are
reported in Table 1. It can be observed that the class distribution is imbalanced, with the not_offensive
category being the majority class.</p>
      <p>For the other three language pairs (Kannada-English, Malayalam-English, and Tamil-English), the label
set was more fine-grained, with multiple types of offensive expressions included. The statistics of these
datasets are summarized in Table 2. Here too, the data is skewed towards the Not_offensive category,
especially in the case of Malayalam and Tamil.</p>
      <p>The datasets reflect the natural distribution of code-mixed conversations on social media, where
non-offensive instances dominate but offensive targeted insults are of particular interest for downstream
classification. This imbalance motivated us to adopt transformer-based methods that are robust to such
skewed label distributions.</p>
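      <p>The skew described above can be read directly off a label column. The sketch below is a minimal, dependency-free illustration of that check; the label names and counts are hypothetical stand-ins, with the real distributions given in Tables 1 and 2.</p>
      <preformat>
```python
from collections import Counter

def label_distribution(labels):
    """Return per-class counts and the majority/minority imbalance ratio."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return dict(counts), ratio

# Hypothetical label column mirroring the skew described in Section 3;
# the real per-class counts are those reported in Tables 1 and 2.
labels = (["not_offensive"] * 800
          + ["offensive_untargeted"] * 120
          + ["offensive_targeted_individual"] * 50
          + ["offensive_targeted_group"] * 30)

counts, ratio = label_distribution(labels)
# A ratio well above 1 signals the majority-class dominance discussed above.
```
      </preformat>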
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>In this work, we focus on the task of offensive language identification in Tulu-English code-mixed
data as part of the FIRE 2025 shared task. The objective is to classify each sentence into one of four
categories: offensive, not-offensive, targeted offensive, or untargeted offensive (group / individual). To
address this, we adopt a transformer-based architecture leveraging the multilingual capabilities of
XLM-RoBERTa.</p>
      <sec id="sec-4-1">
        <title>4.1. Data Preprocessing</title>
        <p>The dataset provided by the shared task organizers was divided into training, validation, and test splits.
Each sentence was paired with a corresponding label indicating its offensive language category. We
employed the LabelEncoder from the scikit-learn library to map the textual labels into numerical
representations. The text data was then tokenized using the XLMRobertaTokenizer, with a maximum
sequence length of 128 tokens. Tokenization produced input_ids and attention_mask, which were
padded and truncated as necessary.</p>
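        <p>The preprocessing steps above (label encoding, tokenization to input_ids and attention_mask, padding and truncation to 128 tokens) can be sketched with no dependencies as follows. The whitespace "tokenizer" here is only a stand-in for the real XLMRobertaTokenizer subword vocabulary, and pad_id=1 mirrors XLM-R's pad token id; encode_labels reproduces what scikit-learn's LabelEncoder does.</p>
        <preformat>
```python
MAX_LEN = 128  # maximum sequence length used in our experiments

def encode_labels(labels):
    """Map label strings to integer ids (what sklearn's LabelEncoder does)."""
    classes = sorted(set(labels))
    to_id = {c: i for i, c in enumerate(classes)}
    return [to_id[l] for l in labels], to_id

def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=1):
    """Produce fixed-length input_ids and attention_mask, mirroring the
    tokenizer's padding/truncation behavior (pad_id=1 is XLM-R's <pad>)."""
    ids = token_ids[:max_len]
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    ids = ids + [pad_id] * (max_len - len(ids))
    return ids, mask

def toy_tokenize(text, vocab_size=50000):
    """Deterministic stand-in for subword tokenization: one id per
    whitespace token, folded into a small id space."""
    return [sum(ord(c) for c in tok) % vocab_size for tok in text.split()]

ids, mask = pad_or_truncate(toy_tokenize("idu nijavaglu super comment"))
```
        </preformat>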
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Model Architecture</title>
        <p>We fine-tuned the xlm-roberta-base model released by Facebook AI Research. The model was
extended with a classification head to predict across the four output labels. We implemented a
PyTorch-based dataset wrapper to efficiently handle tokenized sequences and labels during training and
evaluation.</p>
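        <p>A minimal sketch of the dataset wrapper mentioned above, assuming the tokenizer output is stored as parallel lists. The actual implementation subclasses torch.utils.data.Dataset and returns tensors; this dependency-free version keeps the same interface with plain lists.</p>
        <preformat>
```python
class ClassificationDataset:
    """Stand-in for the PyTorch Dataset wrapper used in our experiments:
    indexes tokenized sequences and their integer labels together."""

    def __init__(self, encodings, labels):
        # encodings: dict with 'input_ids' and 'attention_mask' lists,
        # as produced by the tokenizer; labels: integer class ids.
        self.encodings = encodings
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: values[idx] for key, values in self.encodings.items()}
        item["labels"] = self.labels[idx]
        return item

enc = {"input_ids": [[5, 6, 1], [7, 8, 9]],
       "attention_mask": [[1, 1, 0], [1, 1, 1]]}
ds = ClassificationDataset(enc, [0, 2])
```
        </preformat>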
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Training Strategy</title>
        <p>The model was fine-tuned using the AdamW optimizer with a learning rate of 1 × 10<sup>−6</sup> and a batch size
of 32. The loss function used was cross-entropy loss, suitable for multi-class classification. We trained
the model for up to 100 epochs while applying early stopping with a patience of 10 epochs to prevent
overfitting. At each epoch, training and validation losses, along with accuracies, were recorded. The
best-performing model checkpoint on the validation set was retained for final evaluation.</p>
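        <p>The early-stopping logic described above (patience in epochs, best-checkpoint retention) can be sketched as follows; the validation-loss values are hypothetical, and checkpoint saving is reduced to remembering the best epoch.</p>
        <preformat>
```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience`
    consecutive epochs, remembering the best epoch (where the
    checkpoint would be saved)."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch   # checkpoint retained here
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=3)
# Hypothetical validation-loss curve: improves, then plateaus.
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.64, 0.67]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```
        </preformat>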
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Evaluation and Prediction</title>
        <p>The trained model was evaluated on the held-out test set, and standard classification metrics such as
precision, recall, and F1-score were reported using the scikit-learn classification_report function.
Additionally, the fine-tuned model was used to generate predictions on the unlabeled data provided
in the shared task. The model outputs were mapped back to the original label names using the fitted
label encoder. We also plotted the training and validation curves for loss and accuracy across epochs
to visualize the learning behavior of the model.</p>
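        <p>Since the shared task leaderboard scores systems by macro-averaged F1 (mF1), a small reference implementation makes the evaluation concrete; scikit-learn's classification_report reports the same per-class and macro-averaged scores.</p>
        <preformat>
```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 (the leaderboard's mF1): per-class F1, then an
    unweighted mean, so minority offensive classes weigh as much as the
    majority not-offensive class."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    f1s = []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1([0, 0, 1], [0, 1, 1])  # each class gets F1 = 2/3, so mF1 = 2/3
```
        </preformat>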
        <p>Overall, this methodology leverages the multilingual contextual representations of XLM-RoBERTa,
combined with careful preprocessing and early stopping, to robustly address the challenge of offensive
language identification in code-mixed Tulu-English social media text.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>The performance of our submitted systems was evaluated on four Dravidian code-mixed languages,
namely Tulu, Kannada, Malayalam, and Tamil. The organizers provided the official leaderboard results,
as shown in Tables 3-6. Our team participated under the name IReL@IIT-BHU with two submitted
runs: Run 1 (without early stopping) and Run 2 (with early stopping).</p>
      <p>In the Tulu offensive language identification task, our system achieved an mF1 score of 0.710,
securing the 6th rank overall (Table 3). We observed that applying early stopping helped to stabilize
validation performance during training, but the improvement in the final test score was not substantial
compared to the baseline run.</p>
      <p>For the Kannada dataset, our submission with early stopping (Run 2) obtained an mF1 score of 0.430,
placing us at the 4th rank (Table 4). This indicates that our approach generalized reasonably well to
Kannada, with performance comparable to other top systems.</p>
      <p>In the Malayalam task, our system (Run 1) achieved an mF1 score of 0.667, ranking 5th among all
participating teams (Table 5). While the best-performing team achieved 0.778, our result
demonstrates that multilingual pre-trained transformers such as XLM-RoBERTa can provide a competitive
baseline even for less-resourced Dravidian languages.</p>
      <p>For Tamil, our early-stopped model (Run 2) obtained an mF1 score of 0.448, securing the 4th rank
on the leaderboard (Table 6). This result is close to the top-performing systems, where the leading
team achieved an mF1 score of 0.465, showing that our method is effective for Tamil code-mixed text
as well.</p>
      <p>Across the four languages, our systems consistently ranked within the top six, with the best relative
performance achieved for Kannada (4th rank) and Tamil (4th rank). The results demonstrate the
effectiveness of fine-tuned XLM-RoBERTa for offensive language detection in Dravidian code-mixed
settings. Furthermore, we observed that incorporating early stopping generally improved model
robustness, although the effect varied across languages. These results suggest that multilingual
transformer models can serve as a strong baseline for offensive language identification in under-resourced
code-mixed languages.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this study, we tackled the issue of detecting offensive language in code-mixed Dravidian languages,
specifically Tulu-English, Tamil-English, Malayalam-English, and Kannada-English. We used
fine-tuned multilingual transformers (XLM-RoBERTa) with and without early stopping. Our systems
consistently ranked in the top six across all tasks. We achieved 6th place in Tulu (mF1 = 0.710), 5th in
Malayalam (mF1 = 0.667), and 4th place in both Kannada (mF1 = 0.430) and Tamil (mF1 = 0.448). These
results show that transformer-based models are effective in handling linguistic variety, non-native
scripts, and frequent code-switching. They also highlight the potential of multilingual pre-trained
models as strong starting points for Dravidian languages. We found that early stopping generally
improved training stability and robustness, but the benefits varied by language. Despite these
encouraging results, challenges remain in capturing subtle contextual cues, handling ambiguous or sarcastic
expressions, and managing limited annotated data. Future work can examine how to adapt
multilingual transformers for specific tasks, use data augmentation techniques to address resource shortages,
and implement cross-lingual transfer strategies that take advantage of structural similarities among
Dravidian languages. Additionally, incorporating external knowledge sources and developing explainable
methods will be crucial for ensuring transparency and cultural sensitivity in classification. Overall, this
work provides valuable resources and benchmarks while creating opportunities for more effective
and inclusive offensive language detection in underrepresented multilingual contexts.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Grammarly for grammar and
spelling checks and for paraphrasing and rewording. After using these tools, the authors reviewed and
edited the content as needed and take full responsibility for the publication’s content.</p>
      <p>[11] B. R. Chakravarthi, K. Priyadharshini, V. Muralidaran, J. P. McCrae, M. Arunmozhi, T. Mandl,
Findings of the shared task on offensive language identification in dravidian languages at fire
2021, in: Proceedings of the Forum for Information Retrieval Evaluation, 2021, pp. 32–43.</p>
      <p>[12] B. R. Chakravarthi, R. Priyadharshini, V. Muralidaran, N. Jose, S. Suryawanshi, E. Sherly, J. P.
McCrae, Dravidiancodemix: Sentiment analysis and offensive language identification dataset for
dravidian languages in code-mixed text, Language Resources and Evaluation 56 (2022) 765–806.</p>
      <p>[13] B. R. Chakravarthi, R. Priyadharshini, N. Jose, T. Mandl, P. K. Kumaresan, R. Ponnusamy, J. P.
McCrae, E. Sherly, et al., Findings of the shared task on offensive language identification in
tamil, malayalam, and kannada, in: Proceedings of the First Workshop on Speech and Language
Technologies for Dravidian Languages, 2021, pp. 133–145.</p>
      <p>[14] B. R. Chakravarthi, M. B. Jagadeeshan, V. Palanikumar, R. Priyadharshini, Offensive language
identification in dravidian languages using mpnet and cnn, International Journal of Information
Management Data Insights 3 (2023) 100151. URL: https://www.sciencedirect.com/science/article/
pii/S2667096822000945. doi:10.1016/j.jjimei.2022.100151.</p>
      <p>[15] S. Chanda, S. Pal, Irlab@iitbhu@dravidian-codemix-fire2020: Sentiment analysis for
dravidian languages in code-mixed text, in: Working Notes of FIRE (Forum for Information Retrieval
Evaluation), 2020, pp. 535–540.</p>
      <p>[16] S. Chanda, S. Ujjwal, S. Das, S. Pal, Fine-tuning pre-trained transformer based model for hate
speech and offensive content identification in english, indo-aryan and code-mixed (english-hindi)
languages, in: Working Notes of FIRE, 2021, pp. 446–458.</p>
      <p>[17] S. Chanda, S. Sheth, S. Pal, Coarse and fine-grained conversational hate speech and offensive
content identification in code-mixed languages using fine-tuned multilingual embedding, in:
Working Notes of FIRE, 2022, pp. 502–512.</p>
      <p>[18] S. Chanda, A. Mishra, S. Pal, Sarcasm detection in tamil and malayalam dravidian code-mixed
text, in: Working Notes of FIRE, 2023, pp. 336–343.</p>
      <p>[19] S. Chanda, S. Pal, The effect of stopword removal on information retrieval for code-mixed data
obtained via social media, volume 4, 2023, p. 494.</p>
      <p>[20] S. Chanda, K. Tewari, A. Mukherjee, S. Pal, Leveraging chatgpt and xlm-roberta for sarcasm
detection in dravidian code-mixed languages, in: Proceedings of FIRE (Working Notes), Forum for
Information Retrieval Evaluation, India, 2024. URL: https://ceur-ws.org/Vol-4054/T4-14.pdf.</p>
      <p>[21] S. Chanda, A. Mishra, S. Pal, Advancing language identification in code-mixed tulu texts:
Harnessing deep learning techniques, in: Working Notes of FIRE, 2023, pp. 223–230.</p>
      <p>[22] S. Chanda, A. Dhaka, S. Pal, Towards safer online spaces: Deep learning for hate speech detection
in code-mixed social media conversations, in: Companion Publication of the 16th ACM Web
Science Conference, 2024.</p>
      <p>[23] S. N, B. R. Chakravarthi, T. Durairaj, B. Bharathi, S. C. Navaneethakrishnan, P. K. Kumaresan,
A. M D, P. R. Hegde, D. Vikram, Overview of the shared task on offensive language identification
in dravidian code-mixed languages, in: Forum of Information Retrieval and Evaluation FIRE-2025,
2025.</p>
      <p>[24] A. M. D, D. Vikram, B. R. Chakravarthi, P. R. Hegde, Overcoming low-resource barriers in tulu:
Neural models and corpus creation for offensive language identification, 2025. URL: https://arxiv.
org/abs/2508.11166. arXiv:2508.11166.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sripriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. M D</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vikram</surname>
          </string-name>
          ,
          <article-title>Overview of the shared task on offensive language identification in dravidian code-mixed languages, in: Forum of Information Retrieval and Evaluation FIRE-2025</article-title>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Waseem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovy</surname>
          </string-name>
          ,
          <article-title>Hateful symbols or hateful people? predictive features for hate speech detection on twitter</article-title>
          ,
          <source>in: Proceedings of the NAACL Student Research Workshop</source>
          , Association for Computational Linguistics, San Diego, California,
          <year>2016</year>
          , pp.
          <fpage>88</fpage>
          -
          <lpage>93</lpage>
          . URL: https://aclanthology.org/N16-2013/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warmsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Macy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>
          ,
          <source>in: Proceedings of the International AAAI Conference on Web and Social Media</source>
          , volume
          <volume>11</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>512</fpage>
          -
          <lpage>515</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages</article-title>
          ,
          <source>in: Proceedings of the 11th Forum for Information Retrieval Evaluation (FIRE</source>
          <year>2019</year>
          ), Kolkata, India,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2020: Hate speech and offensive content identification in indo-european languages</article-title>
          ,
          <source>in: CEUR Workshop Proceedings, Forum for Information Retrieval Evaluation</source>
          , Hyderabad, India,
          <year>2020</year>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>111</lpage>
          . URL: https://ceur-ws.org/Vol-2517/T3-4.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bohra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vijay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <article-title>A dataset of hindi-english code-mixed social media text for hate speech detection</article-title>
          ,
          <source>in: Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>41</lpage>
          . URL: https://aclanthology.org/W18-5118/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mathur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ayyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Did you offend me? classification of offensive tweets in hinglish language</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>148</lpage>
          . URL: https://aclanthology.org/W18-5118/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ghosal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jonnalagadda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of hindi-english code-mixed social media text</article-title>
          ,
          <source>in: Proceedings of the 7th Workshop on South Asian Languages and Linguistics (WSAL2019)</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>U.</given-names>
            <surname>Barman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bandyopadhyay</surname>
          </string-name>
          ,
          <article-title>Code-mixing: A challenge for language identification in the language of social media</article-title>
          ,
          <source>in: Proceedings of The First Workshop on Computational Approaches to Code Switching</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>23</lpage>
          . URL: https://aclanthology.org/W14-3902/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Muralidaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Priyadharshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <article-title>Overview of the track on sentiment analysis for dravidian languages at fire 2020</article-title>
          ,
          <source>in: Proceedings of the Forum for Information Retrieval Evaluation</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>