<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Transformer-based Model for Detecting Multilingual Sarcasm in Social Media Posts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shraddha Chauhan</string-name>
          <email>shraddha76830@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhinav Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad</institution>
          ,
          <addr-line>Prayagraj, Uttar Pradesh, 211004</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology Allahabad</institution>
          ,
          <addr-line>Prayagraj, Uttar Pradesh, 211004</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapid growth of social media platforms such as Facebook, Instagram, LinkedIn, YouTube, and X has enabled people from different linguistic and cultural backgrounds to engage in global conversations. However, this multicultural digital landscape poses various challenges for sarcasm detection. Sarcasm is characterized by its use of irony to convey mockery and depends on contextual and cultural nuances that can vary dramatically across languages. Because sarcastic posts can invert the overall meaning of a phrase, accurate sarcasm detection systems are needed for multilingual text. Sarcasm detection has gained considerable attention, particularly for English; however, sarcasm detection in Dravidian languages such as Tamil and Malayalam remains significantly underdeveloped. These languages present unique challenges due to their rich morphology, agglutinative nature, and diverse syntactic structures. This paper aims to bridge this gap by exploring sarcasm detection in Dravidian languages, focusing on the challenges posed by code-mixing and dialectal variation. It examines three different transformer-based models, (i) Distil-mBERT, (ii) mBERT, and (iii) RoBERTa, to effectively capture the nuances of sarcasm in these languages. These transformer models detect subtle expressions of sarcasm by surmounting the challenges posed by code-switching and by incorporating cultural context, thereby enhancing the performance of sarcasm detection systems. The experimental results show the potential of transformers to achieve promising performance in multilingual sarcasm identification, paving the way for further research in this under-explored domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Sarcasm</kwd>
        <kwd>NLP</kwd>
        <kwd>Transformers</kwd>
        <kwd>Multilingual</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Social Media</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The arrival of social media has expanded communication across linguistic and cultural boundaries,
highlighting the need for advanced NLP tools capable of navigating this multilingual landscape [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2, 3</xref>
        ].
Among the innumerable challenges posed by this diversity, sarcasm detection stands out due to
its reliance on complicated contextual and cultural cues. Sarcasm employs irony to convey a meaning opposite
to the literal interpretation, making it difficult to detect in languages with rich cultural diversity [4, 5].
In the context of Dravidian languages, sarcasm detection becomes very complex because these languages
differ greatly from English. They are spoken mostly in southern India and have
unique linguistic features, structures, and cultural references that are not always straightforwardly
interpretable by standard NLP models. As a result, sarcasm expressed in these languages may be
misinterpreted by conventional sentiment analysis systems, leading to inaccuracies in understanding
and responding to user interactions [6].
      </p>
      <p>Most current methods for sarcasm detection depend on statistical and rule-based models,
leveraging linguistic and pragmatic features such as sentiment shifts, interjections, and
punctuation to identify sarcasm in posts. Various machine learning models have also been used in the literature
to detect sarcasm effectively in various languages [7]. Sarcasm detection has gained considerable attention,
particularly in English, but very little attention has been given to code-mixed multilingual
Dravidian languages. Sarcasm detection in text poses different challenges compared to other media
types such as images, videos, and speech, due to the absence of contextual information and indicators such
as tone and physical gestures [8]. This difficulty increases further when the text is multilingual, i.e.,
consists of more than one language. Code-mixed text can mix words, phrases, or clauses from different
languages. For example, “Edutha vecchathu nokki vaayikkunna pole undu dialoges... very bad trailer."
This comment mixes Malayalam and English. Its English translation is “The dialogues
feel like they’re just reading what they picked up... very bad trailer." The comment criticizes the quality
of the dialogue in the trailer, suggesting that it sounds unnatural or forced, leading to an overall
negative impression of the trailer. Humans can easily detect whether the comment’s literal meaning is
sarcastic because we understand the sentence’s context better than machine learning models do.</p>
      <p>Recent advancements in deep learning techniques [9] have drawn significant attention from
researchers due to their remarkable ability to detect sarcasm in social media posts. Despite the
importance of this task, only a small number of studies have explored deep learning models for
multilingual sarcasm detection in text. Social media plays a vital role in daily communication, with
platforms like Facebook and X heavily featuring images and videos. However, text remains the primary
mode of communication, despite its inherent limitations, such as the absence of non-verbal cues. Deep
learning models can learn hierarchical representations of language data, enabling
them to capture complex patterns and relationships that traditional models cannot. Therefore,
this work uses deep learning techniques for sarcasm detection. Among deep learning models,
Transformers have emerged as a robust architecture, with models like Distil-mBERT, mBERT, and
RoBERTa setting new benchmarks in different NLP tasks [10]. Pre-trained transformer-based
models for detecting multilingual sarcasm in social media posts require careful preprocessing of Tamil
and Malayalam text and fine-tuning on our dataset to adapt the pre-trained knowledge to the specific
nuances of sarcastic and non-sarcastic expressions in these languages [11]. Therefore, this paper
explores three different deep learning-based models, Distil-mBERT, mBERT, and RoBERTa, to identify
sarcastic social media posts among Tamil-English and Malayalam-English posts.</p>
      <p>The rest of the paper is organized as follows: Section 2 reviews related work on sarcasm identification,
and Section 3 discusses the proposed methodology. The results of the proposed models are presented in Section 4,
an error analysis on test data is presented in Section 5, and the paper is concluded in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>As sarcasm detection gains importance in natural language processing, various
approaches have been explored in the field, but Dravidian languages have not received sufficient attention
[12, 13, 14]. Most work to date has been done in English, but models that detect sarcasm well
in English do not work effectively in other languages such as Tamil and Malayalam
because of their distinct linguistic features. A new method is needed to detect
sarcasm in Dravidian languages accurately. Many methods rely mainly on rule-based systems and
lexicons, which are limited in capturing sarcasm’s complexity effectively. A number of conventional machine
learning models use bag-of-words and syntactic features to identify sarcastic expressions. However,
these methods also struggle with the contextual and complex nature of sarcasm.</p>
      <p>The introduction of deep learning and transformer-based models marked an important advancement
in sarcasm detection [15, 16, 17]. Models like mBERT have demonstrated significant improvements in
effectively understanding context and semantics. Their bidirectional attention mechanism allows them to better
grasp the irony and contradictions in sarcastic posts. Limited work has been reported in the literature
on identifying sarcastic posts in code-mixed social media text [18]. Models trained on English
datasets fail to generalize efficiently in multilingual settings due to linguistic and cultural differences.
Recent studies have addressed this gap by exploring sarcasm detection in languages like Arabic, Tamil,
Malayalam, and Marathi [19]. These studies highlight the importance of language-specific models and
the need for diverse training datasets to capture unique sarcasm patterns across different languages.</p>
      <p>Research on identifying Tamil and Malayalam code-mixed sarcastic posts is limited [20].
Existing work in sentiment analysis for these languages focuses on general sentiment classification
rather than sarcasm [21]. Many efforts have been made to develop Tamil and Malayalam sentiment analysis
tools and resources; however, these resources lack the granularity required for effective sarcasm
detection [22]. Pre-trained transformer models offer promising avenues for addressing
these challenges [23]. By adapting models like Distil-mBERT, mBERT, and RoBERTa to Dravidian
languages, we aim to handle contextual information and improve sarcasm detection.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Task Description</title>
        <p>The overall flow diagram of the proposed model can be seen in Figure 1. A detailed description of the
task, dataset statistics, and methodology is given in the subsequent subsections.
Tamil and Malayalam are under-resourced Dravidian languages, with few resources available
for multilingual sarcasm detection. Some data samples for Tamil-English and
Malayalam-English provided in the FIRE-2024 workshop, with their English translations, can be seen in Table
1. The task is to classify Tamil-English and Malayalam-English social media posts into sarcastic and
non-sarcastic classes.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dataset Description</title>
        <p>The dataset used to detect sarcasm is imbalanced and code-mixed in Tamil and Malayalam and was
collected from social media [24]. A post may contain more than one sentence, but the average post
length in the corpora is one sentence. Each post is labeled with a sentiment polarity (Sarcastic or Non-sarcastic).
The dataset provides a significant amount of data for both training and validation purposes.
For training, the Malayalam dataset consists of 13,188 samples, while the Tamil dataset
includes 29,570 samples. The validation set contains 2,826 samples for Malayalam and
6,336 samples for Tamil. The test set contains 2,826 samples for Malayalam and 6,338 samples
for Tamil. The overall data statistics for Tamil-English and Malayalam-English can be seen in Table 2.</p>
        <p>Three different pre-trained multilingual transformer-based models, (i) Distil-mBERT, (ii) mBERT, and
(iii) RoBERTa, were utilized to identify sarcastic social media posts. These models are pre-trained on
vast text corpora, such as Wikipedia, and achieve excellent performance when fine-tuned for a wide
range of downstream tasks.</p>
        <p>• Distil-mBERT is a smaller, faster, cheaper, and lighter version of BERT. It is highly effective in NLP
tasks, including sarcasm detection, and is trained with self-supervised learning,
which helps the model learn patterns, context, and word meanings that are critical
for detecting sarcasm. It uses a task called Masked Language Modeling, in which some words
are hidden and the model is trained to predict them from the surrounding context [25]. It is
bidirectional and captures the contextual information of words in a sentence, which helps it
recognize when a phrase is sarcastic. Fine-tuning Distil-mBERT on the dataset helps the
model learn sarcasm patterns such as irony, exaggeration, and tone shifts.</p>
        <p>• mBERT is pre-trained on massive amounts of text data using two key tasks: masked
language modeling and next sentence prediction [26]. It processes text in a deeply contextual
manner, allowing it to recognize hyperbole, irony, or contrast between expectations and reality.
For example, in the sentence “I love being ignored", mBERT can detect that the word ‘love’ is
used sarcastically because the context provided by “being ignored" negates the usual
positive meaning of ‘love’. The self-attention mechanism of mBERT allows the model to weigh
the importance of different words in a sentence when predicting the overall meaning. This helps
mBERT focus on the keywords or phrases that might signal sarcasm. Its deep contextual
understanding helps it capture nuances even in short texts. By fine-tuning mBERT on our
datasets and leveraging its bidirectional architecture, it can detect sarcasm in diverse forms,
from subtle irony to exaggerated statements, even in challenging multilingual and code-mixed
settings like social media posts.</p>
        <p>• RoBERTa stands for Robustly Optimized BERT Approach [27]. It uses self-attention, which helps
it focus on the parts of a sentence that matter most for identifying sarcasm.
It is pre-trained with a self-supervised approach and uses a masking technique during training
to make the model more robust. Sarcasm detection requires not only individual words but also their
contextual meaning in sentences. RoBERTa's bidirectional nature enables it to consider
both the preceding and following context of each word in a sentence, which is crucial for
sarcasm detection. It performs well on social-media-like text where informal language, slang,
and symbols abound. Its large-scale pretraining helps it adapt to nonstandard inputs such as emojis,
hashtags, and informal punctuation, and it handles noisy inputs and mixed modalities much better.</p>
        <p>Our work sets up a classification pipeline using the Ktrain library [28] and transformer-based
models (Distil-mBERT, mBERT, and RoBERTa). We then perform data preprocessing, initializing the
Ktrain text preprocessing transformer for the selected model. We set a maximum length
of 30 tokens per text input and convert texts and labels into the format that the model requires for training.
We trained each pre-trained transformer with the Adam optimizer, set it up for classification, and
trained all three models for 50 epochs. The learning rate is 5e-5, and the batch size is 32.</p>
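        <p>The pipeline described above can be sketched as follows. This is a hedged sketch, not the exact training script: the Hugging Face checkpoint names are assumptions (the paper does not state which hub models were used), and ktrain's one-cycle fit is used as the Adam-based training routine.</p>
        <preformat>
```python
# Hyperparameters as reported in Section 3.2.
HPARAMS = {"maxlen": 30, "batch_size": 32, "lr": 5e-5, "epochs": 50}

# Assumed checkpoints -- illustrative, not confirmed by the paper.
CHECKPOINTS = {
    "distil-mbert": "distilbert-base-multilingual-cased",
    "mbert": "bert-base-multilingual-cased",
    "roberta": "xlm-roberta-base",  # assumed multilingual RoBERTa variant
}

def train(model_key, x_train, y_train, x_val, y_val):
    """Fine-tune one transformer for binary sarcasm classification."""
    import ktrain                    # imported lazily: heavy dependency
    from ktrain import text

    t = text.Transformer(CHECKPOINTS[model_key],
                         maxlen=HPARAMS["maxlen"],
                         class_names=["Non-sarcastic", "Sarcastic"])
    trn = t.preprocess_train(x_train, y_train)   # tokenize + truncate to 30
    val = t.preprocess_test(x_val, y_val)
    learner = ktrain.get_learner(t.get_classifier(),
                                 train_data=trn, val_data=val,
                                 batch_size=HPARAMS["batch_size"])
    # fit_onecycle uses the Adam optimizer under a one-cycle LR schedule
    learner.fit_onecycle(HPARAMS["lr"], HPARAMS["epochs"])
    return ktrain.get_predictor(learner.model, preproc=t)
```
        </preformat>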
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Result</title>
      <p>To determine the effectiveness of the proposed transformer-based models, several evaluation metrics
are used: the confusion matrix, precision, recall, F1-score, and the ROC (Receiver Operating
Characteristic) curve together with its AUC (Area Under the ROC Curve).</p>
      <sec id="sec-4-1">
        <title>4.1. Evaluation</title>
        <p>• Confusion Matrix: describes the performance of a classification model by comparing
the actual labels with the predicted labels. The rows of the matrix represent the actual classes and
the columns represent the predicted classes. The diagonal elements represent
correct predictions, while the off-diagonal elements represent incorrect predictions made
by the model (see Table 6).</p>
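        <p>As a concrete illustration, the metrics derived from the binary confusion matrix can be computed with a few lines of plain Python (the counts here are hypothetical, not taken from the paper's results):</p>
        <preformat>
```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Precision, recall, and F1-score for the positive (sarcastic) class,
    derived from confusion-matrix counts."""
    precision = tp / (tp + fp)                 # P = TP / (TP + FP)
    recall = tp / (tp + fn)                    # R = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```
        </preformat>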
        <p>Precision (P) = TP / (TP + FP),&#160;&#160;Recall (R) = TP / (TP + FN),&#160;&#160;F1-score = 2 × (P × R) / (P + R),
where TP, FP, and FN denote the counts of true positives, false positives, and false negatives.</p>
        <p>The results of Tamil-English and Malayalam-English for Distil-mBERT, mBERT, and RoBERTa can
be seen in Tables 3, 4, and 5, respectively. The confusion matrices for the Distil-mBERT model on
Tamil-English and Malayalam-English can be seen in Figures 2 and 3, respectively. The confusion matrices for
the mBERT model on Tamil-English and Malayalam-English can be seen in Figures 4 and 5, respectively.
Similarly, the confusion matrices for the RoBERTa model on Tamil-English and Malayalam-English can
be seen in Figures 6 and 7, respectively. The ROC curves for the Distil-mBERT model on Tamil-English
and Malayalam-English can be seen in Figures 8 and 9, respectively. The ROC curves for the mBERT
model on Tamil-English and Malayalam-English can be seen in Figures 10 and 11, respectively. The
ROC curves for the RoBERTa model on Tamil-English and Malayalam-English can be seen in Figures 12
and 13, respectively. All the results reported here were obtained after the release of the labeled test
set to show the class-wise performance of the models.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Error Analysis</title>
      <p>Analyzing code-mixing and language switching in texts where the model predicts the wrong label can
provide deeper insights into the complexities that models face in real-world social media text. For
example, consider the text: “Shavakallarayile Kuzhimaadathile Peril Oru Letter marach pedich lalettan
ninnappo Onnu Nadungi".
Tokens: [‘Shavakallarayile’, ‘Kuzhimaadathile’, ‘Peril’, ‘Oru’, ‘Letter’, ‘marach’, ‘pedich’, ‘lalettan’,
‘ninnappo’, ‘Onnu’, ‘Nadungi’]
Language sequence of tokens: [‘tr’, ‘sw’, ‘id’, ‘de’, ‘no’, ‘pl’, ‘it’, ‘it’, ‘fi’, ‘fr’, ‘id’]
This is a non-sarcastic text, but our model misclassifies it as sarcastic. There are 9 language switches
in this text, and this frequent switching creates a highly dynamic context in which the model must
constantly adjust between different linguistic representations.</p>
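      <p>The switch count used in this analysis can be reproduced directly from the per-token language sequence: a switch is any position where the detected language differs from that of the preceding token.</p>
      <preformat>
```python
def count_switches(lang_seq):
    """Count positions where the detected language differs from the
    previous token's language (the 'language switches' discussed above)."""
    return sum(1 for a, b in zip(lang_seq, lang_seq[1:]) if a != b)

# Per-token language labels for the example text above.
seq = ['tr', 'sw', 'id', 'de', 'no', 'pl', 'it', 'it', 'fi', 'fr', 'id']
```
      </preformat>
      <p>Applied to the example sequence, this yields the 9 switches cited above; only the adjacent pair ‘it’/‘it’ does not switch.</p>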
      <p>In the error analysis of the three transformer models on the code-mixed sarcasm detection task,
mBERT, Distil-mBERT, and RoBERTa perform differently because of how they handle
code-mixing, language switching, and emojis. mBERT handles language
switching well but is not robust to informal components such as the emojis and slang typically found
on social media platforms such as X. RoBERTa performs very well on informal language
and social media text but poorly on code-mixed data due to its largely monolingual
pretraining. Though Distil-mBERT is efficient, it is less able to handle code-mixing and mixed
modalities together, which lowers its sarcasm detection accuracy. In summary, the models face
varied challenges with the complexities of social media text, especially frequent language
switching and the use of emojis.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Motivated by the increasing popularity of social media and the limited work on Dravidian code-mixed sarcasm
identification, we developed a framework for sarcasm detection in two code-mixed Dravidian corpora,
Malayalam and Tamil. Our approach of fine-tuning multilingual transformer models
achieved competitive scores in DravidianCodeMix@FIRE-2024.
This paper presents a comparative study of three transformer-based models, Distil-mBERT, mBERT,
and RoBERTa, for detecting sarcasm in two Dravidian languages: Malayalam and Tamil. For sarcasm
detection in Tamil, RoBERTa outperforms the other two models, with accuracy and macro-averaged
precision, recall, and F1-score of 80%, 74%, 73%, and 73%, respectively. For sarcasm detection in
Malayalam, mBERT outperforms the other two models, with accuracy and macro-averaged
precision, recall, and F1-score of 86%, 77%, 68%, and 72%, respectively.</p>
      <p>Our findings demonstrate that RoBERTa performs well on the Tamil dataset but struggles with the
Malayalam dataset. For Malayalam, mBERT proved more effective, outperforming the other
models, whereas on Tamil, mBERT achieved competitive performance. This variation highlights the
challenges presented by the intricacies of social media text, where frequent language switching, the
use of emojis, and code-mixing add layers of complexity. Notably, our dataset contains predominantly
monolingual text, with only a few instances of code-mixing. These findings emphasize the importance
of optimizing models for specific languages to handle the nuanced demands of diverse real-world
code-mixed datasets.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[3] A. Kumar, J. P. Singh, N. P. Rana, Y. K. Dwivedi, Multi-channel convolutional neural network for the identification of eyewitness tweets of disaster, Information Systems Frontiers 25 (2023) 1589–1604.</p>
      <p>[4] R. Pandey, A. Kumar, J. P. Singh, S. Tripathi, Hybrid attention-based long short-term memory network for sarcasm identification, Applied Soft Computing 106 (2021) 107348.</p>
      <p>[5] R. Pandey, A. Kumar, J. P. Singh, S. Tripathi, A hybrid convolutional neural network for sarcasm detection from multilingual social media posts, Multimedia Tools and Applications (2024) 1–29.</p>
      <p>[6] T. Yue, X. Shi, R. Mao, Z. Hu, E. Cambria, SarcNet: A multilingual multimodal sarcasm detection dataset, in: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024, pp. 14325–14335.</p>
      <p>[7] Y. Kumar, N. Goel, AI-based learning techniques for sarcasm detection of social media tweets: State-of-the-art survey, SN Computer Science 1 (2020) 318.</p>
      <p>[8] G. H. Aleryani, W. Deabes, K. Albishre, A. E. Abdel-Hakim, Impact of emoji exclusion on the performance of Arabic sarcasm detection models, 2024. URL: https://arxiv.org/abs/2405.02195. arXiv:2405.02195.</p>
      <p>[9] S. Lakshmi, Sarcasm detection using deep learning in natural language processing, in: D. J. Hemanth (Ed.), Computational Intelligence Methods for Sentiment Analysis in Natural Language Processing Applications, Morgan Kaufmann, 2024, pp. 187–205. doi:10.1016/B978-0-443-22009-8.00013-6.</p>
      <p>[10] M. A. Galal, A. Hassan Yousef, H. H. Zayed, W. Medhat, Arabic sarcasm detection: An enhanced fine-tuned language model approach, Ain Shams Engineering Journal 15 (2024) 102736. doi:10.1016/j.asej.2024.102736.</p>
      <p>[11] E. Hashmi, S. Y. Yayilgan, S. Shaikh, Augmenting sentiment prediction capabilities for code-mixed tweets with multilingual transformers, Social Network Analysis and Mining 14 (2024) 86.</p>
      <p>[12] R. Bhukya, S. Vodithala, Deep learning based sarcasm detection and classification model, Journal of Intelligent &amp; Fuzzy Systems (2024) 1–14.</p>
      <p>[13] Y. Liu, M. Chi, Q. Sun, Sarcasm detection in hotel reviews: a multimodal deep learning approach, Journal of Hospitality and Tourism Technology (2024).</p>
      <p>[14] C. Thaokar, J. K. Rout, M. Rout, N. K. Ray, N-gram based sarcasm detection for news and social media text using hybrid deep learning models, SN Computer Science 5 (2024) 163.</p>
      <p>[15] A. Nandi, K. Sarkar, A. Mallick, A. De, A survey of hate speech detection in Indian languages, Social Network Analysis and Mining 14 (2024) 70.</p>
      <p>[16] J. Dai, A BERT-based with fuzzy logic sentimental classifier for sarcasm detection, in: 2024 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE), 2024, pp. 1275–1280. doi:10.1109/ICAACE61206.2024.10548550.</p>
      <p>[17] M. Amal, R. Boujelbane, M. Ellouze, ANLP RG at StanceEval2024: Comparative evaluation of stance, sentiment and sarcasm detection, in: N. Habash, H. Bouamor, R. Eskander, N. Tomeh, I. Abu Farha, A. Abdelali, S. Touileb, I. Hamed, Y. Onaizan, B. Alhafni, W. Antoun, S. Khalifa, H. Haddad, I. Zitouni, B. AlKhamissi, R. Almatham, K. Mrini (Eds.), Proceedings of The Second Arabic Natural Language Processing Conference, Association for Computational Linguistics, Bangkok, Thailand, 2024, pp. 788–793. doi:10.18653/v1/2024.arabicnlp-1.90.</p>
      <p>[18] M. E. Hassan, M. Hussain, I. Maab, U. Habib, M. A. Khan, A. Masood, Detection of sarcasm in Urdu tweets using deep learning and transformer based hybrid approaches, IEEE Access 12 (2024) 61542–61555. doi:10.1109/ACCESS.2024.3393856.</p>
      <p>[19] A. Ameur, S. Hamdi, S. B. Yahia, Domain adaptation approach for Arabic sarcasm detection in hotel reviews based on hybrid learning, Procedia Computer Science 225 (2023) 3898–3908.</p>
      <p>[20] H. Ghous, M. H. Malik, J. Altaf, S. Nayab, I. Sehrish, S. A. Nawaz, Navigating sarcasm in multilingual text: An in-depth exploration and evaluation, Journal of Computing &amp; Biomedical Informatics (2024).</p>
      <p>[21] A. Rawat, S. Kumar, S. S. Samant, Hate speech detection in social media: Techniques, recent trends, and future challenges, Wiley Interdisciplinary Reviews: Computational Statistics 16 (2024) e1648.</p>
      <p>[22] L. S. Kumar, A. Hegde, B. R. Chakravarthi, H. Shashirekha, R. Natarajan, S. Thavareesan, R. Sakuntharaj, T. Durairaj, P. K. Kumaresan, C. Rajkumar, Overview of second shared task on sentiment analysis in code-mixed Tamil and Tulu, in: Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, 2024, pp. 62–70.</p>
      <p>[23] O. Nimase, S. Hong, When do "more contexts" help with sarcasm recognition?, 2024. URL: https://arxiv.org/abs/2403.12469. arXiv:2403.12469.</p>
      <p>[24] B. R. Chakravarthi, S. N, B. B, N. K, T. Durairaj, R. Ponnusamy, P. K. Kumaresan, K. K. Ponnusamy, C. Rajkumar, Overview of sarcasm identification of Dravidian languages in DravidianCodeMix@FIRE-2024, in: Forum of Information Retrieval and Evaluation FIRE-2024, DAIICT, Gandhinagar, 2024.</p>
      <p>[25] V. Sanh, L. Debut, J. Chaumond, T. Wolf, DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, ArXiv abs/1910.01108 (2019).</p>
      <p>[26] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</p>
      <p>[27] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.</p>
      <p>[28] A. S. Maiya, ktrain: A low-code library for augmented machine learning, 2022. URL: https://arxiv.org/abs/2004.10703. arXiv:2004.10703.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Explainable bert-lstm stacking for sentiment analysis of covid-19 vaccination</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . doi:10.1109/TCSS.2023.3329664.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Deep neural networks for location reference identification from bilingual disaster-related tweets</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>880</fpage>
          -
          <lpage>891</lpage>
          . doi:10.1109/TCSS.2022.3213702.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>