=Paper=
{{Paper
|id=Vol-3681/T5-5
|storemode=property
|title=Unmasking Sarcasm: Sarcastic Language Detection with BiLSTMs
|pdfUrl=https://ceur-ws.org/Vol-3681/T5-5.pdf
|volume=Vol-3681
|authors=Anusha M D,Parameshwar R. Hegde
|dblpUrl=https://dblp.org/rec/conf/fire/DH23
}}
==Unmasking Sarcasm: Sarcastic Language Detection with BiLSTMs==
Unmasking Sarcasm: Sarcastic Language Detection with BiLSTMs

Anusha M D1,∗,†, Parameshwar R. Hegde1,†
1 Department of Computer Science, Yenepoya Institute of Arts Science Commerce and Management, Yenepoya (Deemed to be University), Balmata, Mangalore

Abstract
Across the globe, there is a clear and growing tendency to incorporate sarcasm into everyday interactions, particularly on social media and the Internet. Sarcasm poses a significant challenge for sentiment analysis systems because it communicates opinions indirectly, often deviating from literal meanings. There is therefore a growing demand for detecting sarcasm and sentiment in code-mixed social media content written in Dravidian languages. This paper describes a Bidirectional Long Short-Term Memory (BiLSTM) model submitted to "Sarcasm Identification of Dravidian Code-Mixed@FIRE-2023" to analyze sarcasm in Malayalam-English (Ma-En) and Tamil-English (Ta-En) code-mixed texts. The model obtained weighted F1-scores of 0.58 and 0.63 for the Ta-En and Ma-En pairs respectively.

Keywords
Bi-LSTM, Code-Mixed, Deep Learning, Dravidian Languages, Sarcasm

1. Introduction
The rise of social media over the past ten years has united all regions of the world into a central hub for communication [1]. Social media is one of the most popular channels for people to share information and voice their opinions, and many governments and corporations use these data to measure public opinion on various goods, entertainment options, and political issues. Sarcasm is a sophisticated way of conveying emotions in which the speaker expresses the opposite of what they mean. It is often characterized as ironic or satirical and is used to insult, mock, or amuse. Sarcasm takes on various definitions in different dictionaries based on their perspectives [2].
As outlined in the Macmillan English Dictionary (http://www.macmillandictionary.com/), sarcasm involves "expressing the opposite of one's true meaning verbally or in writing, often with the intention of making someone appear foolish or revealing anger." According to The Random House Dictionary (http://www.thefreedictionary.com/), sarcasm entails "biting or harsh mockery or irony" and "a cutting taunt or scornful comment." The Collins English Dictionary (https://www.collinsdictionary.com/) characterizes it as "language that mocks, expresses contempt, or uses irony to convey insults or disdain." Merriam-Webster (http://www.merriam-webster.com/) presents another interpretation, describing sarcasm as "a form of satirical wit that relies on acerbic, biting, and frequently ironic language, typically aimed at an individual."

∗ Corresponding author. † These authors contributed equally.
anushamd@yenpoya.edu.in (A. M. D); parameshwarhegde@yenepoya.edu.in (P. R. Hegde)
ORCID: 0009-0000-3644-1260 (A. M. D)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Several online social networking platforms enable users to share and access posts expressing their views on various subjects such as products, politics, stock markets, and entertainment [3]. Users often compose messages with intricate sentence structures, posing challenges for both machines and humans to grasp the intended meaning. Consequently, sentiment analysis, along with practical applications such as sarcasm detection, has emerged as a prominent trend in the field of data mining.
Based on the analysis in Pranali Chaudhari's [3] studies, it can be inferred that several distinctive features of text play a significant role in the identification of sarcasm, including lexical elements, hyperbole, and pragmatic cues [4]. Various methodologies have been employed to detect sarcasm. In recent years, most research has leveraged supervised or semi-supervised machine learning approaches [5][6][3]. Additionally, novel strategies [7][8], behavioral approaches [9], and bootstrapping techniques [10] have been applied to sarcasm detection. On the other hand, traditional investigations, such as those by Davidov et al. [11] and Riloff et al. [10], employed rule-based methods to address sarcasm detection. More recent research [12] has shifted towards deep learning techniques that automatically learn the discriminatory features. In this study, a deep learning-based Bidirectional Long Short-Term Memory (BiLSTM) neural network (NN) is employed. It consists of LSTM units that integrate past and future context information, because of which BiLSTMs show excellent performance on sequential modelling problems as well as on Text Classification (TC) [13]. NN models expect numeric input, so the text data must be converted to a numeric representation by building an embedding layer before building the BiLSTM model.

2. Literature Review
The sarcasm identification task has been studied using different methods, including lexicon-based, machine learning, deep learning, and hybrid approaches. Some research works related to sarcasm detection are discussed below.

Ibrahim Abu-Farha et al. [14] introduced ArSarcasm, a dataset designed for detecting sarcasm in the Arabic language.
This dataset was constructed by reevaluating existing Arabic sentiment analysis datasets, resulting in a collection of 10,547 tweets, approximately 16% of which were identified as containing sarcasm. Beyond sarcasm, the dataset was also annotated for sentiment and dialect characteristics. The analysis highlights the highly subjective nature of these tasks, as indicated by the varying sentiment labels influenced by annotators' biases. Experiments demonstrate the limitations of state-of-the-art sentiment analyzers when confronted with sarcastic content. Finally, a deep learning model using BiLSTM was trained for sarcasm detection and achieved an F1-score of 0.46.

Priya Goel et al. [15] aim to narrow the gap between human and machine intelligence in recognizing and comprehending sarcasm in online behavior and patterns. The study utilizes neural techniques such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a baseline Convolutional Neural Network (CNN) in an ensemble model designed to detect sarcasm on the internet. The study used the News Headlines and Reddit datasets; to enhance the accuracy of the proposed model, the datasets are prepared using pre-trained word embedding models such as fastText, Word2Vec, and GloVe, whose performance is compared. The objective is to assess the overall sentiment of the writer as either positive or negative, and whether the text is sarcastic or not, to ensure the intended message is correctly conveyed to the audience. The final findings indicate that the proposed ensemble model, when combined with word embeddings, outperformed the other state-of-the-art and deep learning models considered in the study.
It achieved an accuracy rate of approximately 96% on the News Headlines dataset and 73% on the Reddit dataset; among the proposed ensemble models, the Weighted Average Ensemble obtained the highest accuracy, at around 99% and 82% on the two datasets respectively. The use of an ensemble model significantly improved the stability, precision, and predictive capabilities of the proposed approach.

Abdelkader El Mahdaouy et al. [16] proposed an end-to-end deep Multi-Task Learning (MTL) model for Sentiment Analysis (SA) and sarcasm detection. The ArSarcasm shared task consists of two subtasks for sarcasm detection and SA in the Arabic language. The study leverages MARBERT's contextualized word embeddings with a multi-task attention interaction module. The MTL model's architecture consists of a Bidirectional Encoder Representations from Transformers (BERT) model, a multi-task attention interaction module, and two task classifiers. The aim is to allow task interaction and knowledge sharing between SA and sarcasm detection. The model shows very promising results on both subtasks.

Mudoor Devadas Anusha et al. [13] proposed a Bidirectional Long Short-Term Memory (BiLSTM) model submitted to "Sentiment Analysis of Dravidian Languages in Code-Mixed Text" to analyze the sentiments in Kannada-English (Kn-En), Malayalam-English (Ma-En), and Tamil-English (Ta-En) code-mixed texts. In the proposed approach, code-mixed word embeddings are constructed using the training set of the respective code-mixed language pairs, and these embeddings are used to build a Deep Learning (DL) model based on BiLSTM. The proposed model obtained weighted F1-scores of 0.563, 0.604, and 0.365 for the code-mixed Ta-En, Ma-En, and Kn-En language pairs respectively.

C. I. Eke et al. [17] introduce a context-driven feature methodology for sarcasm identification, employing a combination of the BERT model, deep learning, and conventional machine learning to tackle the aforementioned challenges.
It leverages two benchmark datasets, namely Twitter and the Internet Argument Corpus, version two (IAC-v2), for classification across three learning models. The first model employs an embedding-based approach within a deep learning framework, utilizing Bidirectional Long Short-Term Memory (Bi-LSTM), a variant of Recurrent Neural Network (RNN), and Global Vector representation (GloVe) for word embedding and context comprehension. The second model is founded on the Transformer architecture, employing a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model. In contrast, the third model adopts feature fusion, incorporating BERT features, sentiment-related features, syntactic features, and GloVe embedding features within a conventional machine learning framework. Comprehensive evaluation experiments were conducted to assess the effectiveness of this methodology. Remarkably, when applied to the two benchmark datasets, the technique achieved the highest precision of 98.5% and 98.0%, respectively.

3. Methodology

3.1. Pre-processing
To enhance the classifier's performance, it is essential to undertake data preprocessing aimed at eliminating unwanted noise. Text preprocessing procedures may differ depending on the task and the dataset used. The following are the preprocessing steps employed in the proposed approach:
• Converting all text to lowercase, as character case is irrelevant for TC.
• Removing numeric and punctuation characters, as they hold no importance for TC.
• Label encoding, which converts class/category labels into numerical values to make them machine-readable.

3.2. Feature Engineering
The next phase converts the preprocessed text into vectors. First, a dictionary is created in which words from the text are mapped to their respective word embeddings (vectors). Each word's vector representation is then taken from the 'ftw2v' model, which is used as the pre-trained word embedding model. The remaining parameter is the dimension of the word embeddings.
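The preprocessing steps of Section 3.1 and the embedding lookup of Section 3.2 can be sketched as follows. The sample comments, the vocabulary construction, `MAX_LEN`, and the random stand-in for the pre-trained 'ftw2v' vectors are illustrative assumptions, not details taken from the paper:

```python
import re
import numpy as np

EMBEDDING_DIM = 300   # embedding dimension stated in the paper
MAX_LEN = 10          # assumed fixed comment length for padding

def preprocess(text: str) -> str:
    """Section 3.1: lowercase, then drop numeric and punctuation characters."""
    return re.sub(r"[^a-z\s]", " ", text.lower())

# Toy stand-ins for code-mixed YouTube comments and their labels
comments = ["Enna oru padam!!! 100% waste", "Nalla movie, worth watching"]
labels = ["Sarcastic", "Non-Sarcastic"]

clean = [preprocess(c) for c in comments]

# Label encoding: map class names to integers
classes = sorted(set(labels))                 # ['Non-Sarcastic', 'Sarcastic']
y = [classes.index(label) for label in labels]

# Build a word index over the (toy) training vocabulary; 0 is reserved for padding
vocab = {w: i + 1 for i, w in enumerate(sorted({w for c in clean for w in c.split()}))}

def encode(text: str) -> list:
    """Integer-encode a comment and pad/truncate it to MAX_LEN."""
    ids = [vocab[w] for w in text.split() if w in vocab]
    return ids[:MAX_LEN] + [0] * max(0, MAX_LEN - len(ids))

X = np.array([encode(c) for c in clean])

# Section 3.2: dictionary-style embedding matrix; row i holds the vector of
# word i. Random vectors stand in here for the real pre-trained 'ftw2v' lookups.
rng = np.random.default_rng(0)
embedding_matrix = np.zeros((len(vocab) + 1, EMBEDDING_DIM))
for word, idx in vocab.items():
    embedding_matrix[idx] = rng.normal(size=EMBEDDING_DIM)  # ftw2v[word] in practice
```

In the real pipeline, `X` and `embedding_matrix` feed the embedding layer of the BiLSTM model described in Section 3.3.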
This dimension is set to 300, indicating that each word is represented as a vector in a 300-dimensional space.

3.3. Model Construction
BiLSTM is the neural network model used here for text classification, comprising an embedding layer, a bidirectional LSTM layer, and an output layer. The model is initialized as a sequential neural network, a linear stack of layers. The embedding layer converts integer-encoded words into dense vectors of fixed size; these vectors are trainable during model training and capture semantic relationships between words. Spatial dropout helps prevent overfitting by randomly setting a fraction of input units to zero at each update during training. The benefit of a bidirectional LSTM is that it processes sequences in both directions, capturing information from past and future contexts, which is particularly useful for understanding the context of words in a sentence. Sigmoid activation is used for the binary classification task: it assigns probabilities to each class, and the class with the highest probability is predicted as the output. The output dimension of the model is configured based on the number of class labels. The structure of the BiLSTM model is shown in Figure 1. Overall, this neural network architecture is designed to effectively process and classify text data into two classes.

Figure 1: Structure of the BiLSTM Model

Table 1: Distribution of labels in the given dataset
Language-Pair | Labels | Train Set | Test Set | Validate Set
Malayalam-English | Sarcastic | 2259 | 588 | 2263
Malayalam-English | Non-Sarcastic | 9798 | 6186 | 2427
Tamil-English | Sarcastic | 7170 | 1820 | 2264
Tamil-English | Non-Sarcastic | 19866 | 4939 | 6186

Table 2: Results of the proposed models on the Development sets
Language-Pair | Precision | Recall | F1-score
Malayalam-English | 0.83 | 0.81 | 0.82
Tamil-English | 0.78 | 0.77 | 0.77
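The layer stack described in Section 3.3 can be sketched in Keras as below. The number of LSTM units, the dropout rate, the vocabulary size, and the padded length are illustrative assumptions; only the 300-dimensional embeddings, the layer order, and the sigmoid output follow the paper's description:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SpatialDropout1D, Bidirectional, LSTM, Dense

VOCAB_SIZE = 20000    # assumed vocabulary size
EMBEDDING_DIM = 300   # embedding dimension stated in the paper
MAX_LEN = 50          # assumed padded comment length

model = Sequential([
    # Integer-encoded words -> dense, trainable vectors
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),
    # Spatial dropout to reduce overfitting
    SpatialDropout1D(0.2),
    # Bidirectional LSTM: reads the sequence left-to-right and right-to-left
    Bidirectional(LSTM(64)),
    # Sigmoid output for the binary Sarcastic / Non-Sarcastic decision
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# A forward pass on dummy input yields one probability per comment
probs = model.predict(np.zeros((2, MAX_LEN), dtype="int32"), verbose=0)
```

Training would then call `model.fit` on the padded, integer-encoded comments and their encoded labels.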
4. Experimental Setup and Results
The dataset [18] provided by the organizers of the Sarcasm Identification task for the Dravidian languages Malayalam and Tamil in DravidianCodeMix (https://codalab.lisn.upsaclay.fr/competitions/13540#participate-get-data) contains Train, Development, and Test sets. The task involves Tamil-English and Malayalam-English datasets containing YouTube video comments. The comments are predominantly composed in both the native script and Roman script, featuring either Tamil/Malayalam grammar alongside English vocabulary or English grammar combined with Tamil/Malayalam vocabulary. Table 1 shows the distribution of labels in the given dataset. Additionally, it is noteworthy that the training dataset exhibits a significant class imbalance, as depicted in Figures 2 and 3, which may impact the model's performance and result interpretation.

Scikit-learn (https://scikit-learn.org/) and Keras (https://keras.io/api/layers/), which are widely used Python libraries for machine learning and deep learning, were employed for implementing the Python code. A BiLSTM model with word embedding features was applied to the Test set of both language pairs. The model obtained weighted F1-scores of 0.68 and 0.63 for the Ta-En and Ma-En pairs respectively. The outcomes of the experiments on the test dataset have been made available on the "Sarcasm Identification" task page (https://codalab.lisn.upsaclay.fr/competitions/13540#learn_the_details-results). Furthermore, Table 2 displays the performance of the proposed approach on the Development sets of the Ta-En and Ma-En language pairs.

Figure 2: Imbalance Distribution of Ta-En Train set
Figure 3: Imbalance Distribution of Ma-En Train set

5. Conclusion
This paper presents the details of the proposed operational model for detecting sarcasm in code-mixed text written in Malayalam and Tamil. The results of this model have been submitted to "Sarcasm Identification of Dravidian Languages (Malayalam and Tamil) in DravidianCodeMix", a shared task organized by DravidianLangTech.
To address the challenge of categorizing YouTube video comments into predefined categories, this study introduces a BiLSTM model that utilizes word embeddings as its features. The proposed model achieved F1-scores of 0.563 and 0.604 for the Ta-En and Ma-En language pairs respectively. Future work aims to enlarge the existing dataset and leverage advanced technologies to enhance the model's performance.

References
[1] B. R. Chakravarthi, A. Hande, R. Ponnusamy, P. K. Kumaresan, R. Priyadharshini, How can we detect homophobia and transphobia? Experiments in a multilingual code-mixed setting for social media governance, International Journal of Information Management Data Insights 2 (2022) 100119.
[2] B. R. Chakravarthi, N. Sripriya, B. Bharathi, K. Nandhini, S. Chinnaudayar Navaneethakrishnan, T. Durairaj, R. Ponnusamy, P. K. Kumaresan, K. K. Ponnusamy, C. Rajkumar, Overview of the shared task on sarcasm identification of Dravidian languages (Malayalam and Tamil) in DravidianCodeMix, in: Forum of Information Retrieval and Evaluation FIRE - 2023, 2023.
[3] P. Chaudhari, C. Chandankhede, Literature survey of sarcasm detection, in: 2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), IEEE, 2017, pp. 2041–2046.
[4] F. Kunneman, C. Liebrecht, M. Van Mulken, A. Van den Bosch, Signaling sarcasm: From hyperbole to hashtag, Information Processing & Management 51 (2015) 500–509.
[5] E. Fersini, F. A. Pozzi, E. Messina, Detecting irony and sarcasm in microblogs: The role of expressive signals and ensemble classifiers, in: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), IEEE, 2015, pp. 1–8.
[6] D. Bamman, N. Smith, Contextualized sarcasm detection on Twitter, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 9, 2015, pp. 574–577.
[7] P. Liu, W. Chen, G. Ou, T. Wang, D. Yang, K. Lei, Sarcasm detection in social media based on imbalanced classification, in: Web-Age Information Management: 15th International Conference, WAIM 2014, Macau, China, June 16-18, 2014. Proceedings 15, Springer, 2014, pp. 459–471.
[8] F. Barbieri, H. Saggion, F. Ronzano, Modelling sarcasm in Twitter, a novel approach, in: Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2014, pp. 50–58.
[9] A. Rajadesingan, R. Zafarani, H. Liu, Sarcasm detection on Twitter: A behavioral modeling approach, in: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 2015, pp. 97–106.
[10] E. Riloff, A. Qadir, P. Surve, L. De Silva, N. Gilbert, R. Huang, Sarcasm as contrast between a positive sentiment and negative situation, in: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013, pp. 704–714.
[11] D. Davidov, O. Tsur, A. Rappoport, Semi-supervised recognition of sarcasm in Twitter and Amazon, in: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, 2010, pp. 107–116.
[12] S. Poria, E. Cambria, D. Hazarika, P. Vij, A deeper look into sarcastic tweets using deep convolutional neural networks, arXiv preprint arXiv:1610.08815 (2016).
[13] M. D. Anusha, H. L. Shashirekha, BiLSTM-sentiments analysis in code-mixed Dravidian languages (2021).
[14] I. A. Farha, W. Magdy, From Arabic sentiment analysis to sarcasm detection: The ArSarcasm dataset, in: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, 2020, pp. 32–39.
[15] P. Goel, R. Jain, A. Nayyar, S. Singhal, M. Srivastava, Sarcasm detection using deep learning and ensemble learning, Multimedia Tools and Applications 81 (2022) 43229–43252.
[16] A. E. Mahdaouy, A. E. Mekki, K. Essefar, N. E. Mamoun, I. Berrada, A. Khoumsi, Deep multi-task model for sarcasm detection and sentiment analysis in Arabic language, arXiv preprint arXiv:2106.12488 (2021).
[17] C. I. Eke, A. A. Norman, L. Shuib, Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and BERT model, IEEE Access 9 (2021) 48501–48518.
[18] B. R. Chakravarthi, Hope speech detection in YouTube comments, Social Network Analysis and Mining 12 (2022) 75.