 A Study in Practical Solutions to Sarcasm Detection with Machine Learning and
                       Knowledge Engineering Techniques

Chia Zheng Lin*, Michal Ptaszynski*, Masui Fumito*, Gniewosz Leliwa**, Michal Wroczynski**
                     *Graduate School of Computer Science, Kitami Institute of Technology, Japan
                                              **Samurai Labs, Poland
{chiazhenglin}@gmail.com, {ptaszynski,f-masui}@cs.kitami-it.ac.jp, {gniewosz.leliwa, michal.wroczynski}@samurailabs.ai


                            Abstract

In this paper we tackle the problem of sarcasm detection with the use of machine learning and knowledge engineering techniques. Sarcasm detection is considered a complex and challenging task in Natural Language Processing and has been studied by various researchers in the past decade. To get a grasp on the present state of the art in sarcasm detection, we review the important previous research in this field, with a focus on text-based sarcasm detection in English texts. In the proposed method, we compare various dataset preprocessing techniques on the proposed Deep Convolutional Neural Network model. As a result, the most specific, or least preprocessed, dataset ranked as the one with the highest performance. However, we observed that some level of data preprocessing could become useful in the task of sarcasm detection.

                         Introduction

Sarcasm, often used together with or interchangeably with irony, is considered an important component of human communication, recognized as one of the most prominent and pervasive devices of figurative and creative language, used widely from ancient religious texts to modern times (Ghosh and Veale 2017).

Van Hee (2017) pointed out the important implications of irony and sarcasm for Natural Language Processing (NLP) tasks, which aim to explain the constructs of human language, as well as their large potential in the domain of text mining. In recent years there has been increasing interest in automatic sarcasm detection and classification in particular, which has been widely studied as a type of sentiment analysis task (detecting whether a sentence conveys a positive or a negative connotation, or in this case: sarcastic or non-sarcastic). Notably, Kumar et al. (2017) surveyed representative work in the area and categorized most of the popular approaches into three types, namely rule-based, statistical, and deep learning-based approaches. We analyse some of that research in the next section.

Researchers' interest in analysing this profound type of figurative and creative use of language grew along with the dramatic increase in the everyday use of social media over the past decade. Twitter in particular has become one of the most popular venues for people to express their opinions, share their thoughts, report real-time events, etc. Moreover, the huge amount of data has drawn the interest of companies studying people's opinions towards different products, facilities and events. It has been suggested that the nature of tweets makes them the most suitable material for studying sarcasm detection approaches (Bouazizi and Otsuki 2016).

However, the lack of empirical investigations into optimal approaches for sarcasm detection is a serious oversight in many related studies carried out throughout the years. Importantly, there have been no studies comparing how differences in the preprocessing and manipulation of the dataset influence detection results.

To contribute to dealing with the above-mentioned problems, in this paper we investigate the variations in sarcasm detection results caused by differences in applied preprocessing techniques, typically used in NLP research but not applied before in works focusing on sarcasm detection. To do that most effectively, we first review previous related research on text-based sarcasm detection from English tweets, describe the implemented dataset preprocessing techniques, and discuss the results of an experiment performed to compare the preprocessing techniques applied to the dataset. As a result, we were able to observe the impact contributed by hashtags and labels related to sarcasm.

Finally, Ptaszynski et al. (2010), in their research on developing an expert system for Internet Patrol, pointed out that, especially with regard to the increased popularity of SNS, sarcasm has often been used in personal attacks, such as cyberbullying, and concluded that sarcasm detection is one of the important problems in cyberbullying detection. Therefore, as one of its practical applications, in this research we verify how effective sarcasm detection is in the detection of cyberbullying.

Copyright © 2020 held by the author(s). In A. Martin, K. Hinkelmann, H.-G. Fill, A. Gerber, D. Lenat, R. Stolle, F. van Harmelen (Eds.), Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020). Stanford University, Palo Alto, California, USA, March 23-25, 2020. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

                      Research Background

The word sarcasm originates from the Ancient Greek word sarkasmós, meaning "to tear flesh, bite the lip in rage, sneer." According to the Oxford dictionary (2019), sarcasm is a
way of using words that are the opposite of what one means in order to be unpleasant to somebody or to make fun of them. It also describes irony as the use of words that say the opposite of what you really mean, often as a joke.

The relationship between irony and sarcasm has been confused in many studies. In the literature, two types of irony are widely considered: verbal irony and situational irony. While situational irony involves an incongruence between two situations, verbal irony, although applying verbal, or semantic, incongruence, is a statement in which the meaning that a speaker employs is sharply different from the meaning that is ostensibly expressed. Hence, verbal irony is considered different from situational irony in that it is produced intentionally by the speaker.

When it comes to sarcasm, Van Hee (2017) defines it as a form of verbal irony that has an aggressive tone, is directed at someone or something, and is used intentionally. Hence the terms "irony" and "sarcasm" are used interchangeably in many related studies. In this study, we decided not to focus on distinguishing between sarcasm and irony, and instead use the general term "sarcasm" throughout the paper.

Previous Research

The spoken dialogue system of Tepperman (2006) used a feature-extraction approach for sarcasm detection as a subtask, thereby introducing sarcasm detection into the scene of Natural Language Processing. One study by Davidov (2010) utilized tweets and Amazon reviews for text-based sarcasm detection, and Tsur (2010) proposed one of the first attempts to use feature engineering and statistical classifiers to detect sarcasm.

A number of studies have sought to detail the recent trend in sarcasm detection approaches, which can roughly be classified into three groups: rule-based, statistical, and deep-learning approaches (Kumar, Somani, and Bhattacharyya 2017; Barbieri 2017). Rule-based approaches attempt to identify irony through specific evidence which can be captured in terms of rules that rely on indicators of sarcasm. Barbieri (2017) argued that rule-based approaches, which require no training, mostly rely on lexical information and do not perform as well as statistical approaches. Riloff (2013) aimed to recognize positive words in negative sentences, presenting a bootstrapping algorithm that automatically learns such rules from certain situations.

Most of the early works on sarcasm detection applied statistical approaches which varied in terms of features and learning algorithms, basically composed of two phases in which data were converted into feature vectors before being classified using a machine learning algorithm. Some of the most often used algorithms include Support Vector Machines (SVM) and Naïve Bayes. One of the first attempts in this approach, by Tsur (2010), compiled a set of sarcastic patterns composed of common combinations of words extracted from sarcastic examples. Gonzalez-Ibanez (2011) composed a model with three pragmatic features: positive emoticons, negative emoticons, and users' tagging. Reyes (2013) proposed another model based on four features: signatures, unexpectedness, style and polarity, and emotional scenarios.

Deep Learning approaches were successfully brought into the scene of sarcasm detection when Amir (2016) used standard binary classification with a Convolutional Neural Network (CNN), while Poria (2016) implemented a combination of CNNs trained on different tasks. Popular Deep Learning algorithms include the CNN (LeCun et al. 1998) and Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997). Ghosh and Veale (2017) proposed a network model composed of a CNN followed by an LSTM network, which outperformed many other models at that time. They utilized the CNN to reduce frequency variation through convolutional filters and to extract discriminating word sequences as a composite feature map for the LSTM layer. The output of the LSTM layer was then passed to a fully connected Deep Neural Network (DNN) layer, producing a higher-order feature set based on the LSTM output.

Following the Semantic Evaluation 2018 international workshop Task 3: Irony Detection in English Tweets (2018), which received submissions from 43 teams worldwide for the binary classification task A, deep learning algorithms were further explored and optimized for irony detection tasks. The best-ranked system, submitted by team THU NGN (2018), consisted of a densely connected LSTM network with a multi-task learning strategy. Another system from one of the top teams, NTUA-SLP (2018), which used an ensemble of two bi-directional LSTM network-based models, achieved comparable results. The submissions represented a variety of neural network-based approaches and other popular classification algorithms, including SVM, Random Forest, and Naïve Bayes (Van Hee, Lefever, and Hoste 2018). Overall, approaches with ensemble learners were the current trend in tackling the challenges of sarcasm detection.

                       Proposed Method

Dataset Preprocessing

In the majority of recent studies applying machine learning methods to text classification, the datasets are usually used in their most basic form, namely represented as tokens (words, punctuation, etc.), despite a wide variety of knowledge-based NLP systems (e.g., stemmers, part-of-speech taggers, etc.) capable of initial preprocessing of datasets, and thus of providing more informative features to ML algorithms. Therefore, in this research we performed additional preprocessing of the dataset to verify the usefulness of such knowledge-based systems in ML.

For the implemented dataset, each tweet was first transformed into lowercase and emojis were represented with their corresponding labels (e.g., :smileyface:) using Emoji for Python (2019). All tagged users (e.g., @user123) and URLs (e.g., http://google.com/) appearing in the text were replaced with specific neutral labels, such as "_tagged_" and "_url_". The first dataset preprocessing technique used in this study is shown below.

1. Only basic preprocessing.

To verify the depth of dependence of sarcasm detection on hashtags, all of the hashtags (e.g., #sarcasm) in the next five versions of the dataset shown below were replaced with a general label, e.g., "_hashtag_".
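The basic preprocessing just described, and the hashtag replacement applied to the subsequent dataset versions, can be sketched as follows. This is a minimal illustration, not the authors' actual code: the regular expressions and label strings are assumptions, and the emoji step uses the Emoji for Python package named above, skipped gracefully if it is unavailable.

```python
# Sketch of the "basic preprocessing" (dataset 1) and the hashtag
# replacement that produces the label-based dataset versions.
import re

def basic_preprocess(tweet: str) -> str:
    """Lowercase, emoji labels, and neutral labels for users and URLs."""
    text = tweet.lower()
    try:
        import emoji                 # "Emoji for Python" (2019)
        text = emoji.demojize(text)  # emojis become labels like :smiling_face:
    except ImportError:
        pass                         # emoji step skipped if the package is absent
    text = re.sub(r"https?://\S+", "_url_", text)  # URLs -> _url_
    text = re.sub(r"@\w+", "_tagged_", text)       # @user123 -> _tagged_
    return text

def replace_hashtags(tweet: str) -> str:
    """Hashtags replaced with a general label."""
    return re.sub(r"#\w+", "_hashtag_", tweet)

print(replace_hashtags(basic_preprocess("Monday morning is my favorite! #sarcasm")))
# -> monday morning is my favorite! _hashtag_
```

With the hashtag step omitted, the same input yields the dataset 1 form shown in the examples below.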
2. URLs, tagged users and hashtags replaced with labels.

Furthermore, we applied the knowledge-based tools for language processing provided by NLTK (2019):

3. Stemming of all words using the Porter Stemmer (2019)
4. Stopwords removal with the NLTK built-in Stopwords Filtering Tool
5. Stemming of all words after stopwords removal
6. PoS tagging using the NLTK Universal Part-of-Speech Tagset

Finally, our last dataset 7 had its social media markers, such as hashtags, URLs, and tagged users, removed entirely instead of being replaced with labels:

7. Tagged users, URLs, and hashtags removed

Below are three examples of a tweet: with hashtags (dataset 1), with hashtags replaced with labels (dataset 2), and with hashtags removed (dataset 7).

monday morning is my favorite! #sarcasm
monday morning is my favorite! _hashtag_
monday morning is my favorite!

Feature Weighting

A traditional weight calculation scheme was applied to all versions of the dataset. In particular, we used term frequency with inverse document frequency (tf*idf). Term frequency tf(t, d) refers here to the traditional raw frequency, namely the number of times a term t (word, token) occurs in a document d. Inverse document frequency idf(t, D) is the logarithm of the total number of documents |D| divided by the number of documents n_t containing the term t. Finally, tf*idf refers to the term frequency multiplied by the inverse document frequency, with idf as in equation 1.

                   idf(t, D) = log(|D| / n_t)                    (1)

Applied Classifier

Based on our previous work (Chia, Ptaszynski, and Masui 2019; Ptaszynski, Eronen, and Masui 2017), in this study we propose to use Convolutional Neural Networks (CNN), as they attained the best results for classifying tweets without ironic hashtags when compared to other classifiers.

CNNs are a type of feed-forward artificial neural network, an improved neural network model originally designed for image recognition. CNNs have proved useful in various classification tasks, including sentence classification and NLP (Kim 2014; Ptaszynski, Eronen, and Masui 2017).

In the proposed CNN we applied Rectified Linear Units (ReLU) as the neuron activation function, a piece-wise linear function that outputs the input directly if it is positive, and zero otherwise. We also applied dropout regularization. The CNN consisted of two hidden convolutional layers, containing 20 and 100 feature maps, respectively, with both layers having a 5x5 patch size and 2x2 max-pooling.

                  Evaluation Experiment

Dataset Description

The dataset used in this research is the publicly available sarcasm detection dataset collected by Ghosh and Veale (2017). It consists of 51,189 tweets (24,453 sarcastic and 26,736 non-sarcastic), in which sarcastic tweets were automatically collected from Twitter using users' self-declaration of sarcasm/irony through sarcastic and ironic hashtags (e.g., #irony, #sarcasm), and then annotated for confirmation. All seven dataset versions were produced with the different data preprocessing methods described above.

Experiment Setup

All seven separate versions of the dataset (represented with various preprocessing techniques) were analysed in the experiment using the proposed CNN method in the setting of a 10-fold cross-validation procedure. The results were calculated using the standard balanced F-score (F1), which is the harmonic mean of Precision and Recall.

Results and Discussion

Table 1 shows the summary of all results from the seven datasets with the different preprocessing techniques applied. Dataset 1, which is the dataset with all the hashtags included, yielded an F1 score of 0.997. Compared to our previous work (Chia, Ptaszynski, and Masui 2019), which was tested on a smaller dataset of only 4,618 tweets and attained an F1 score of 0.844 with similar settings (hashtags included), this shows a significant increase in the performance of the CNN model with the increase in the size of the dataset. This suggests that the model is tied to the size of the implemented dataset and the number of extracted features.

The results of dataset 1 (hashtags included) also enhance our understanding of the impact of hashtags, which make a great difference in sarcasm and irony detection, especially in Twitter messages. However, due to the natural characteristics of deliberate sarcastic hashtags in Twitter, classification of tweets with hashtags included does not contribute much to the study of sarcasm detection from a linguistic point of view. Nevertheless, as the results show, hashtags can be a very useful practical means of handling sarcasm detection with high performance.

While the remaining datasets were stripped of their hashtags (replaced with labels), dataset 2 had no further preprocessing, while datasets 3 to 6 were further processed with different methods. Interestingly, dataset 2 still attained the highest F1 score among all the datasets without hashtags included. This discovery highlights the importance of linguistic features in irony detection and shows that additional data preprocessing does not always provide better results. This is due to the oversimplification of the data, with many vital and important features manipulated or removed, while classification tasks such as irony detection depend heavily on them.

However, the further preprocessed datasets have their own value despite attaining lower F1 scores. From our observation of the attributes extracted from their confusion matrices in Table 1, their true positive counts compare favourably with dataset 2, which scored the highest F1 among these datasets. Dataset 5, which implemented both stemming and stopword removal, obtained the highest number of true positives, with only 290 false positives. This shows that the implementation of further data preprocessing is crucial to the sensitivity of the data.
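As a concrete check of the F-score definition used above, the reported scores can be recomputed from the confusion-matrix counts in Table 1. A minimal sketch, using the dataset 2 row (TP=24055, FP=398, FN=5068) as the example:

```python
# Balanced F-score (F1) as the harmonic mean of Precision and Recall,
# recomputed from raw confusion-matrix counts.

def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)   # recall is also the true positive rate (sensitivity)
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(24055, 398, 5068), 3))  # -> 0.898, as reported for dataset 2
```

The same computation on the dataset 1 row (TP=24355, FP=98, FN=72) reproduces its reported 0.997.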
                    Data set                     True Positive    False Positive   False Negative     True Negative     F-score
 1               With hashtags                      24355               98               72              26664           0.997
 2              Without hashtags                    24055              398              5068             21668           0.898
 3             Stemming applied                     24013              440              5172             21564           0.895
 4            Stopwords removed                     24009              444              5183             21553           0.895
 5     Stemming and Stopwords removed               24163              290              5590             21146           0.892
 6            PoS Tagging applied                   23904              549              5171             21565           0.893
 7    Hashtags, URL, tagged users removed           16509             7944              8677             18059           0.665
                               Table 1: Results from seven datasets with different preprocessing.
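The CNN configuration described in the Applied Classifier section (two hidden convolutional layers with 20 and 100 feature maps, 5x5 patches, 2x2 max-pooling) implies the following layer sizes and parameter counts. The input feature-matrix size and the use of unpadded ('valid') convolutions are illustrative assumptions, not details given in the paper:

```python
# Output sizes and trainable-parameter counts implied by the described CNN.

def conv_valid(h, w, k=5):
    # A k x k unpadded convolution shrinks each spatial dimension by k - 1.
    return h - (k - 1), w - (k - 1)

def maxpool(h, w, p=2):
    # Non-overlapping p x p max-pooling halves each dimension (floor).
    return h // p, w // p

h, w = 28, 28                       # assumed input feature-matrix size
h, w = maxpool(*conv_valid(h, w))   # layer 1: 5x5 conv + 2x2 pool -> 12 x 12
h, w = maxpool(*conv_valid(h, w))   # layer 2: 5x5 conv + 2x2 pool -> 4 x 4

# Weights per layer: filters x (patch area x input maps) + biases.
params_l1 = 20 * (5 * 5 * 1) + 20        # 520
params_l2 = 100 * (5 * 5 * 20) + 100     # 50100
print(h, w, params_l1, params_l2)        # -> 4 4 520 50100
```

Almost all of the convolutional weights sit in the second layer, since each of its 100 maps convolves over all 20 maps of the first layer.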
 Dataset 1    occ   Dataset 2     occ   Dataset 7    occ
 #sarcasm      71   _hashtag_    5445     love      1656
  sarcasm      60   _tagged_     1639     like      1216
  _tagged_     51     love        413      not      1211
    love       22     great       275     good       752
    great       8      not        245    great       709
     not        8      best       133     hate       488

Table 2: Top 6 error feature occurrences for datasets 1, 2 and 7 (occ = occurrences)

Finally, for the last dataset 7, which had all of its social media markers, such as tagged users (e.g., @user123), URLs, and hashtags, completely removed, Table 1 shows that the result dropped significantly, to an F1 score of 0.665, compared to the other datasets. This case shows the impact on the classification of the labels, which were supposed to be neutral. Compared to dataset 2, which had the social media markers replaced with labels, the significant increase in false negatives shows that the presence of the labels contributes heavily to the precision of the classification.

Error Analysis

Table 2 shows the occurrences of the top 6 error features extracted from dataset 1 (with hashtags), dataset 2 (hashtags replaced with labels) and dataset 7 (hashtags, URLs, and tagged users removed), after removing prepositions, conjunctions, and pronouns, which do not contribute much to the classification. For dataset 1, the error feature which occurred most often was #sarcasm, followed by the word sarcasm. This shows that even the sarcastic hashtags cannot help the model achieve 100% sensitivity.

For the dataset 2 results in the second column, the label _hashtag_ appeared 5,445 times in the 5,466 misclassified instances (99.62%). Next is the label _tagged_, which appeared 1,639 times, while the remaining words, such as "love", "great", "not", and "best", are popular errors across all three analysed datasets. As previously noticed, the supposedly neutral labels in fact contribute heavily to the precision of the classification. Therefore, removing them does not improve the results.

The evidence so far provides further support for the hypothesis that deliberate sarcastic hashtags play a significant role in sarcasm detection in tweets. Taken together, these results also suggest that the hashtag is the product of authors who understand that their sarcastic phrases alone may not be sufficient for the audience to figure out the intended irony or sarcasm. However, these findings do not completely solve general sarcasm detection, nor do they redefine sarcasm or irony in textual communication, especially on social network services.

Application in Automatic Cyberbullying Detection

Although the amount of research on sarcasm and irony detection grows each year, practical implementations of such models have not been widely discussed. Ptaszynski et al. (2010) mention that sarcasm poses a problem in cyberbullying (CB) detection. Therefore, aiming to improve their expert system for automated Internet Patrol, we propose a practical implementation of sarcasm detection in cyberbullying detection.

To quantify the extent to which such a model would be useful, we applied the model trained on the sarcastic dataset 2 and tested it on the cyberbullying detection dataset provided by Ptaszynski et al. (2018), which consists of 12,728 data samples. The result attained an F-score of 0.889, which is comparable to the result of dataset 2 with an F-score of 0.898 above. Interestingly, it was also much higher than models trained on purely cyberbullying-related data (Ptaszynski et al. 2018). This observation shows the prevalence of sarcasm in cyberbullying, and proves the practical applicability of sarcasm detection in other tasks.

                         Conclusion

In this paper, to find practical solutions for sarcasm detection on Twitter, we compared various dataset preprocessing methods and observed the impact of the preprocessed labels.

We first reviewed previous related works on text-based sarcasm detection, covering various types of systems, such as rule-based, statistical, and deep learning-based. Next, we compared datasets with various preprocessing on the proposed CNN model.

The first dataset, with hashtags included, scored an F1 of 0.9965, thus proving the dependence on hashtags in sarcasm detection. Next, the dataset with the least preprocessing ranked the highest among all datasets without hashtags included. However, we observed that data preprocessing is still crucial to the sensitivity of the data. Lastly, this research serves as a base for future studies on the application of sarcasm detection in other tasks, such as cyberbullying detection.

In the future, we also plan to further improve the proposed method with more diverse features and to test it on larger datasets, also with other preprocessing techniques. We will also focus on optimizing the feature extraction and the classifier model.
                         References

Amir, I.; Wallace, B. C.; Lyu, H.; Carvalho, P.; and Silva, M. J. 2016. Modelling context with user embeddings for sarcasm detection in social media. In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics.

Barbieri, F. 2017. Machine learning methods for understanding social media communication: Modeling irony and emojis. Department DTIC.

Baziotis, C.; Athanasiou, N.; Papalampidi, P.; Kolovou, A.; Paraskevopoulos, G.; Ellinas, M.; and Potamianos, A. 2018. NTUA-SLP at SemEval-2018 Task 3: Tracking ironic tweets using ensembles of word and character level attentive RNNs. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018). Association for Computational Linguistics.

Bird, S.; Loper, E.; and Klein, E. 2019. Natural Language Toolkit. https://www.nltk.org/.

Bouazizi, M., and Otsuki, T. 2016. A pattern-based approach for sarcasm detection on Twitter. IEEE Access.

Chia, Z. L.; Ptaszynski, M.; and Masui, F. 2019. Exploring machine learning techniques for irony detection. In Proceedings of the 33rd Annual Conference of the Japanese Society for Artificial Intelligence (JSAI 2019). Japanese Society for Artificial Intelligence.

Davidov, D.; Tsur, O.; and Rappoport, A. 2010. Semi-supervised recognition of sarcastic sentences in Twitter and Amazon. In Proceedings of the 14th Conference on Computational Natural Language Learning. Association for Computational Linguistics.

Ghosh, A., and Veale, T. 2017. Fracking sarcasm using neural network. In Proceedings of NAACL-HLT 2016. Association for Computational Linguistics.

Gonzalez-Ibanez, R.; Muresan, S.; and Wacholder, N. 2011. Identifying sarcasm in Twitter: A closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics.

Hee, C. V. 2017. Can machines sense irony? Exploring automatic irony detection on social media. Ghent University.

Hochreiter, S., and Schmidhuber, J. 1997. Long short-term memory. Neural Computation.

Kim, T., and Wurster, K. 2019. Emoji for Python. https://pypi.org/project/emoji/.

Kim, Y. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics.

Kumar, L.; Somani, A.; and Bhattacharyya, P. 2017. Approaches for computational sarcasm detection: A survey. ACM CSUR.

LeCun, Y.; Bottou, L.; Bengio, Y.; and Haffner, P. 1998. Gradient-based learning applied to document recognition. In Proceedings of the IEEE.

Oxford. 2019. Oxford Learner's Dictionary. https://www.oxfordlearnersdictionaries.com/.

Poria, S.; Cambria, E.; Hazarika, D.; and Vij, P. 2016. A deeper look into sarcastic tweets using deep convolutional neural networks. COLING 2016.

Porter, M. 2019. The Porter stemming algorithm. https://tartarus.org/martin/PorterStemmer/.

Ptaszynski, M.; Dybala, P.; Matsuba, T.; Masui, F.; Rzepka, R.; Araki, K.; and Momouchi, Y. 2010. In the service of online order: Tackling cyber-bullying with machine learning and affect analysis. International Journal of Computational Linguistics Research. Hokkaido University.

Ptaszynski, M.; Leliwa, G.; Piech, M.; and Smywinski-Pohl, A. 2018. Cyberbullying detection. Technical report 2/2018, Department of Computer Science, AGH University of Science and Technology.

Ptaszynski, M.; Eronen, J. K. K.; and Masui, F. 2017. Learning deep on cyberbullying is always better than brute force. In IJCAI 2017 3rd Workshop on Linguistic and Cognitive Approaches to Dialogue Agents (LaCATODA 2017).

Reyes, A.; Rosso, P.; and Veale, T. 2013. A multidimensional approach for detecting irony in Twitter. Language Resources and Evaluation.

Riloff, E.; Qadir, A.; Surve, P.; Silva, L. D.; Gilbert, N.; and Huang, R. 2013. Sarcasm as contrast between a positive sentiment and negative situation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013).

Tepperman, J. 2006. "Yeah right": Sarcasm recognition for spoken dialogue systems. In Interspeech 2006 - ICSLP.

Tsur, O.; Davidov, D.; and Rappoport, A. 2010. ICWSM - a great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media. Association for the Advancement of Artificial Intelligence.

Van Hee, C.; Lefever, E.; and Hoste, V. 2018. SemEval-2018 Task 3: Irony detection in English tweets. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018). Association for Computational Linguistics.

Wu, C.; Wu, F.; Wu, S.; Liu, J.; Yuan, Z.; and Huang, Y. 2018. THU NGN at SemEval-2018 Task 3: Tweet irony detection with densely connected LSTM and multi-task learning. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018). Association for Computational Linguistics.