<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting Sarcasm in News Headlines</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Onyinye Chudi-Iwueze</string-name>
          <email>o.chudi-iwueze@mycit.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haithem Afli</string-name>
          <email>haithem.afli@cit.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ADAPT Centre, Cork Institute of Technology</institution>
          ,
          <addr-line>Cork</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>100</fpage>
      <lpage>111</lpage>
      <abstract>
        <p>The use of sarcasm dates back as far as communication itself. Sarcasm is expressed in words and facial expressions, and can be noticed in the intonation of the voice. In this digital age, sarcastic comments are passed every day via tweets, the comment sections of different social media outlets, and even news headlines. Some of these headlines can be misunderstood and taken to mean something different from what was originally intended. This leads to a need to detect sarcasm, especially in the news and on social media. Detecting sarcasm from text has its challenges because text lacks the intonation and inflection of voice that occur when a sarcastic statement is made vocally. This paper concentrates on the effect of different feature encoding techniques applied to text for feature extraction for machine learning models. A deep learning model is also applied and the results are compared. Prior to feature extraction, data pre-processing techniques like tokenization and the removal of stop-words and punctuation are applied by researchers in any work involving text analysis. These pre-processing techniques are widely used and accepted, and are also applied in this project. Different feature extraction methods, namely the Count Vectorizer, Term Frequency-Inverse Document Frequency (TF-IDF) and word embeddings, were implemented in this experiment. The Support Vector Machine, Naive Bayes and Logistic Regression were the traditional machine learning algorithms used in this research; a Convolutional Neural Network was also used. The results of these algorithms, including the F1-score, precision, recall and accuracy scores, are recorded in this paper. For the feature extraction methods examined, the results show that a combination of more than one technique is better suited for the purpose of classification.</p>
      </abstract>
      <kwd-group>
        <kwd>Sarcasm Detection</kwd>
        <kwd>Convolutional Neural Networks (CNN)</kwd>
        <kwd>Feature Extraction</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In the last decade, there has been an increase in the use of online resources
for the dissemination of information. These texts have different characteristics that
are explored using Natural Language Processing and machine learning techniques
for knowledge and insight. One such important characteristic is the
presence of sarcasm in text. Sarcasm is a complex act of communication that
allows speakers the opportunity to express sentiment-rich opinions in an implicit
way [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. There is an abundance of sarcasm in texts distributed online and on
social media, including news headlines and other media. Sarcasm is classically
defined as the process of intentional misuse of words to convey some meaning
(usually the opposite of what is said) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For example, if a person says "How
lucky am I? I caught the coronavirus!", it is clear that although the words used
are positive, the real meaning is negative, making the statement a sarcastic one.
Sarcasm is characterised by the use of irony that reflects a negative meaning.
      </p>
      <p>
        Although sarcasm is well known and widely used, it is challenging not only for
computers but also for humans to detect promptly. Some humans find it difficult
to identify and understand the use of sarcasm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Due to this fact, the presence
of sarcasm, if not detected and accounted for, can affect other machine learning
ventures like sentiment analysis and opinion mining. This makes the detection
of sarcasm a crucial task. Automated sarcasm detection can be seen as a text
classification problem. Text data is one of the simplest forms of data. Machine
learning algorithms are unable to process non-numerical data, and as such the
need for feature extraction arises. It is important to extract useful information
from any form of data, especially unstructured forms like text data. Feature
extraction and the proper representation of text for classification purposes is an
important factor that affects the accuracy of classification. This paper explores
the use of the Count Vectorizer, Word-level TF-IDF, Character-level TF-IDF
and N-Gram-level TF-IDF with different supervised learning algorithms, and the
accuracy of these models is measured and compared. Word2Vec is also an
efficient and widely used feature extraction technique; it is
used for the deep learning algorithm explored in this paper. Other methods
like Doc2Vec and LDA are not discussed in this paper, as they are outside the
scope of this project.
      </p>
      <p>The rest of this paper is arranged as follows: Section 2 reviews related
work in both sarcasm detection and feature extraction methods. Section
3 details the methodology and the models applied for the analysis.
Section 4 outlines the results of applying the different feature extraction
techniques to the chosen models and describes which performed best.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <sec id="sec-2-1">
        <title>Feature Extraction</title>
        <p>
          Feature extraction describes the process of taking words from text data and
transforming them into a numerical feature set that is useful for a machine
learning classifier [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Extracting features is useful for all analyses involving text
data in all domains. The work detailed in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] discusses and reviews the impact of pre-processing;
the researchers determine the importance of 'slang'
and correct spelling, and apply an SVM classifier in their experiment. Another
researcher uses vector representations to address the problem of
sentiment analysis [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], obtaining a relatively high accuracy of 86%.
Another study considers four data sets and the use of
Bag-of-Words, lexicon and part-of-speech based features [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. An ensemble
model of SVM, Logistic Regression and Naive Bayes was implemented for that
analysis. The authors in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] applied three levels of feature extraction techniques
and three classifiers in their analysis. Another group of researchers,
in the work detailed in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], applied ten different feature extraction techniques
to a sentiment analysis problem. They concluded that the feature extraction
techniques applied have the potential to improve the performance of
a classifier in any given problem. Paper [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] compares the results of six algorithms
when features are extracted using Bag-of-Words and TF-IDF, concluding
that TF-IDF gives better results, with up to a 4% difference. Much of the prior
experimental work uses the SVM, Logistic Regression and Naive Bayes
algorithms for traditional supervised machine learning applications, which informed the
use of these algorithms for this analysis. A CNN model is also run at the end
of the analysis, using the Count Vectorizer output as the features and also using
features generated with Word2Vec.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Sarcasm Detection</title>
        <p>
          The detection of sarcasm is crucial in the domain of sentiment analysis and
opinion mining. Different machine learning algorithms have been applied to the
problem. Some researchers used the Naive Bayes Classifier and Support Vector
Machines for analyses of social media data in Indonesia [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The classifiers used
in the work detailed in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] performed well for the task, and that informed the
decision to use Naive Bayes and the SVM for this analysis. Another body of
work that applied traditional supervised techniques to the problem is described
in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. They used support vector machines with TF-IDF and Bag-of-Words. The
work by Davidov et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] details a technique of sarcasm detection using
semi-supervised methods on two different data sets, one containing reviews of
Amazon products and one with tweets collected from Twitter. These researchers
concentrated on features like punctuation, sentence syntax and hashtags used.
The proposed system had an accuracy of over 75% [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The work by [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] shows
the use of Support Vector Machines, also for the detection of sarcasm from news
headlines. The optimal method of feature extraction used in that work results in
an accuracy score of about 80%. Some other researchers applied deep learning
techniques to the problem [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Gathering data for the purpose of sarcasm
detection is challenging, and even more so for supervised learning. This is especially
the case for social media data. This research uses some of the methods described
in the previous works, with better results for accuracy and other measures as
well.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>The data used for this analysis was taken from Kaggle. The news headlines were
curated into two different files for competitions on Kaggle. Each file was a JSON
file containing news headlines and a label named 'is_sarcastic' that indicates
the presence or absence of sarcasm in each headline. The two files were joined
together to create a bigger data set for the analysis. The complete data used
in the analysis contains over 50,000 headlines. The data set is not significantly
imbalanced, but there were a few more non-sarcastic news headlines than
sarcastic ones. Table 1 below shows a few rows of the data and the labels
for those data points.</p>
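      <p>The joining step described above can be sketched as follows. The field names and the JSON-lines layout are assumptions based on the Kaggle sarcasm-headlines files, and small inline samples stand in for the two real files:</p>
      <preformat>
```python
import io
import pandas as pd

# Hedged sketch: each Kaggle file is assumed to be JSON-lines records with
# a "headline" and an "is_sarcastic" field; inline samples stand in for
# the real files here.
file_a = io.StringIO(
    '{"headline": "top snake handler leaves sinking campaign", "is_sarcastic": 1}\n'
    '{"headline": "why writers must plan to be surprised", "is_sarcastic": 0}\n'
)
file_b = io.StringIO(
    '{"headline": "diy: sports equipment closet", "is_sarcastic": 0}\n'
)

# Join the two files into one bigger data set, as described above.
df = pd.concat(
    [pd.read_json(f, lines=True) for f in (file_a, file_b)],
    ignore_index=True,
)
print(len(df))  # number of headlines after joining
```
      </preformat>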
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Sample headlines from the data set and their labels.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Headline</th><th>Is sarcastic</th></tr>
          </thead>
          <tbody>
            <tr><td>former versace store clerk sues over secret 'black code' for minority shoppers</td><td>0</td></tr>
            <tr><td>why writers must plan to be surprised</td><td>0</td></tr>
            <tr><td>boehner just wants wife to listen, not come up with alternative debt-reduction ideas</td><td>1</td></tr>
            <tr><td>remembrance is the beginning of the task</td><td>0</td></tr>
            <tr><td>4 lessons prison taught me about power and control</td><td>0</td></tr>
            <tr><td>top snake handler leaves sinking huckabee campaign</td><td>1</td></tr>
            <tr><td>courtroom sketch artist has clear manga influences</td><td>1</td></tr>
            <tr><td>stock analysts confused, frightened by boar market</td><td>1</td></tr>
            <tr><td>gillian jacobs on what it's like to kiss adam brody</td><td>0</td></tr>
            <tr><td>diy: sports equipment closet</td><td>0</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        The first step in the pre-processing pipeline was the conversion of all the text to
lower case, done to achieve uniformity of words. Unwanted symbols like
digits and newlines were also removed, as was all punctuation.
After this, the texts were tokenized. Tokenization is the breaking
down of sentences into words [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Stopwords are words that are commonly used
in any language; they do not contribute any contextual meaning to the data
and as such should be eliminated. Removal of stopwords was the last step in
the pre-processing pipeline, and the data was thus ready for further preparation
prior to model building.
      </p>
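      <p>A minimal sketch of this pre-processing pipeline, using a tiny illustrative stop-word list in place of a full one (e.g. NLTK's English stop words):</p>
      <preformat>
```python
import re
import string

# Tiny illustrative stand-in for a full stop-word list.
STOPWORDS = {"the", "is", "a", "of", "to", "and", "in"}

def preprocess(text):
    text = text.lower()                                   # uniform lower case
    text = re.sub(r"[\d\n]+", " ", text)                  # drop digits and newlines
    text = text.translate(
        str.maketrans("", "", string.punctuation))        # drop punctuation
    tokens = text.split()                                 # tokenize into words
    return [t for t in tokens if t not in STOPWORDS]      # remove stop-words

print(preprocess("Remembrance is the beginning\nof the task 42!"))
# ['remembrance', 'beginning', 'task']
```
      </preformat>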
      <sec id="sec-3-1">
        <title>Feature Extraction</title>
        <p>Count Vectorizer The Count Vectorizer is a simple means of tokenizing text and
building a vocabulary of known words. It is also used to encode new documents using
the vocabulary that has been created. The product of the Count Vectorizer is an
encoded vector which has the length of the entire vocabulary
and a count of the number of times each word appeared in the document. The Count
Vectorizer was implemented using scikit-learn for this project.</p>
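        <p>A small sketch of the scikit-learn Count Vectorizer as used here, with two toy headlines standing in for the real corpus:</p>
        <preformat>
```python
from sklearn.feature_extraction.text import CountVectorizer

# Two toy headlines standing in for the Kaggle data set (illustrative only).
corpus = [
    "top snake handler leaves sinking campaign",
    "snake handler praises snake",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(corpus)  # sparse document-term matrix

vocab = vectorizer.vocabulary_             # word -> column index
row = counts.toarray()[1]                  # counts for the second headline
print(row[vocab["snake"]])                 # "snake" occurs twice there
```
        </preformat>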
      </sec>
      <sec id="sec-3-2">
        <title>Term Frequency – Inverse Document Frequency (TF-IDF)</title>
        <p>TF-IDF is a popular and well-recognized method for the evaluation of the importance of
a word in a document. Term Frequency measures the frequency of a term in a
document. For a collection of documents with different lengths, a
word is likely to occur more times in a longer document. Due to this fact,
normalization is required.</p>
        <p>TF(t) = N / TN, where N is the number of times term t appears in the document and TN is the total number of terms in the document.</p>
        <p>The Inverse Document Frequency is the measure of importance of a term.
This is done to reduce the influence of words e.g ‘of’, ‘the’ that could be used
multiple times in a document but have no real meaning and importance in
context.</p>
        <p>IDF(t) = log<sub>e</sub>(TD / ND), where TD is the total number of documents and ND is the number of documents containing term t.</p>
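        <p>The two formulas above can be implemented directly; a small sketch with three toy pre-tokenized documents:</p>
        <preformat>
```python
import math

# Three toy tokenized "documents" (illustrative only).
docs = [
    ["snake", "handler", "leaves", "campaign"],
    ["snake", "snake", "wins"],
    ["writers", "plan", "surprise"],
]

def tf(term, doc):
    # TF(t) = N / TN
    return doc.count(term) / len(doc)

def idf(term, corpus):
    # IDF(t) = log_e(TD / ND)
    nd = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / nd)

print(tf("snake", docs[1]))   # 2/3: "snake" is 2 of 3 terms
print(idf("snake", docs))     # log(3/2): appears in 2 of 3 documents
```
        </preformat>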
        <p>
          For most traditional applications the Bag-of-Words model was used. However,
words that occurred frequently in the document were given significant weights
in the Bag-of-Words model, and this affected the accuracy of the results. The
use of TF-IDF was introduced to solve these problems of the typical
Bag-of-Words model. Due to the log applied, the IDF of an infrequent term is high and
the IDF of common words is low [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Three levels of TF-IDF were used for
this analysis: Word-level, N-Gram and Character-level TF-IDF.
All of these are implemented using the scikit-learn package in Python.
Word2Vec Word embeddings are vector representations of words, and Word2Vec
is one of the most popular word embedding methods. The Word2Vec method
was first proposed in the work of Mikolov et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] as a method of feature
extraction for Natural Language Processing (NLP). It is an improvement on the
traditional methods of feature extraction. Word embeddings are constructed
using neural networks, with either the Continuous Bag of Words (CBOW)
or the Skip-Gram architecture. According to [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], the Skip-Gram method performs better for
smaller data sets. The Word2Vec vectors applied in this experiment are trained
on the words available in the data set.
        </p>
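        <p>The three TF-IDF levels can be obtained from scikit-learn's TfidfVectorizer by varying the analyzer and n-gram range; the exact parameter values below are illustrative assumptions, not the paper's settings:</p>
        <preformat>
```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "top snake handler leaves sinking campaign",
    "why writers must plan to be surprised",
]

# The three TF-IDF levels named above (parameter choices are illustrative).
word_level = TfidfVectorizer(analyzer="word")
ngram_level = TfidfVectorizer(analyzer="word", ngram_range=(2, 3))
char_level = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))

for name, vec in [("word", word_level),
                  ("n-gram", ngram_level),
                  ("char", char_level)]:
    X = vec.fit_transform(corpus)   # one row per document
    print(name, X.shape)
```
        </preformat>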
      </sec>
      <sec id="sec-3-3">
        <title>Classification algorithms</title>
        <p>
          Several classification algorithms were chosen for this experiment. These
algorithms are briefly described below:
Support Vector Machines The Support Vector Machine is efficient for both
classification and regression and is known to give good results for classification.
The classes are separated by a hyperplane found by the algorithm. The
LinearSVC from scikit-learn was implemented for this analysis.
Naive Bayes The Naive Bayes is a powerful algorithm used for classifying
data based on probabilities [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This algorithm is based on the Bayes theorem in
statistics. It is
fast and scalable but is not without disadvantages, as it assumes independence
among predictors. It works well for small data sets and has been known to give
great results. The Naive Bayes classifier employed in all stages of the analysis
is built using the scikit-learn package available in Python.
        </p>
        <p>Logistic Regression Logistic Regression is a popular classification algorithm.
It belongs to the class of Generalized Linear Models. It uses the
sigmoid function to map its output to a value between 0 and 1. The
Logistic Regression model was implemented for this analysis using the function
available in the scikit-learn package.</p>
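        <p>A condensed sketch of fitting the three scikit-learn classifiers named above on Count Vectorizer features; the sample headlines and labels are illustrative stand-ins for the real training data:</p>
        <preformat>
```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Tiny labelled sample in the spirit of the Kaggle headlines (labels illustrative).
headlines = [
    "top snake handler leaves sinking huckabee campaign",
    "stock analysts confused frightened by boar market",
    "why writers must plan to be surprised",
    "remembrance is the beginning of the task",
]
labels = [1, 1, 0, 0]  # 1 = sarcastic, 0 = not

X = CountVectorizer().fit_transform(headlines)

# Fit each classifier and report its training accuracy.
for model in (LinearSVC(), MultinomialNB(), LogisticRegression()):
    model.fit(X, labels)
    print(type(model).__name__, model.score(X, labels))
```
        </preformat>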
        <p>Convolutional Neural Networks (CNN) CNNs are very commonly used
in text classification problems due to their success and great results. This
contributed to the decision to use a CNN for this aspect of the experiment. In the
first experiment with the CNN, one CNN model is trained on the features extracted
using the Count Vectorizer. Another CNN model is trained on the features
extracted using Word2Vec. The results of these two models are plotted and
compared.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments and Results</title>
      <sec id="sec-4-1">
        <title>Experiments</title>
        <p>For the supervised learning techniques, the results to be examined are the
results of the SVM, Naive Bayes and Logistic Regression when text features
are extracted using the Count Vectorizer technique and the different levels of the
TF-IDF technique. These results are detailed in Table 3 above and can be seen
graphically in Figure 3 below.</p>
        <p>The bar plot in Figure 3 is a pictorial representation of the results
of the analysis performed using the supervised learning methodologies. It is clear
that the SVM outperforms both the Logistic Regression and the Naive Bayes in
all cases, except the N-Gram-level TF-IDF, where they all perform similarly
poorly. The figures that follow concentrate mainly on the results of the SVM
algorithm due to this fact.</p>
        <p>
          The results of our research show that the use of the Count
Vectorizer and TF-IDF gives an accuracy of 93% when this feature set is fed
to the SVM. This is an improvement on the work of [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], as the methods used here
provide a higher accuracy all round.
        </p>
        <p>The results seen above corroborate the observation that the Support Vector
Machine usually outperforms other algorithms for classification purposes,
depending on the amount of data available. This is clear from the
accuracy scores seen in the bar plots in Figure 3. The accuracy of the SVM with
the Count Vectorizer is about 90%. The results for the N-Gram-level TF-IDF are
the worst of all. This representation is similar to the Bag-of-Words model, and as such the poor
results are expected.</p>
        <p>Figure 4 below shows the training and validation accuracy scores for the
model trained over 5 epochs. Unsurprisingly, the accuracy of training for the
CNN is very close to 100%. The validation accuracy however is about 92.5%.</p>
        <p>The loss also reduces for both the training and validation data.</p>
        <p>The training and validation accuracy score is higher when word embedding
is applied to the CNN model. However, Figure 5 shows an interesting
pattern for the validation loss when Word2Vec is applied with the CNN: the
loss reduces and then begins to increase again for the data used in validation.
This suggests that the model does not actually perform as expected when word
embedding is applied. This is due to the fact that Word2Vec is a neural-network-based
solution, and these work optimally when the volume of data is substantial.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Sarcasm can be found in a wide variety of areas. To be able to accurately detect
sarcasm from different aspects and topics, a large enough data set will be needed.
The accuracy of prediction as seen in this paper greatly depends on the mode of
feature extraction. The use of N-grams and characters produced less desirable
results than the use of words. The combination of the Count Vectorizer and
the Term Frequency-Inverse Document Frequency for the supervised learning
techniques gave the most satisfactory performance metrics.</p>
      <p>The results of this research favor the supervised machine learning methods over the
deep learning methods, although this could be due to the limited amount of
data available. This goes to show that simple methods can produce
great results, especially when the available data is limited and
not suitable for more complicated deep learning techniques. The CNN also performed
satisfactorily. The use of word embedding for this task showed great
performance in terms of accuracy in both the training and validation sets, as did the
use of the features extracted using the Count Vectorizer. The overall performance
of both the CNN and the supervised learning methods suggests that the use of the Count
Vectorizer is most appropriate for this task. As stated above, the volume of data
available limits the strengths of the Word2Vec method; thus the Count Vectorizer
outperforms it in this situation. Simpler methods are
a better choice than more complicated methods, provided that their results
are similar or better.</p>
      <p>Sarcasm can also be multi-modal, contained in images and GIFs. This is
especially prevalent on social media like Twitter, where tweets can be replied
to with GIFs and videos. Future research would involve an extension of this system
to include multi-modal methods for sarcasm detection. Social media data for
sarcasm detection will also contain emoticons (emojis), and analysis of those
emojis could add more meaning to the text. A bigger data set will allow for the
use of feature extraction techniques that require large data sets. This paper has
not explored the use of word embedding for both the supervised and deep learning
techniques. More exploration into these methods of feature extraction
is warranted and would be a focus of future research in this area.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lei</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <source>Sarcasm Detection in Social Media Based on Imbalanced Classification In: Web-Age Information Management</source>
          , pp.
          <fpage>459</fpage>
          -
          <lpage>471</lpage>
          . Cham, (
          <year>2014</year>
          ). https://doi.org/10.1007/978-3-319-08010-9_49
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Mehndiratta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sachdeva</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soni</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Detection of Sarcasm in Text Data using Deep Convolutional Neural Networks In: Scientific International Journal for Parallel and Distributed Computing</article-title>
          , vol.
          <volume>18</volume>
          , September (
          <year>2017</year>
          ) https://doi.org/10.12694/scpe.v18i3.1302
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Zhibo</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , Ma, L., and Zhang, Y. :
          <article-title>A Hybrid Document Feature Extraction Method Using Latent Dirichlet Allocation and Word2Vec</article-title>
          . In: IEEE First International Conference on Data Science in Cyberspace, June (
          <year>2016</year>
          ). https://doi.org/10.1109/DSC.2016.110
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Feature Extraction Method Based on Word Embedding for Word Similarity Computing</article-title>
          .
          <source>In: Communications in Computer and Information Science</source>
          , vol.
          <volume>496</volume>
          , pp.
          <fpage>160</fpage>
          -
          <lpage>167</lpage>
          , January (
          <year>2014</year>
          ) https://doi.org/10.1007/978-3-662-45924-9_15
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lunando</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Purwarianti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Indonesian Social Media Sentiment Analysis With Sarcasm Detection</article-title>
          .
          <source>In: Int. Conf. Adv. Comput. Sci. Inf</source>
          .
          <source>Syst. ICACSIS</source>
          , pp.
          <fpage>195</fpage>
          -
          <lpage>198</lpage>
          , September (
          <year>2013</year>
          ). https://doi.org/10.1109/ICACSIS.2013.6761575
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Davidov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsur</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rappoport</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Semi-Supervised Recognition of Sarcasm in Twitter and Amazon</article-title>
          .
          <source>In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning</source>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>116</lpage>
          , July (
          <year>2010</year>
          ) https://doi.org/10.5555/1870568.1870582
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ahuja</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chug</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahuja</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The Impact of Features Extraction on the Sentiment Analysis In: Procedia Computer Science</article-title>
          , vol.
          <volume>152</volume>
          , pp.
          <fpage>341</fpage>
          -
          <lpage>348</lpage>
          . Elsevier, January (
          <year>2019</year>
          ). https://doi.org/10.1016/j.procs.2019.05.008
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vovsha</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rambow</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , and Passonneau R.:
          <source>Sentiment Analysis of Twitter Data In: Proceedings of the Workshop on Language in Social Media</source>
          , pp.
          <fpage>30</fpage>
          -
          <lpage>38</lpage>
          , June (
          <year>2011</year>
          ). https://doi.org/10.5555/2021109.2021114
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Apply word vectors for sentiment analysis of APP reviews</article-title>
          .
          In:
          <source>3rd International Conference on Systems and Informatics (ICSAI)</source>
          , pp.
          <fpage>1062</fpage>
          -
          <lpage>1066</lpage>
          , (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Fouad</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gharib</surname>
            ,
            <given-names>T.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mashat</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          :
          <article-title>Efficient Twitter Sentiment Analysis System with Feature Selection and Classifier Ensemble</article-title>
          .
          In:
          <source>The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018)</source>
          , vol.
          <volume>723</volume>
          , pp.
          <fpage>516</fpage>
          -
          <lpage>527</lpage>
          , Springer, Cham, January (
          <year>2018</year>
          ). https://doi.org/10.1007/978-3-319-74690-6_51
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Angulakshmi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manicka Chezian</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Three level feature extraction for sentiment classification</article-title>
          .
          In:
          <source>International Journal of Innovative Research in Computer and Communication Engineering</source>
          , vol.
          <volume>2</volume>
          , no. 8, pp.
          <fpage>5501</fpage>
          -
          <lpage>5507</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Prusa</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoshgoftaar</surname>
            ,
            <given-names>T.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dittman</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          :
          <article-title>Impact of Feature Selection Techniques for Tweet Sentiment Classification</article-title>
          .
          In:
          <source>FLAIRS Conference</source>
          , pp.
          <fpage>299</fpage>
          -
          <lpage>304</lpage>
          , April (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Waykole</surname>
            ,
            <given-names>R.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thakare</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          :
          <article-title>A Review Of Feature Extraction Methods For Text Classification</article-title>
          .
          In:
          <source>International Journal of Advance Engineering and Research Development</source>
          , vol.
          <volume>5</volume>
          , April
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Vaishvi</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          :
          <article-title>Optimal Feature Extraction based Machine Learning Approach for Sarcasm Type Detection in News Headlines</article-title>
          .
          In:
          <source>International Journal of Computer Applications</source>
          , vol.
          <volume>177</volume>
          , March
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Efficient Estimation of Word Representations in Vector Space</article-title>
          . arXiv preprint arXiv:1301.3781 (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>