<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Humor Analysis based on Human Annotation (HAHA)-2019: Humor Analysis at Tweet Level using Deep Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Avishek Garain</string-name>
          <email>avishekgarain@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science and Engineering, Jadavpur University</institution>
          ,
          <addr-line>Kolkata</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>191</fpage>
      <lpage>196</lpage>
      <abstract>
        <p>This paper describes the system submitted to the "Humor Analysis based on Human Annotation (HAHA)-2019" shared task. The task is divided into two sub-tasks: detecting humor in Spanish tweets and predicting a humor score for the same. The tweets are short (up to 240 characters) and the language is informal, i.e., it contains spelling mistakes, emojis, emoticons, onomatopoeias, etc. Humor detection is a classification of the tweets into two classes: Humorous and Not Humorous. For the proposed system, I use deep learning networks such as LSTMs.</p>
      </abstract>
      <kwd-group>
        <kwd>BiLSTM</kwd>
        <kwd>Embedding</kwd>
        <kwd>Humor Analysis</kwd>
        <kwd>Emoticons</kwd>
        <kwd>Humor Score</kwd>
        <kwd>Weighted Average</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Humor detection refers to the use of Natural Language Processing (NLP) to
systematically identify, extract, quantify and study humorous content in text.
Humor Analysis based on Human Annotation (HAHA)-2019 was a classification
task that required classifying a Spanish tweet, on the basis of its humor content, into
two classes, Humorous and Not Humorous, and predicting a humor score for the
humorous tweets. The task posed some additional challenges. The given
tweets lacked context, being limited to 240 characters, were written in an informal
language, and contained multi-linguality. Moreover, the classification system
prepared for the task needed to generalize to various test corpora as well.</p>
      <p>To solve the task at hand, I built a bidirectional Long Short-Term Memory (LSTM)
based neural network, for both classification and humor score prediction.</p>
      <p>The rest of the paper is organized as follows. Section 2 describes the data on
which the task was performed. The methodology followed is described in Section 3.
This is followed by the results and concluding remarks in Sections 4 and 5 respectively.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Data</title>
      <p>
        The dataset used to train and validate the model was provided by IberLEF
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The data was collected from Twitter, retrieved using the Twitter API
by searching for keywords and constructions that are often included in texts
of different sentiments. The dataset consisted of tweets in their original form
along with the corresponding labels and scores, as shown in Table 1.
      </p>
      <p>Label value Meaning
0 Not Humorous
1 Humorous</p>
      <p>Table 1: Labels used in the dataset</p>
      <p>The dataset comprised Spanish tweets, tagged with their respective
humor labels. The resulting dataset had 24,000 humor-tagged and
scored tweets, which were split into 16,800 instances of training data and 7,200
instances of development data. My approach was to convert the tweets into a sequence
of words and then into word embeddings. I then ran a neural-network based
algorithm on the processed tweets. The label-based categorical division of the data
is given in Tables 2, 3 and 4.</p>
      <p>Value Humorous Not-Humorous
All   10323    6477</p>
      <p>Table 2: Distribution of the labels in the training dataset</p>
      <p>Value Humorous Not-Humorous
All   4424     2776</p>
      <p>Table 3: Distribution of the labels in the development dataset</p>
      <p>Value Humorous Not-Humorous
All   14747    9253</p>
      <p>Table 4: Distribution of the labels in the combined dataset</p>
    </sec>
    <sec id="sec-3">
      <title>3 Methodology</title>
      <p>
        I used SenticNet 5 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for finding sentiment values of individual words after translating
the sentences to English via the GoogleTrans API. Apart from this, I also used a Spanish
sentiment lexicon for the same purpose.
      </p>
      <p>
        The use of BiLSTM networks is a key component of my model. The work of
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] brought a revolutionary change by introducing the concept of memory for
sequence-based problems.
      </p>
      <p>
        I first sent the raw tweets through some preprocessing steps, for which
I took inspiration from the work on hate speech against immigrants on Twitter [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], part
of SemEval-2019; the steps used here extend that work. The preprocessing
consisted of the following steps:
1. Replacing emojis and emoticons by their corresponding meanings
2. Removing mentions
3. Removing URLs
4. Contracting whitespace
5. Extracting words from hashtags
      </p>
      <p>In step 1, for example,
",-)" is replaced by "winking happy",
";-(" is replaced with "crying", and
":-C" is replaced with "real unhappy".
Similarly, I replaced 110 emoticons by their corresponding feelings.</p>
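      <p>Preprocessing steps 1 to 4 can be sketched as follows. This is a minimal illustration, not the exact implementation: the emoticon dictionary below is a small subset of the roughly 110 mappings used, and the regular expressions are assumed approximations.</p>

```python
import re

# Illustrative subset of the ~110 emoticon-to-feeling mappings (step 1).
EMOTICON_MEANINGS = {
    ",-)": "winking happy",
    ";-(": "crying",
    ":-C": "real unhappy",
}

def clean_tweet(tweet: str) -> str:
    for emoticon, meaning in EMOTICON_MEANINGS.items():  # step 1: emoticons
        tweet = tweet.replace(emoticon, meaning)
    tweet = re.sub(r"@\w+", "", tweet)                   # step 2: remove mentions
    tweet = re.sub(r"https?://\S+", "", tweet)           # step 3: remove URLs
    return re.sub(r"\s+", " ", tweet).strip()            # step 4: contract whitespace
```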
      <p>The last step (step 5) takes advantage of the Pascal casing of hashtags
(e.g. #AngryBird). A simple regular expression can extract all the words; I ignored
the few errors that arise in this procedure. This extraction contributed to the features
mainly because words in hashtags may, to some extent, convey the sentiment of the tweet.
They played an important role during the model-training stage.</p>
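      <p>The hashtag extraction of step 5 can be sketched with one regular expression; the exact pattern used in the system is not given, so this is an assumed variant that handles Pascal-cased words, lowercase runs and digits.</p>

```python
import re

def expand_hashtag(tag: str) -> str:
    # Split a Pascal-cased hashtag into its component words,
    # e.g. "#AngryBird" -> "Angry Bird".
    words = re.findall(r"[A-Z][a-z]+|[a-z]+|\d+", tag.lstrip("#"))
    return " ".join(words)
```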
      <p>The preprocessed tweets are treated as a sequence of words, with the interdependence
among the words contributing to the meaning. I convert the tweets into one-hot
vectors. I also included certain manually extracted features, listed below:
1. Counts of words with positive, negative and neutral sentiment
in Spanish
2. Counts of words with positive, negative and neutral sentiment
in English
3. Subjectivity score of the tweet
4. Numbers of question marks, exclamation marks and full stops in the tweet</p>
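      <p>The manually extracted features above can be sketched as simple counting functions. The sentiment word sets are placeholders here: the actual system derives them from SenticNet 5 (English) and a Spanish sentiment lexicon.</p>

```python
import re

def punctuation_features(tweet: str) -> dict:
    # Feature 4: counts of question marks, exclamation marks and full stops.
    return {
        "question_marks": tweet.count("?"),
        "exclamations": tweet.count("!"),
        "full_stops": tweet.count("."),
    }

def sentiment_counts(tweet: str, positive: set, negative: set) -> dict:
    # Features 1-2: counts of positive/negative/neutral words, given a
    # sentiment lexicon (placeholder sets standing in for SenticNet 5 or
    # the Spanish lexicon).
    words = re.findall(r"\w+", tweet.lower())
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    return {"positive": pos, "negative": neg,
            "neutral": len(words) - pos - neg}
```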
      <p>I use a Bidirectional-LSTM based approach to capture information from both the
past and future context.</p>
      <p>My models for both sub-tasks are neural-network based. For sub-task 1,
I first merged the manually extracted features with the feature vector obtained by
converting the processed tweet to a one-hot encoding. The output was processed through
an embedding layer, which transformed the tweet into a vector of length 128; the
embedding layer learns the word embeddings from the input tweets. I passed the embeddings
through a bidirectional LSTM layer containing 128 units. This was followed by another
bidirectional LSTM layer containing 256 units, with its recurrent dropout and regular dropout set
to 0.45 and a sigmoid activation. This was followed by a bidirectional
LSTM layer with 128 units for better learning, and finally by the output
layer of neurons with sigmoid activation, where each neuron predicts a label present
in the dataset.</p>
      <p>For sub-task 1, I trained a model containing 2 output neurons, predicting Humorous and
Not Humorous respectively. The model was compiled using the Adam optimization
algorithm with a learning rate of 0.0005, with binary cross-entropy as the loss
function. The working is depicted in Figure 1.</p>
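      <p>The layer stack described above can be sketched in Keras. VOCAB_SIZE and MAX_LEN are illustrative assumptions not stated in the paper, and the merged manually-extracted features are omitted for brevity; the embedding size, layer widths, dropout, activation, optimizer, learning rate and loss follow the text.</p>

```python
# Sketch of the sub-task 1 architecture, assuming Keras/TensorFlow.
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.optimizers import Adam

VOCAB_SIZE = 20000  # assumed vocabulary size
MAX_LEN = 50        # assumed padded tweet length

model = Sequential([
    Input(shape=(MAX_LEN,)),
    Embedding(VOCAB_SIZE, 128),                       # 128-length embeddings
    Bidirectional(LSTM(128, return_sequences=True)),  # first BiLSTM, 128 units
    Bidirectional(LSTM(256, return_sequences=True,    # second BiLSTM, 256 units
                       dropout=0.45, recurrent_dropout=0.45,
                       activation="sigmoid")),
    Bidirectional(LSTM(128)),                         # third BiLSTM, 128 units
    Dense(2, activation="sigmoid"),                   # Humorous / Not Humorous
])
model.compile(optimizer=Adam(learning_rate=0.0005),
              loss="binary_crossentropy", metrics=["accuracy"])
```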
      <p>For sub-task 2, I trained the same model, but the dense layer consisted of 5
neurons representing the five classes from a 1-star count to a 5-star
count. I obtained a probability prediction against each of the classes and
computed the final humor score as the weighted average given by the following
formula:</p>
      <p>S = (∑ᵢ₌₁⁵ pᵢ · i) / (∑ᵢ₌₁⁵ pᵢ)</p>
      <p>where
pᵢ = probability of getting an i-star humor rating, and
i = number of stars.
The model is compiled using the Adam optimization algorithm with a learning rate of
0.001. Categorical crossentropy is used as the loss function.</p>
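      <p>The weighted-average score above reduces to a one-line computation over the five class probabilities:</p>

```python
def humor_score(probs):
    # S = sum(p_i * i) / sum(p_i), where probs = [p_1, ..., p_5] are the
    # predicted probabilities of the 1- to 5-star classes.
    return sum(p * i for i, p in enumerate(probs, start=1)) / sum(probs)
```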
      <p>I noted that the dataset is highly skewed. When trained on the entire training
dataset without any validation, the model tended to overfit completely to the class with
the higher frequency, as this led to a higher accuracy score.</p>
      <p>To overcome this problem, I took two measures. First, the training data was
split into two parts, one for training and one for validation, comprising 70% and
30% of the dataset respectively. Training was stopped when the loss on the
validation set increased and the validation accuracy decreased for two consecutive
epochs.</p>
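      <p>This stopping criterion can be approximated with Keras's EarlyStopping callback; patience=2 halts training after two consecutive epochs without validation-loss improvement, which is an assumed approximation of the rule described above. x_train and y_train are placeholders.</p>

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop after two consecutive epochs in which validation loss does not
# improve, restoring the best weights seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=2,
                           restore_best_weights=True)
# model.fit(x_train, y_train, validation_split=0.3,
#           epochs=50, callbacks=[early_stop])
```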
      <p>Second, class weights were assigned to the different classes present in the data,
chosen to be proportional to the inverse of the respective class frequencies. The model
then gave equal weight to the skewed classes, which penalized the tendency to overfit
to the data.</p>
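      <p>The inverse-frequency weighting can be sketched as follows; the normalization constant (here the total number of labels) is an assumption, since only proportionality to the inverse frequency is specified.</p>

```python
from collections import Counter

def inverse_frequency_weights(labels):
    # Weight each class proportionally to the inverse of its frequency,
    # so rarer classes contribute more to the loss.
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / count for cls, count in counts.items()}
```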
    </sec>
    <sec id="sec-4">
      <title>4 Results</title>
      <p>I participated in sub-tasks 1 and 2 of Humor Analysis based on Human
Annotation (HAHA)-2019, and my system works quite well.</p>
      <p>I have included the automatically generated tables of evaluation metrics with my
results. The results are depicted in Tables 5-7.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion</title>
      <p>In this system report, I have presented a model that performs satisfactorily on the
given tasks. The model is based on a simple architecture. There is scope for
improvement: including more manually extracted features (like those removed in the
preprocessing step) could increase the performance. Moreover, the model is a
constrained system, which may lead to poor results given the modest size of the
data; related domain knowledge could be exploited to obtain better results. The use of
regularizers led to proper generalization of the model, thereby increasing my task
submission score.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poria</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hazarika</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kwok</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Senticnet 5: Discovering conceptual primitives for sentiment analysis by means of context embeddings</article-title>
          .
          <source>In: AAAI</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosá</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moncecchi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>A crowd-annotated Spanish corpus for humor analysis</article-title>
          .
          <source>In: Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media</source>
          . pp.
          <fpage>7</fpage>
          -
          <lpage>11</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etcheverry</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prada</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosá</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of HAHA at IberLEF 2019: Humor Analysis based on Human Annotation</article-title>
          .
          <source>In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceedings</source>
          , CEUR-WS, Bilbao, Spain (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Garain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The titans at SemEval-2019 task 5: Detection of hate speech against immigrants and women in twitter</article-title>
          .
          <source>In: Proceedings of the 13th International Workshop on Semantic Evaluation</source>
          . pp.
          <fpage>494</fpage>
          -
          <lpage>497</lpage>
          . Association for Computational Linguistics, Minneapolis, Minnesota, USA (Jun
          <year>2019</year>
          ), https://www.aclweb.org/anthology/S19-2088
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hochreiter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Long short-term memory</article-title>
          .
          <source>Neural Comput</source>
          .
          <volume>9</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          (
          <year>1997</year>
          ). https://doi.org/10.1162/neco.1997.9.8.1735
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strapparava</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Making computers laugh: Investigations in automatic humor recognition</article-title>
          .
          <source>In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing</source>
          . pp.
          <fpage>531</fpage>
          -
          <lpage>538</lpage>
          . HLT '05, Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>2005</year>
          ). https://doi.org/10.3115/1220575.1220642
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Sjöbergh</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Araki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Recognizing humor without recognizing meaning</article-title>
          . In:
          <string-name>
            <surname>Masulli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (eds.)
          <article-title>Applications of Fuzzy Sets Theory</article-title>
          . pp.
          <fpage>469</fpage>
          -
          <lpage>476</lpage>
          . Springer Berlin Heidelberg, Berlin, Heidelberg (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>