<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TECHSSN at HAHA @ IberLEF 2021: Humor Detection and Funniness Score Prediction using Deep Learning Techniques</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ayush Nanda</string-name>
          <email>ayush18031@cse.ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abrit Pal Singh</string-name>
          <email>abritpal18007@cse.ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aviansh Gupta</string-name>
          <email>aviansh18028@cse.ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rajalakshmi Sivanaiah</string-name>
          <email>rajalakshmis@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Angel Deborah Suseelan</string-name>
          <email>angeldeborahs@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S Milton Rajendram</string-name>
          <email>miltonrs@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mirnalinee T T</string-name>
          <email>mirnalineett@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Sri Sivasubramaniya Nadar College of Engineering</institution>
          ,
          <addr-line>Chennai - 603 110, Tamil Nadu</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes a system that classifies tweets in Spanish as humorous or not and rates the level of humor of each tweet. The system developed by the team TECHSSN uses binary classification techniques to classify the text as humorous or not (subtask 1) and an ensemble learning regression model to rate the funniness score of the tweet (subtask 2). The data undergoes preprocessing and is given to a modification of BERT [1] (Bidirectional Encoder Representations from Transformers) for subtask 1. The model is retrained, and the weights are learned for the dataset provided. An XGBoost ensemble model is applied to the BERT output to predict the funniness score for subtask 2. These systems were developed for the HAHA subtasks of IberLEF 2021.</p>
      </abstract>
      <kwd-group>
        <kwd>Humor Detection</kwd>
        <kwd>Spanish</kwd>
        <kwd>NLP</kwd>
        <kwd>BERT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Humor is an experience that makes a person happy or amused. Throughout history,
humans have studied it from psychological and linguistic perspectives, but seeing
it through the eyes of a computer, which essentially means finding patterns and
sequential repetitions in textual content, is a challenging task for the field of NLP.
One of the main reasons for this is the subjective nature of humor, as
the humorousness of a joke depends on factors such as the age, gender, and
cultural background of an individual. To make advancements in virtual assistants and
chatbots, the integration of automated humor detection has become a necessity; it
would make conversations between them and human users more convenient and
make their interactions more human-like. We have participated in subtask 1
(humor detection) and subtask 2 (funniness score prediction).</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Humor is a well-studied topic in the fields of psychology and linguistics. In
computer science, research is ongoing, and humor-recognition
systems are improving every year, giving us a better understanding
of the factors that make a conversation humorous.</p>
      <p>
        In their work Making Computers Laugh (2005)
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Mihalcea and Strapparava showed that automatic classification techniques can be successfully applied to the
task of humor recognition.
      </p>
      <p>
        The UO UPV system was developed for the Humor Analysis based on Human
Annotation (HAHA) track proposed at the IberEval 2018 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] workshop. The task focuses on
classifying tweets in Spanish as humorous or not and predicting how funny they are.
The system combines linguistic features with an attention-based recurrent
neural network, where the attention layer helps to calculate the contribution of each term
towards the targeted humor classes. This model achieves an accuracy of 84.55%.
      </p>
      <p>
        Santiago Castro et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], in an earlier iteration of this task, in IBERAMIA
2016's natural language processing subtask, built a crowd-sourced corpus of labeled
tweets, annotated according to their humor value, letting the annotators subjectively
decide which ones are humorous. An SVM classifier for Spanish tweets was
assembled based on supervised learning, reaching a precision of 84% and a recall of 69%.
      </p>
      <p>
        In the HAHA task of IberLEF 2019 (Chiruzzo et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]), the best classifier was
developed by the user adilism [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], who used the multilingual cased BERT-Base pretrained
model along with the fastai library to achieve an accuracy of 85.5% and a recall of
85.2%.
      </p>
      <p>
        Orion Weller and Kevin Seppi [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] presented a novel way of approaching this
problem by building a model that learns to identify humorous jokes based on ratings
learned from Reddit pages. A Transformer architecture was employed, using these
ratings to determine the level of humor. The model outperforms all previous work
on these tasks, with an F-measure of 93.1% on the Puns dataset and 98.6% on the
Short Jokes dataset.
      </p>
      <p>
        Omar Khattab and Matei Zaharia [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] developed a novel ranking model that
employs contextualized late interaction over deep language models for efficient retrieval.
The architecture maintains a high Mean Reciprocal Rank (MRR) at much lower
re-ranking latency (540 times lower) and far fewer FLOPs/query (48,600 times lower) compared
to BERT-Large.
      </p>
      <p>
        Other papers describe systems that detect humor in non-English
text, such as Ismailov et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for Iberian languages and Sushmitha Reddy Sane et
al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for Hindi-English texts.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>For the classification of text, we have chosen the BERT-Base multilingual model,
which has 12 layers, with a sigmoid activation function in the last layer,
as we are performing binary classification.</p>
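As an illustration (not the authors' code), the sigmoid output unit and the binary cross-entropy loss used for the binary decision can be sketched in NumPy; the logit value below is hypothetical:

```python
import numpy as np

def sigmoid(z):
    # Squashes the final-layer logit into a probability in (0, 1),
    # which is what the binary humor / non-humor decision needs.
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    # The subtask 1 loss: confident wrong predictions are penalized heavily.
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

logit = 1.2                                  # hypothetical last-layer output
prob = float(sigmoid(logit))                 # > 0.5, so classified as humorous
loss = binary_cross_entropy(np.array([1.0]), np.array([prob]))
```

A probability above 0.5 is read as "humorous"; the loss goes to zero as the predicted probability approaches the true label.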
      <sec id="sec-3-1">
        <title>Model Architecture</title>
        <p>The classification model uses a separate line of hidden layers specially designed to
extract features from each sentence. The model is a neural network that includes
two parallel lines of hidden layers: one to view the text as a whole and another to
view each sentence separately. Figure 1 displays the architecture of the proposed
method. It comprises a few general steps:
1. The sentences are separated and tokenized individually, so that each
sentence can be analysed separately.
2. To convert the text into proper numerical inputs for the neural network, the
sentences are encoded using BERT sentence embeddings. This step is performed individually on
each sentence and on the whole text (shown in Figure 1).
3. The resultant BERT sentence embeddings for each sentence from the
previous step are given as input to the partial hidden layers of the neural
network, whose purpose is to extract mid-level features for each sentence (which could be
related to context, type of sentence, etc.).
4. While our main idea is to detect relationships between sentences (especially with
the punchline), it is also necessary to examine word-level connections in the whole text
(such as synonyms and antonyms) that may have a meaningful impact on
determining the congruity of the text. As in the previous step, we feed BERT sentence
embeddings for the whole text into hidden layers of the neural network.
5. Finally, three sequential layers of the neural network conclude our model. These
final layers combine the outputs of all previous lines of hidden layers to produce the
final output. In theory, these final layers should determine the congruity of the
sentences and detect the transformation of the reader's viewpoint after reading the
punchline.
6. To predict the humor level (funniness score) of a tweet, we factor in the votes
(votes_no to votes_5) instead of the binary labels, which gives us a
5-dimensional vector that is given as input to an XGBoost regression
model to predict the humor rating of the given tweet.</p>
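The parallel-lines idea in steps 1-5 can be sketched at the shape level. The hidden size, mean pooling, and random inputs below are illustrative assumptions, not the paper's exact configuration; only the 768-dimensional BERT-Base embedding size is taken from the model named above:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, w, b):
    # One fully connected layer with ReLU, standing in for a "line" of hidden layers.
    return np.maximum(0.0, x @ w + b)

emb_dim, hid = 768, 64                       # 768 matches BERT-Base embeddings
w_sent, b_sent = 0.01 * rng.normal(size=(emb_dim, hid)), np.zeros(hid)
w_text, b_text = 0.01 * rng.normal(size=(emb_dim, hid)), np.zeros(hid)

sentences = rng.normal(size=(3, emb_dim))    # one BERT embedding per sentence
whole_text = rng.normal(size=(emb_dim,))     # BERT embedding of the full tweet

# Parallel paths: shared per-sentence features (pooled) and whole-text features.
sent_feats = dense_relu(sentences, w_sent, b_sent).mean(axis=0)
text_feats = dense_relu(whole_text, w_text, b_text)

# The three final sequential layers would consume this concatenated view.
combined = np.concatenate([sent_feats, text_feats])
```

The key design point is that the sentence path shares weights across sentences, while the whole-text path sees the tweet as a single unit; only the final layers mix the two views.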
      </sec>
      <sec id="sec-3-2">
        <title>Dataset Collection</title>
        <p>The dataset used for training our model is the one provided on the Codalab
competition page, Training Dataset (haha_2021_train.csv). Its fields are:
─ id - tweet id
─ text - tweet
─ is_humor - 0 or 1
─ votes_no, votes_1...votes_5 - 0-1
─ humor_rating - 1-5
─ humor_mechanism - {absurd, analogy, embarrassment, exaggeration, insults,
irony, misunderstanding, parody, reference, stereotype, unmasking}
─ humor_target - {age, body shaming, ethnicity/origin, family/relationships, health,
lgbt, men, professions, religion, self-deprecating, sexual aggressors, social status,
substance use, technology, women}
For pre-processing, the data is tokenized using the BERT tokenizer (pre-trained
on the BERT-Base multilingual model) and then undergoes stemming
(SnowballStemmer) and lemmatizing (WordNetLemmatizer). The tokenized input is
encoded into ids, masks, and segments for the transformer (BERT) to accept it as an input.
For the model we have chosen BERT-Base multilingual, which is pre-trained on
text in many languages, including Spanish. This model is compared against some existing techniques such as
Support Vector Machines (SVM), Decision Trees (DT) and Multinomial Naïve Bayes
(MNB).
For training, we loop over the folds in gkf (Group K-Fold) and train each
fold for 3 epochs with a learning rate of 3e-5 and a batch size of 6. As we have
performed binary classification for the humor detection task, we have set the loss
function to simple binary cross-entropy and chosen Adam as the optimizer.
For the second task the loss function is Mean Squared Error (MSE),
and the optimizer remains the same.</p>
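The encoding into ids, masks, and segments described above can be sketched with a toy vocabulary. The real system uses the pre-trained BERT-Base multilingual WordPiece tokenizer, so everything below (the vocabulary, the tokens, the maximum length) is an illustrative assumption:

```python
def encode(tokens, vocab, max_len=16):
    # BERT-style encoding: token ids, attention mask, and segment ids,
    # padded / truncated to a fixed length. `vocab` is a toy stand-in for the
    # multilingual WordPiece vocabulary used by the real tokenizer.
    ids = [vocab["[CLS]"]] + [vocab.get(t, vocab["[UNK]"]) for t in tokens] + [vocab["[SEP]"]]
    ids = ids[:max_len]
    mask = [1] * len(ids)          # 1 marks real tokens, 0 marks padding
    pad = max_len - len(ids)
    # Single-sentence input, so every segment id is 0.
    return ids + [0] * pad, mask + [0] * pad, [0] * max_len

vocab = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
         "hola": 7, "que": 8, "risa": 9}
ids, mask, segments = encode(["hola", "que", "risa"], vocab)
```

The three parallel lists (ids, mask, segments) are exactly the shapes the transformer expects as input.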
      </sec>
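The Group K-Fold loop mentioned above can be sketched as follows; `group_k_fold` is a minimal illustrative stand-in for the library routine, the group ids are made up, and only the hyperparameter values (3 epochs, learning rate 3e-5, batch size 6) come from the text:

```python
def group_k_fold(groups, k):
    # Minimal stand-in for a GroupKFold splitter: all samples sharing a group
    # id land entirely in the train split or the validation split, never both.
    uniq = sorted(set(groups))
    for i in range(k):
        held_out = set(uniq[i::k])
        val_idx = [j for j, g in enumerate(groups) if g in held_out]
        train_idx = [j for j, g in enumerate(groups) if g not in held_out]
        yield train_idx, val_idx

groups = [0, 0, 1, 2, 2, 3, 4, 4, 5]         # hypothetical group ids
splits = list(group_k_fold(groups, k=3))

# Hyperparameters reported in the text: 3 epochs per fold, lr 3e-5, batch size 6.
EPOCHS, LEARNING_RATE, BATCH_SIZE = 3, 3e-5, 6
```

Keeping whole groups on one side of each split prevents near-duplicate tweets from leaking between the training and validation folds.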
    </sec>
    <sec id="sec-4">
      <title>Results and Analysis</title>
      <p>As mentioned in section 3.2, the performance of our model (BERT) was tested against
various machine learning techniques (SVM, DT, MNB). These models were tested on
the test set with gold labels provided on the Codalab competition page.</p>
      <p>Table 2 shows the results of the various models used for subtask 1. From this table
we can infer a trend: all the models show high precision and low
recall. The model has a very low false positive rate and a moderate false negative rate,
as illustrated in Figure 2. The same trend is seen among the other models,
which means that correctly recognizing a humorous text is harder than
recognizing a non-humorous one. Figure 2 shows the confusion matrix for the BERT
model.</p>
      <p>Humor detection systems in Spanish with high accuracy can help serve the
Spanish-speaking audience on various social media platforms. They can make interaction with
chat-bots and virtual assistants affable. HAHA subtasks 1 and 2 of IberLEF 2021
involve classifying tweets in Spanish as humorous or not and rating their humor level
on a particular scale. We used a model built on top of BERT to
classify such texts as humorous or not. An XGBoost regression model is
used to predict the humor level or funniness score of a tweet.</p>
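The high-precision, low-recall trend can be made concrete by computing precision and recall from confusion-matrix counts. The counts below are hypothetical, chosen only to reproduce the shape of the trend, and are not the paper's results:

```python
def precision_recall(tp, fp, fn):
    # Precision: of tweets predicted humorous, the fraction truly humorous.
    # Recall: of truly humorous tweets, the fraction the model found.
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts (NOT the paper's numbers): few false positives but
# many false negatives give high precision and comparatively low recall.
p, r = precision_recall(tp=700, fp=50, fn=300)
```

With few false positives and many false negatives, precision stays high while recall drops, matching the pattern described above.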
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><given-names>Jacob</given-names> <surname>Devlin</surname></string-name>,
          <string-name><given-names>Ming-Wei</given-names> <surname>Chang</surname></string-name>,
          <string-name><given-names>Kenton</given-names> <surname>Lee</surname></string-name>, and
          <string-name><given-names>Kristina</given-names> <surname>Toutanova</surname></string-name>.
          <year>2018</year>.
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>.
          <source>North American Chapter of the Association for Computational Linguistics</source>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Mihalcea</surname>, <given-names>R.</given-names></string-name> and
          <string-name><surname>Strapparava</surname>, <given-names>C.</given-names></string-name>:
          <article-title>Making Computers Laugh: Investigations in Automatic Humor Recognition</article-title>.
          <source>In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05</source>,
          Association for Computational Linguistics, Vancouver, British Columbia, Canada
          (<year>2005</year>), pp.
          <fpage>531</fpage>-<lpage>538</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><given-names>Reynier</given-names> <surname>Ortega-Bueno</surname></string-name>,
          <string-name><given-names>Carlos E.</given-names> <surname>Muniz-Cuza</surname></string-name>,
          <string-name><given-names>José E. Medina</given-names> <surname>Pagola</surname></string-name>, and
          <string-name><given-names>Paolo</given-names> <surname>Rosso</surname></string-name>: UO UPV:
          <article-title>Deep linguistic humor detection in Spanish social media</article-title>.
          <source>In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), co-located with the 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018)</source>, pp.
          <fpage>204</fpage>-<lpage>213</lpage>.
        </mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>Castro</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Cubero</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Garat</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Moncecchi</surname>, <given-names>G.</given-names></string-name>
          (<year>2016</year>)
          <article-title>Is This a Joke? Detecting Humor in Spanish Tweets</article-title>. In:
          <string-name><surname>Montes y Gómez</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Escalante</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>Segura</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>Murillo</surname>, <given-names>J.</given-names></string-name> (eds)
          <source>Advances in Artificial Intelligence - IBERAMIA 2016. Lecture Notes in Computer Science</source>, vol
          <volume>10022</volume>. Springer, Cham, pp.
          <fpage>139</fpage>-<lpage>150</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><given-names>Luis</given-names> <surname>Chiruzzo</surname></string-name>,
          <string-name><given-names>Santiago</given-names> <surname>Castro</surname></string-name>,
          <string-name><given-names>Santiago</given-names> <surname>Góngora</surname></string-name>,
          <string-name><given-names>Aiala</given-names> <surname>Rosá</surname></string-name>,
          <string-name><surname>Meaney</surname>, <given-names>J. A.</given-names></string-name>, and
          <string-name><given-names>Rada</given-names> <surname>Mihalcea</surname></string-name>
          (<year>2021</year>).
          <article-title>Overview of HAHA at IberLEF 2021: Detecting, Rating and Analyzing Humor in Spanish</article-title>.
          <source>Procesamiento del Lenguaje Natural</source>, vol
          <volume>67</volume>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><given-names>Orion</given-names> <surname>Weller</surname></string-name> and
          <string-name><given-names>Kevin</given-names> <surname>Seppi</surname></string-name>:
          <article-title>Humor Detection: A Transformer Gets the Last Laugh</article-title>.
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>,
          <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><given-names>Peng-Yu</given-names> <surname>Chen</surname></string-name> and
          <string-name><given-names>Von-Wun</given-names> <surname>Soo</surname></string-name>.
          <year>2018</year>.
          <article-title>Humor recognition using deep learning</article-title>.
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>, Volume
          <volume>2</volume> (<issue>Short Papers</issue>), pp.
          <fpage>113</fpage>-<lpage>117</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><given-names>Omar</given-names> <surname>Khattab</surname></string-name> and
          <string-name><given-names>Matei</given-names> <surname>Zaharia</surname></string-name>.
          <year>2020</year>.
          <article-title>ColBERT: Efficient and effective passage search via contextualized late interaction over BERT</article-title>.
          <source>In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>, pp.
          <fpage>39</fpage>-<lpage>48</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ismailov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>Humor Analysis Based on Human Annotation Challenge at IberLEF</source>
          <year>2019</year>
          :
          <article-title>First-place Solution</article-title>
          .
          <source>In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF</source>
          <year>2019</year>
          ).
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS, Bilbao,
          <source>Spain (9</source>
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Sushmitha Reddy Sane, Suraj Tripathi, Koushik Reddy Sane, and
          <string-name>
            <given-names>Radhika</given-names>
            <surname>Mamidi</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Deep learning techniques for humor detection in hindi-english codemixed tweets</article-title>
          .
          <source>In Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          , pp
          <fpage>57</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>