<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>LaSTUS/TALN at HAHA: Humor Analysis based on Human Annotation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lutfiye Seda Mut Altin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Bravo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Horacio Saggion</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LaSTUS-TALN Research Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, C/Tanger 122-140</institution>
          ,
          <addr-line>08018 Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>145</fpage>
      <lpage>150</lpage>
      <abstract>
        <p>In this paper we describe the participation of the LaSTUS/TALN team in the shared task "Humor Analysis based on Human Annotation" (HAHA), organized in the context of IberLEF 2019 at the Spanish Society for Natural Language Processing (SEPLN). The objective of HAHA is the classification of tweets in Spanish as humorous or not, as well as the identification of their level of funniness. This paper presents a multi-task learning approach based on bidirectional long short-term memory (biLSTM) models, and presents and discusses the official results achieved by our team.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Spanish Language</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Humor is a complex phenomenon in human communication that results in
amusement or laughter. Although humans are very good at understanding humorous
language, computers still lack this essential capability. It is therefore important
to make progress in the area of humour recognition and understanding in order to pave
the way for better human-machine communication systems. Recent progress in
machine learning has produced interesting results in the field of humour
classification.</p>
      <p>
        In this paper, we describe a neural network for humor recognition within
the context of 'Humor Analysis based on Human Annotation' (HAHA) at
IberLEF2019 which is based on tweets written in Spanish [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The task is composed
of two sub-tasks: (1) humor detection, a binary classification of each tweet as
humorous or not humorous, and (2) funniness score prediction, a regression task that
predicts the average funniness score of the humorous tweets.
      </p>
      <p>In Section 2 of the paper we present an overview of related work. In
Section 3 we provide information about the data and give a description of our
model. In Section 4 we report the results and discuss the performance, and finally
in Section 5 we present the conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Previous research on humor recognition has mainly treated the
problem as a classification problem. Mihalcea et al. formulated humor
recognition as a classification task and used classifiers such as SVM
and Naive Bayes [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Purandare and Litman analyzed humorous conversations
from a well-known comedy television show using standard supervised classifiers
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Barbieri and Saggion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presented a machine learning approach based on a
linguistically motivated set of features which were also applied to irony detection
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Later on, Zhang and Liu worked on several categories of humor-related
features, feeding around fifty features into a Gradient Boosting Regression
Tree model for automatic humor recognition on Twitter data [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Radev et al.
described an experiment on humor detection in cartoon captions in which they
compared several automatic methods for selecting the funniest caption and stated
that negative sentiment, human-centeredness and lexical centrality match most
strongly with the funniest captions [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Yang et al. constructed different
computational classifiers to recognize humor, based on designed sets of features [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
More recently, Chen et al. presented a Convolutional Neural Network (CNN) for
humor recognition focusing on lexical cues and pointed out the advantages
of CNNs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Chen and Soo proposed a deep learning CNN architecture that can
learn to distinguish between humorous and non-humorous texts based on a
large-scale balanced dataset of positive and negative examples and reported that it
outperformed previous work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        On the other hand, there is also research focusing on humor ranking. The
shared task 'SemEval-2017 Task 6: #HashtagWars: Learning a Sense of Humor'
focused on humor ranking, i.e. defining the level of funniness, based on a dataset of
humorous tweets. The top-performing system used an ensemble of
both feature-based and neural network-based systems [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Data and Methodology</title>
      <p>
        The corpus provided by the shared task organizers consists of 30,000
crowd-annotated tweets based on [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], divided into 80% (24,000 tweets) for training
and 20% (6,000 tweets) for testing. The annotation was made with a
voting scheme in which users could select one of six options: the tweet is not
humorous or, if the tweet is humorous, an integer score between one (not
funny) and five (excellent). Finally, all tweets were classified as humorous or not
humorous. The humorous tweets were those that received at least three votes
indicating the tweet was somehow humorous, with at least five annotations in total. The
not humorous tweets were those that received at least three votes for not humorous
(they might have fewer than five votes in total). The corpus contains tweets from
every Spanish-speaking country, but the country of the user is not specified in
the dataset. Most tweets are written in the Spanish language spoken in Spain;
for that reason, we treated the corpus as containing tweets in that variety of Spanish.
      </p>
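      <p>As an illustration of this voting scheme, the following sketch (our own reading of
the description above; the vote encoding and function name are hypothetical and not part
of the released corpus tools) derives the binary label and an average funniness score
from a list of votes:</p>
      <preformat>
# Sketch of the labelling rule described above (our reading of the corpus
# description; the vote encoding is hypothetical: None = 'not humorous',
# an integer 1-5 = a funniness vote).
from statistics import mean

def label_tweet(votes):
    humor_votes = [v for v in votes if v is not None]
    not_humor_votes = [v for v in votes if v is None]
    if len(votes) >= 5 and len(humor_votes) >= 3:
        return "humorous", mean(humor_votes)   # funniness = average of the numeric votes
    if len(not_humor_votes) >= 3:
        return "not humorous", 0.0
    return "undecided", None                   # the tweet would need more annotations

print(label_tweet([3, 4, None, 5, 2]))   # ('humorous', 3.5)
print(label_tweet([None, None, None]))   # ('not humorous', 0.0)
      </preformat>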
      <p>In this work, we present a multi-task neural network based on a
bidirectional long short-term memory (biLSTM) model with two dense layers at the
end. We used data from different tasks in the context of the IberLEF 2019
evaluation which we believed could assist in humor identification (e.g. irony
detection, sentiment analysis). More specifically, we selected three tasks to train
simultaneously with HAHA (see the task registry sketched below):
– From the MEX-A3T task, we used the Aggressiveness Identification track, which
focuses on the detection of aggressive comments in tweets from Mexican
users.
– From the TASS 2019 task, which focused on the evaluation of polarity
classification systems for tweets written in Spanish, we used the data related to
opinion mining. The dataset consists of tweets written in the Spanish language
spoken in Spain, Peru, Costa Rica, Uruguay and Mexico, which were
annotated with 4 different levels of opinion intensity (Positive, Negative,
Neutral and None).
– From the IroSvA task, the first shared task fully dedicated to identifying the
presence of irony in short messages, we also used the training dataset, which
consists of 2,400 short messages annotated for irony for each of the Spanish variants
spoken in Cuba, Mexico and Spain.</p>
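      <p>As a concrete illustration, the auxiliary corpora can be organised in a simple task
registry that records, for each task, the Spanish variant of its tweets and the kind of
output it requires; the structure below is only a sketch (task keys, variant codes and
output labels are our own, and only a subset of the TASS variants is shown):</p>
      <preformat>
# Hypothetical task registry for the multi-task setup described above. Each entry
# records which Spanish-variant embedding layer the task uses and the output head
# it needs; only some of the TASS variants are listed for brevity.
TASKS = {
    "haha":      {"variant": "es", "output": "binary+regression"},  # humor + funniness
    "mex_a3t":   {"variant": "mx", "output": "binary"},             # aggressiveness
    "tass_es":   {"variant": "es", "output": "4-class"},            # polarity (Spain)
    "tass_mx":   {"variant": "mx", "output": "4-class"},            # polarity (Mexico)
    "irosva_cu": {"variant": "cu", "output": "binary"},             # irony (Cuba)
    "irosva_mx": {"variant": "mx", "output": "binary"},             # irony (Mexico)
    "irosva_es": {"variant": "es", "output": "binary"},             # irony (Spain)
}
      </preformat>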
      <p>In this scenario, we defined an embedding layer for each Spanish variant.
Classification tasks with the same Spanish variant used the same embedding
layer during the training process. For instance, the embedding layer related to
the Spanish from Mexico was used by the MEX-A3T task, the Mexican part
of the TASS 2019 task and the tweets written in the Spanish language spoken
in Mexico from IroSvA. Furthermore, all tasks shared the biLSTM layer during
training.</p>
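      <p>A minimal Keras sketch of this sharing scheme is shown below (our own
reconstruction, not the released code; the vocabulary size and variant codes are
placeholders, while the 100-dimensional embeddings and the 128 biLSTM units follow
the description given in this section):</p>
      <preformat>
# One embedding layer per Spanish variant, a single biLSTM shared by all tasks.
from tensorflow.keras import layers

VOCAB_SIZE = 50_000              # placeholder; in practice one vocabulary per variant
EMB_DIM = 100
VARIANTS = ["es", "mx", "cu"]    # extend with the other TASS variants as needed

# Embedding layers: tasks in the same Spanish variant reuse the same layer.
embeddings = {v: layers.Embedding(VOCAB_SIZE, EMB_DIM, mask_zero=True, name=f"emb_{v}")
              for v in VARIANTS}

# A single biLSTM whose weights are shared across every task and variant.
shared_bilstm = layers.Bidirectional(layers.LSTM(128, return_sequences=True),
                                     name="shared_bilstm")

def encode(token_ids, variant):
    """Encode a batch of token-id sequences for a task of the given variant."""
    return shared_bilstm(embeddings[variant](token_ids))
      </preformat>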
      <p>A simplified schema of our shared model can be seen in Figure 1. In the
following we explain how the model works for one specific classification task. In
order to train all tasks at the same time, we divided each dataset into the
same number of batches. Then, during training, a batch of data is randomly
selected and used to train its task-specific model (sharing the embedding and
biLSTM layers with the other models). In this sense, one epoch is completed when
all batches from all tasks have been trained on.</p>
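      <p>The batch-interleaving procedure can be sketched as follows (our reconstruction of
the description above, assuming every task exposes the same number of batches and a
compiled Keras model that shares the embedding and biLSTM layers):</p>
      <preformat>
import random

def train_one_epoch(task_batches, task_models):
    """task_batches: dict task -> list of (x, y) batches, all lists the same length.
    task_models:  dict task -> compiled model sharing the embedding/biLSTM layers."""
    # One entry per batch of every task, shuffled so tasks are interleaved randomly.
    pool = [(task, i) for task, batches in task_batches.items()
                      for i in range(len(batches))]
    random.shuffle(pool)
    # One epoch is over once every batch of every task has been trained on exactly once.
    for task, i in pool:
        x, y = task_batches[task][i]
        task_models[task].train_on_batch(x, y)
      </preformat>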
      <p>First, the text of the tweets was tokenized, removing punctuation marks and
keeping emoji and full hashtags, since they can contribute to defining the meaning
of a tweet (or short message).</p>
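      <p>The paper does not specify the tokenizer used; as a rough approximation of this
preprocessing, the sketch below splits on whitespace, drops punctuation-only tokens and
keeps hashtags and emoji intact:</p>
      <preformat>
import string

def tokenize(text):
    """Whitespace tokenizer that removes punctuation but keeps emoji and full
    hashtags (our approximation of the preprocessing described above)."""
    tokens = []
    for tok in text.lower().split():
        tok = tok.strip(".,;:!?\"'()[]")   # strip surrounding ASCII punctuation
        if tok and not all(ch in string.punctuation for ch in tok):
            tokens.append(tok)             # keeps '#hashtags' and emoji untouched
    return tokens

print(tokenize("Jajaja no puedo más 😂 #humor !!!"))
# ['jajaja', 'no', 'puedo', 'más', '😂', '#humor']
      </preformat>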
      <p>Second, the embedding layer transforms each element of the tokenized tweet
into a low-dimensional vector. The embedding layer, composed of the vocabulary
of the task, was randomly initialized from a uniform distribution (values between -0.8
and 0.8, with 100 dimensions). The initialized embedding layer was then
updated with the word vectors included in a pre-trained model from Regional
Embeddings, which provides FastText word embeddings for Spanish language
varieties. After this update, words not included in the pre-trained model keep
their random values.</p>
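      <p>A hedged sketch of how such an embedding matrix can be built is given below,
assuming a word-to-index vocabulary and the regional FastText vectors already loaded
into a dictionary-like object (loading details omitted; names are ours):</p>
      <preformat>
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim=100, seed=0):
    """vocab: dict word -> row index (index 0 reserved for padding).
    pretrained: mapping word -> 100-d vector, e.g. regional FastText embeddings."""
    rng = np.random.default_rng(seed)
    # Uniform(-0.8, 0.8) initialisation, as described above.
    matrix = rng.uniform(-0.8, 0.8, size=(len(vocab) + 1, dim))
    # Overwrite rows for which a pre-trained vector exists; out-of-vocabulary
    # words keep their random values.
    for word, idx in vocab.items():
        if word in pretrained:
            matrix[idx] = pretrained[word]
    return matrix

# The matrix would then initialise a variant-specific Embedding layer, e.g. with
# embeddings_initializer=tf.keras.initializers.Constant(matrix).
      </preformat>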
      <p>Then, a biLSTM layer, configured with 128 units, extracts high-level features
from the embeddings. A disadvantage of LSTM models is that they
compress all information into a fixed-length vector, which makes it difficult to
retain information from long tweets. To overcome this fixed-length-vector limitation
and keep relevant information from long tweet sequences, we added an attention layer
that produces a weight vector and merges the word-level features from each time step
into a tweet-level feature vector by multiplying them by the weight vector. Finally, the
tweet-level feature vector produced by the previous layers is used for the
classification task by two fully connected (dense) layers. In the case of the HAHA task,
the output of the classification task (humorous or not humorous) was
redirected to another output layer in order to learn the funniness score value, that
is, the regression task. At test time, if a tweet was classified as humorous, the
predicted funniness score was kept; otherwise the score was set to 0.</p>
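      <p>For concreteness, the sketch below shows one possible implementation of the HAHA
head (tf.keras); the attention layer is a standard score-and-softmax formulation and the
way the classification output feeds the funniness head is our reading of the description
above, not the exact released architecture (the size of the first dense layer is a
placeholder):</p>
      <preformat>
import tensorflow as tf
from tensorflow.keras import layers

class Attention(layers.Layer):
    """Collapses word-level biLSTM features (batch, time, 256) into a single
    tweet-level vector via a learned softmax weighting over time steps."""
    def build(self, input_shape):
        self.w = self.add_weight(name="att_w", shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform")

    def call(self, h):
        weights = tf.nn.softmax(tf.matmul(h, self.w), axis=1)   # (batch, time, 1)
        return tf.reduce_sum(weights * h, axis=1)               # (batch, features)

def haha_head(bilstm_out):
    """bilstm_out: output of the shared biLSTM for a HAHA batch."""
    v = Attention()(bilstm_out)                       # tweet-level feature vector
    v = layers.Dense(64, activation="relu")(v)        # first dense layer (size is a guess)
    is_humor = layers.Dense(1, activation="sigmoid", name="humor")(v)
    # The classification output is redirected to a second output layer that
    # learns the funniness score (the regression task).
    score = layers.Dense(1, name="funniness")(layers.concatenate([v, is_humor]))
    return is_humor, score

# At test time, the predicted funniness score is kept only for tweets classified
# as humorous; otherwise the score is set to 0.
      </preformat>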
      <p>Moreover, to mitigate overfitting, we applied dropout regularization. The
dropout operation randomly sets a proportion of the hidden units to zero during
forward propagation, creating more generalizable representations of the data. In the
model, we employ dropout on the embedding and biLSTM layers.
The dropout rate was set to 0.5 in all cases.</p>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>In sub-task 1, we ranked 10th with an F-score of 0.759 (precision of 0.774 and
recall of 0.745) and an accuracy of 0.816. In sub-task 2, we ranked 7th with a root
mean squared error of 0.919 (see Table 1).</p>
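      <p>As a quick sanity check of these figures (standard F-score formula; not taken from
the task report):</p>
      <preformat>
# F-score from the reported precision and recall: 2PR / (P + R)
p, r = 0.774, 0.745
print(round(2 * p * r / (p + r), 3))   # 0.759
      </preformat>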
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, we have presented our results from our participation in the HAHA
task at IberLEF 2019. We investigated multi-task learning with neural
networks trained on different tasks. Our results improved on the baselines presented by
the organizers and also on the average scores achieved by all participants. Due to
time constraints, we were not able to perform an error analysis; for that reason,
in future work we will carry out a detailed error analysis in order to understand
the limitations of our approach. Furthermore, we want to test different types of
neural networks (e.g. convolutional layers or combinations of convolutional and LSTM
layers) and share more layers between tasks. Finally, we also consider that the
integration of linguistic features (e.g. word frequency, POS tags and word shape)
and metadata (e.g. whether a tweet is a response to another tweet) can provide
useful contextual information to improve our performance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
          </string-name>
          , H.:
          <article-title>Automatic detection of irony and humour in twitter</article-title>
          .
          <source>In: Proceedings of the Fifth International Conference on Computational Creativity</source>
          , Ljubljana, Slovenia, June 10-13,
          <year>2014</year>
          . pp.
          <volume>155</volume>
          –
          <issue>162</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Barbieri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
          </string-name>
          , H.:
          <article-title>Modelling irony in twitter</article-title>
          .
          <source>In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014, April 26-30</source>
          ,
          <year>2014</year>
          , Gothenburg, Sweden. pp.
          <volume>56</volume>
          –
          <issue>64</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moncecchi</surname>
          </string-name>
          , G.:
          <article-title>A crowd-annotated spanish corpus for humor analysis</article-title>
          .
          <source>In: Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media</source>
          . pp.
          <volume>7</volume>
          –
          <issue>11</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          :
          <article-title>Convolutional neural network for humor recognition</article-title>
          .
          <source>arXiv preprint arXiv:1702.02584</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
          </string-name>
          , P.Y.,
          <string-name>
            <surname>Soo</surname>
            ,
            <given-names>V.W.</given-names>
          </string-name>
          :
          <article-title>Humor recognition using deep learning</article-title>
          .
          <source>In: NAACL-HLT</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etcheverry</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prada</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          : Overview of HAHA at IberLEF 2019:
          <article-title>Humor Analysis based on Human Annotation</article-title>
          .
          <source>In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF</source>
          <year>2019</year>
          ).
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS, Bilbao, Spain (September
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strapparava</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Making computers laugh: Investigations in automatic humor recognition</article-title>
          .
          <source>In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>531</volume>
          –
          <fpage>538</fpage>
          .
          Association for Computational Linguistics (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Potash</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romanov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rumshisky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SemEval-2017 Task 6: #HashtagWars: Learning a sense of humor</article-title>
          .
          <source>In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)</source>
          . pp.
          <volume>49</volume>
          –
          <issue>57</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Purandare</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          : Humor:
          <article-title>Prosody analysis and automatic recognition for F*R*I*E*N*D*S*</article-title>
          .
          <source>In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>208</volume>
          –
          <fpage>215</fpage>
          .
          Association for Computational Linguistics (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stent</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tetreault</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pappu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iliakopoulou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chanfreau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , de Juan,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Vallmitjana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jaimes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Jha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Mankoff</surname>
          </string-name>
          , R.:
          <article-title>Humor in collective discourse: Unsupervised funniness detection in the new yorker cartoon caption contest</article-title>
          .
          <source>In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC</source>
          <year>2016</year>
          ). pp.
          <volume>475</volume>
          –
          <fpage>479</fpage>
          .
          European Language Resources Association
          (ELRA), Portoroz, Slovenia (May
          <year>2016</year>
          ), https://www.aclweb.org/anthology/L16-1076
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lavie</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
          </string-name>
          , E.:
          <article-title>Humor recognition and humor anchor extraction</article-title>
          .
          <source>In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>2367</volume>
          –
          <issue>2376</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , R., Liu, N.:
          <article-title>Recognizing humor on twitter</article-title>
          .
          <source>In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management</source>
          . pp.
          <volume>889</volume>
          –
          <fpage>898</fpage>
          .
          ACM
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>