Humor Detection in Spanish Tweets Using
                 Neural Network

    Rida Miraj (1,2) [0000-0002-1605-0403] and Masaki Aono (1,3) [0000-0003-1383-1076]

         (1) Toyohashi University of Technology, Toyohashi, Japan
                         (2) ridamiraj974@gmail.com
                               (3) aono@tut.jp


        Abstract. Humor, with its linguistic, social, and psychological
        components, has long been a part of human life. Enabling computers to
        interpret humor has become important because of its wide range of uses
        and its growing popularity on social media platforms. While humor has
        long been studied from psychological, cognitive, and linguistic
        perspectives, it has received comparatively little attention in
        computational linguistics. This paper describes our contribution to
        HAHA@IberLEF2021: Humor Analysis based on Human Annotation. We propose
        a deep neural network-based approach that uses a multi-kernel
        convolutional recurrent neural network model for humor detection in
        tweets. We examine our method's performance and show how each component
        of our design contributes to its overall success.

        Keywords: Humor · Recurrent Neural Network · Text · CNN · tweets.


1     Introduction
Humor is a universal and subtle emotion. Most prior research on humor has
focused on binary classification or on identifying linguistic features.
Purandare and Litman used standard supervised learning classifiers to recognize
humorous speech in spoken dialogue from a well-known comedy television
show [18]. Taylor and Mazlack used a methodology based on extracting the
structural patterns and peculiar structure of jokes [20]. De Oliveira and
Láinez applied recurrent neural networks (RNNs) and convolutional neural
networks (CNNs) to humor detection in reviews from the Yelp dataset [9].
   Because tweets are brief, informal, user-generated texts that typically do
not follow grammatical standards, detecting humor in them poses particular
problems for the research community. Furthermore, tweets feature a plethora of
unique abbreviations as well as Twitter-specific syntax such as hashtags and
emojis. To address the challenge of humor detection, Chiruzzo et al. [8]
presented a task that focuses on detecting humor in Spanish tweets. The present
HAHA evaluation campaign proposes various subtasks related to automated humor
detection. In this paper, we propose a neural network method that combines
multi-kernel convolution with a Bi-LSTM, and we report the results that our
framework obtained on the provided Spanish tweets.

    IberLEF 2021, September 2021, Málaga, Spain.
    Copyright © 2021 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).

    The rest of the paper is structured as follows. Section 2 summarizes
previous studies. Section 3 introduces our proposed humor detection
framework. Section 4 presents the experiments and evaluation. Section 5 gives
concluding remarks.


2   Related Research


A considerable amount of work on humor identification has been done over the
years, including statistical and N-gram analysis [20], regression trees [18],
Word2Vec combined with K-NN Human Centric Features, and convolutional neural
networks [6]. Neural networks work very well when the number of features is
limited. To handle variable-length sequences, connections to prior states can
be introduced into the network, as in recurrent neural networks. To distinguish
jokes from non-jokes, several humor detection algorithms rely on hand-crafted
(typically word-based) features [21, 12, 3, 15]. Such word-based features work
well when the non-joke dataset contains terms that are entirely distinct from
those in the humor dataset. According to humor theory, however, the order of
words matters: announcing the punchline before the setup would merely lead to
the discovery of the second interpretation of the joke, causing the joke to
lose its humorous component [19].
   Figurative language, and humor in particular, has for some years been a
fruitful field of research in shared tasks. SemEval-2015 Task 11 [10] addressed
one of the problems posed by figurative language such as metaphor and irony:
its influence on sentiment analysis. SemEval-2017 Task 6 [17] supplied
participants with humorous tweets sent to a comedy show and asked them to
predict how the audience and producers of the show would rate the tweets. The
HAHA task was organized by Grupo PLN-InCo at two earlier conferences: IberEval
2018 [5] and IberLEF 2019 [7]. There were two subtasks: humor detection and
prediction of a funniness score. SemEval-2021 Task 7 [11] is a newer task that
combines humor detection with offense detection. It includes all of the
subtasks from HAHA 2018 and 2019 [7], plus two new ones: offense score
prediction and controversial humor classification. HAHA 2021 [8] comprised four
subtasks, of which we participated in two.


3     Framework
In this section, we describe our proposed framework for humor detection in
Spanish tweets. Figure 1 gives an overview of the framework.
   We use pre-trained word vectors for Spanish to initialize the word
embeddings. The resulting embedding matrix is fed into the embedding layer of
our neural network. We first extract higher-level feature sequences from the
tweet embeddings using multi-kernel convolution filters. These feature
sequences are then passed to the Bi-LSTM connected to the convolution module.
In the following, we describe each component in detail.


   [Figure 1: word embedding (dimensions = 300, vectors = 1,000,653)
    -> multi-kernel CNN (kernel sizes 3, 4, 5) -> pooling layer
    -> feature vector -> Bi-LSTM layer (forward LSTM layer and backward
    LSTM layer) -> dense layer -> results]

                              Fig. 1. Proposed framework.


3.1   Embedding
By default, the Embedding layer in Figure 1 starts from random weights and
learns an embedding for every word in the training dataset. This flexible layer
is defined as the first hidden layer of the network. Using a pre-trained model
in the embedding layer requires only a file containing tokens and their
associated word vectors. The pre-trained Spanish model that we use provides
300-dimensional word vectors. For an input sentence, the embedding matrix
therefore has size L x D, where L is the sentence length and D is the
word-vector dimension.
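
The lookup from a token/vector file to an L x D matrix can be sketched as follows. This is a minimal illustration, not the authors' code: the "token v1 v2 ... vD" text layout, the toy 3-dimensional vectors, the `<pad>` token, and the helper names `load_vectors` and `embed_sentence` are all assumptions made for the example (the real vectors are 300-dimensional).

```python
from typing import Dict, List

def load_vectors(lines: List[str]) -> Dict[str, List[float]]:
    """Parse 'token v1 v2 ... vD' lines into a token -> vector mapping."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def embed_sentence(tokens: List[str], vectors: Dict[str, List[float]],
                   length: int, dim: int) -> List[List[float]]:
    """Return an L x D matrix; unknown tokens and padding map to zero vectors."""
    zero = [0.0] * dim
    padded = tokens[:length] + ["<pad>"] * max(0, length - len(tokens))
    return [list(vectors.get(tok, zero)) for tok in padded]

# Toy stand-in for the pre-trained vector file (hypothetical values).
toy_file = ["hola 0.1 0.2 0.3", "mundo 0.4 0.5 0.6"]
vectors = load_vectors(toy_file)
matrix = embed_sentence(["hola", "mundo", "jaja"], vectors, length=4, dim=3)
```

Here `matrix` has L = 4 rows and D = 3 columns; the out-of-vocabulary token and the padding position are both represented by zero vectors, which is one common convention among several.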

3.2   Convolutional Neural Network
We follow the technique of Kim [13] to extract higher-level features with
multi-kernel convolution. The input of this module is the embedding matrix
created in the embedding layer. We apply convolution filters of three distinct
kernel sizes (3, 4, and 5) to this matrix. Each filter produces a corresponding
feature map, and a max-pooling function is then applied to the feature maps to
build a univariate feature vector. Finally, the feature vectors of all kernels
are concatenated into a single high-level feature vector.
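
The convolve, pool, and concatenate steps described here can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the ReLU activation, the toy 2-dimensional embeddings, the all-ones filter weights, and the function names `conv1d` and `multi_kernel_features` are choices made for the example.

```python
from typing import Dict, List

Matrix = List[List[float]]

def conv1d(embedded: Matrix, kernel: Matrix) -> List[float]:
    """Slide one (k x D) filter over an (L x D) matrix.

    Returns L - k + 1 activations; ReLU is an assumption of this sketch.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(embedded) - k + 1):
        total = 0.0
        for offset in range(k):
            for x, w in zip(embedded[start + offset], kernel[offset]):
                total += x * w
        feature_map.append(max(0.0, total))
    return feature_map

def multi_kernel_features(embedded: Matrix,
                          filters_by_size: Dict[int, List[Matrix]]) -> List[float]:
    """Max-pool each filter's feature map, then concatenate across kernel sizes.

    Assumes the sentence length is at least the largest kernel size.
    """
    features = []
    for size in sorted(filters_by_size):
        for kernel in filters_by_size[size]:
            features.append(max(conv1d(embedded, kernel)))
    return features

# Toy input: L = 6 tokens, D = 2 dimensions (hypothetical values).
embedded = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
# One all-ones filter per kernel size 3, 4, and 5.
filters = {k: [[[1.0, 1.0] for _ in range(k)]] for k in (3, 4, 5)}
features = multi_kernel_features(embedded, filters)
```

With all-ones filters, each activation is simply the sum of the embedding values in the window, so `features` holds the largest window sum for each kernel size.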

3.3   Bi-LSTM
Bidirectional Long Short-Term Memory (Bi-LSTM) is a bidirectional variant
of LSTM, shown in the center of Figure 1. A Bi-LSTM combines a forward and a
backward hidden layer, giving access to both the preceding and the subsequent
context. We use the Bi-LSTM to obtain a vector representation of the input
sentence that captures its semantics effectively. The final output of the
Bi-LSTM is formed by merging the outputs of both recurrent hidden layers,
namely the forward and backward layers.
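
To make the forward/backward merge concrete, the following sketch runs one LSTM cell over the sequence and a second cell over the reversed sequence, then concatenates their final hidden states. It is an illustration only: the weights are random and untrained, concatenation is assumed as the merge operation, and the names `LSTMCell` and `bilstm_encode` as well as the toy dimensions are hypothetical.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class LSTMCell:
    """A single LSTM cell with randomly initialized (untrained) weights."""

    def __init__(self, input_dim: int, hidden_dim: int, rng: random.Random):
        self.hidden_dim = hidden_dim
        # One weight row per hidden unit over [x, h, 1]; the last entry is the bias.
        width = input_dim + hidden_dim + 1
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                   for _ in range(hidden_dim)]
            for gate in ("input", "forget", "output", "candidate")
        }

    def _affine(self, gate: str, z: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, z)) for row in self.weights[gate]]

    def step(self, x: Vector, h: Vector, c: Vector) -> Tuple[Vector, Vector]:
        z = x + h + [1.0]
        i = [sigmoid(a) for a in self._affine("input", z)]
        f = [sigmoid(a) for a in self._affine("forget", z)]
        o = [sigmoid(a) for a in self._affine("output", z)]
        g = [math.tanh(a) for a in self._affine("candidate", z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, sequence: List[Vector]) -> Vector:
        """Process the whole sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            h, c = self.step(x, h, c)
        return h

def bilstm_encode(sequence: List[Vector], forward: LSTMCell,
                  backward: LSTMCell) -> Vector:
    """Concatenate the final state of a forward pass and of a reversed pass."""
    return forward.run(sequence) + backward.run(list(reversed(sequence)))

rng = random.Random(0)
forward = LSTMCell(input_dim=2, hidden_dim=3, rng=rng)
backward = LSTMCell(input_dim=2, hidden_dim=3, rng=rng)
sequence = [[0.5, -1.0], [1.0, 0.0], [0.0, 2.0]]
encoding = bilstm_encode(sequence, forward, backward)
```

The resulting `encoding` has twice the hidden size, because the forward and backward summaries are placed side by side.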

3.4   Humor Classification and Prediction
We obtain our results from the last linear layer of the model. We use binary
cross-entropy and mean squared error (MSE) as the loss functions for
sub-task 1 and sub-task 2, respectively. We learn the model parameters by
stochastic gradient-based optimization with the Adam optimizer [14].
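
For reference, the two loss functions named here can be written out as below. This is a generic textbook formulation rather than the authors' code; the clipping constant `eps` and the example inputs are assumptions of the sketch.

```python
import math
from typing import Sequence

def binary_cross_entropy(y_true: Sequence[int], y_prob: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the squared differences between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

# Classification example: an uninformative prediction of 0.5 costs log(2) per item.
bce_value = binary_cross_entropy([1, 0], [0.5, 0.5])
# Regression example: errors of 1 and 2 give (1 + 4) / 2.
mse_value = mean_squared_error([1.0, 3.0], [2.0, 5.0])
```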


       Table 1. Results for sub-task 1 (F1) and sub-task 2 (RMSE), compared
       with other teams' results.

                         Team             F1       RMSE
                         Baseline         0.6493   0.6532
                         Our Framework    0.7441   1.5164
                         Jocoso           0.8850   0.6296
                         icc              0.8716   0.6853
                         RoBERToCarlos    0.7961   0.8602

4     Evaluation

4.1   Dataset

The organizers provide a corpus of crowd-annotated tweets divided into three
subsets for tasks 1 and 2: training (24,000 tweets), development (4,000 tweets),
and testing (6,000 tweets). The annotation uses a voting system in which users
choose from six options: a tweet is either not humorous, or humorous with a
score ranging from one (not funny) to five (excellent).
    To prepare the data, we removed stop words using NLTK's standard stoplist,
removed special characters, and performed hashtag segmentation using the
hashtag segmentation tool [2]. A fixed sentence length was set before the
embedding step.
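
The preprocessing steps above can be sketched as follows. This is a simplified stand-in, not the pipeline actually used: the tiny stop-word set replaces NLTK's list, a camel-case splitter replaces the hashtag segmentation tool [2], and padding or truncating to `max_len` with a `<pad>` token is an assumed way of fixing the sentence length.

```python
import re
from typing import List

# Tiny illustrative stop-word set (an assumption); the paper uses NLTK's list.
STOP_WORDS = {"de", "la", "el", "que", "y", "en", "a", "los", "un", "una"}

def segment_hashtag(tag: str) -> str:
    """Split a camel-case hashtag body into words: 'BuenosDias' -> 'Buenos Dias'."""
    return re.sub(r"(?<=[a-záéíóúñ])(?=[A-ZÁÉÍÓÚÑ])", " ", tag)

def preprocess(tweet: str, max_len: int, pad: str = "<pad>") -> List[str]:
    # Replace each hashtag with its segmented words.
    text = re.sub(r"#(\w+)", lambda m: segment_hashtag(m.group(1)), tweet)
    # Drop special characters: keep only word characters and whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # Lowercase, tokenize on whitespace, and remove stop words.
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    # Truncate or pad to the fixed length expected by the embedding layer.
    tokens = tokens[:max_len]
    return tokens + [pad] * (max_len - len(tokens))

tokens = preprocess("¡Qué risa! #BuenosDias a los que madrugan :)", max_len=6)
```

Note that the accented "qué" survives because only the unaccented "que" is in the toy stop-word set; a real stop-word list and tokenizer would need a consistent policy on accents.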


4.2   Model Configuration

This section describes the parameters used in our framework during the
experiments. We used one embedding model to initialize the word embeddings in
the embedding layer. The embedding model contains 1,000,653 vectors of 300
dimensions and was trained on the Spanish Billion Word Corpus, which contains
1.4 billion words [4]. For the multi-kernel convolution, we employed three
kernel sizes (3, 4, and 5) and set the number of filters to 36. We implemented
our model in TensorFlow [1] and trained it on a GPU [16] to benefit from the
efficient parallel computation of tensors. We trained the model for a maximum
of 50 epochs with a batch size of 64, using the Adam optimizer with an initial
learning rate of 0.001. The results reported in this paper are based on these
settings. Unless otherwise stated, default settings were used for the other
parameters.
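
The reported settings can be collected in one place, for example as a small configuration object. The class and field names below are hypothetical; only the values come from the text. The derived feature size assumes one pooled value per filter and 36 filters per kernel size, which the paper does not state explicitly.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ModelConfig:
    embedding_dim: int = 300
    num_pretrained_vectors: int = 1_000_653
    kernel_sizes: Tuple[int, ...] = (3, 4, 5)
    filters_per_kernel: int = 36
    max_epochs: int = 50
    batch_size: int = 64
    learning_rate: float = 0.001

    def steps_per_epoch(self, num_examples: int) -> int:
        """Number of mini-batches needed to see every example once."""
        return math.ceil(num_examples / self.batch_size)

    def pooled_feature_size(self) -> int:
        """Length of the concatenated vector, assuming one value per filter."""
        return len(self.kernel_sizes) * self.filters_per_kernel

config = ModelConfig()
train_steps = config.steps_per_epoch(24_000)  # training set size from Section 4.1
feature_size = config.pooled_feature_size()
```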


4.3   Results and Analysis

Our goal is to classify Spanish tweets as humorous or not humorous. Table 1
first reports the baseline results for sub-task 1 and sub-task 2, which are
based on Naive Bayes with tf-idf features and SVM regression with tf-idf
features, respectively. It then reports the results of our proposed framework
as submitted to the competition. After the competition, we changed some
parameters to see whether the results could be improved, and obtained 77.453
percent F1 for sub-task 1 and an RMSE of 0.6977 for sub-task 2, although these
results could not be submitted to the competition.
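
For readers unfamiliar with the two metrics in Table 1, a minimal sketch of how they are computed follows. It assumes that F1 is measured on the positive (humorous) class; the toy labels and scores are invented for the example and are unrelated to the reported results.

```python
import math
from typing import Sequence

def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between gold and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy classification example: 2 true positives, 1 false positive, 1 false negative.
f1_value = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# Toy regression example: squared errors 4 and 0.
rmse_value = rmse([3.0, 1.0], [1.0, 1.0])
```

A higher F1 is better, whereas a lower RMSE is better, which is why the two columns of Table 1 should be read in opposite directions.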


5     Conclusion

This paper described our technique for HAHA@IberLEF2021: Humor Analysis based
on Human Annotation. Humor detection is a difficult task, and we used deep
learning techniques to address it. We also ran tests with other models, such
as a basic regression model and a multi-layer perceptron; however, the model
described in this paper produced the best results. In summary, the key
contribution of our unified framework is that it successfully learns
contextual information, which improves humor detection performance. In the
future, we want to leverage external data to generalize our model for humor
detection in the same domain.


Acknowledgments
This research was supported by the Japan International Cooperation Agency
(JICA) under the Innovative Asia program.


References
 1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghe-
    mawat, S., Irving, G., Isard, M., et al.: Tensorflow: A system for large-scale machine
    learning. In: 12th USENIX Symposium on Operating Systems Design and
    Implementation (OSDI 16). pp. 265–283 (2016)
 2. Baziotis, C., Pelekis, N., Doulkeridis, C.: DataStories at SemEval-2017 task 4: Deep
    LSTM with attention for message-level and topic-based sentiment analysis. In:
    Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-
    2017). pp. 747–754. Association for Computational Linguistics, Vancouver, Canada
    (Aug 2017). https://doi.org/10.18653/v1/S17-2126,
    https://www.aclweb.org/anthology/S17-2126
 3. van den Beukel, S., Aroyo, L.: Homonym detection for humor recognition in short
    text. In: Proceedings of the 9th Workshop on Computational Approaches to Sub-
    jectivity, Sentiment and Social Media Analysis. pp. 286–291 (2018)
 4. Cardellino, C.: Spanish Billion Words Corpus and Embeddings (August 2019),
    https://crscardellino.github.io/SBWCE/
 5. Castro, S., Chiruzzo, L., Rosá, A.: Overview of the HAHA task: Humor analysis
    based on human annotation at IberEval 2018. In: IberEval@ SEPLN. pp. 187–194
    (2018)
 6. Chen, P.Y., Soo, V.W.: Humor Recognition Using Deep Learning pp. 113–117
    (2018). https://doi.org/10.18653/v1/n18-2018
 7. Chiruzzo, L., Castro, S., Etcheverry, M., Garat, D., Prada, J.J., Rosá, A.: Overview
    of HAHA at IberLEF 2019: Humor analysis based on human annotation. In: IberLEF@
    SEPLN. pp. 132–144 (2019)
 8. Chiruzzo, L., Castro, S., Góngora, S., Rosá, A., Meaney, J.A., Mihalcea, R.:
    Overview of HAHA at IberLEF 2021: Detecting, Rating and Analyzing Humor
    in Spanish. Procesamiento del Lenguaje Natural 67(0) (2021)
 9. De Oliveira, L., Rodrigo, A.L.: Humor detection in Yelp reviews. Retrieved on
    December 15, 2019 (2015)
10. Ghosh, A., Li, G., Veale, T., Rosso, P., Shutova, E., Barnden, J., Reyes, A.:
    SemEval-2015 task 11: Sentiment analysis of figurative language in Twitter. In:
    Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval
    2015). pp. 470–478. Association for Computational Linguistics, Denver, Colorado
    (Jun 2015). https://doi.org/10.18653/v1/S15-2080,
    https://www.aclweb.org/anthology/S15-2080
11. Gupta, A., Pal, A., Khurana, B., Tyagi, L., Modi, A.: Humor@IITK at SemEval-2021
    task 7: Large language models for quantifying humor and offensiveness (04 2021)
12. Kiddon, C., Brun, Y.: That’s what she said: double entendre identification. In:
    Proceedings of the 49th annual meeting of the association for computational lin-
    guistics: Human language technologies. pp. 89–94 (2011)
13. Kim, Y.: Convolutional Neural Networks for Sentence Classification pp. 1746–1751
    (2014)
14. Kingma, D.P., Ba, J.L.: Adam: A method for stochastic optimization pp. 1–15 (2015)
15. Mihalcea, R., Strapparava, C.: Making computers laugh: Investigations in auto-
    matic humor recognition. In: Proceedings of Human Language Technology Confer-
    ence and Conference on Empirical Methods in Natural Language Processing. pp.
    531–538 (2005)
16. Owens, J.D., Houston, M., Luebke, D., Green, S., Stone, J.E., Phillips, J.C.: GPU
    computing. Proceedings of the IEEE 96(5), 879–899 (2008)
17. Potash, P., Romanov, A., Rumshisky, A.: SemEval-2017 task 6: #HashtagWars:
    Learning a sense of humor. In: Proceedings of the 11th International Workshop on
    Semantic Evaluation (SemEval-2017). pp. 49–57. Association for Computational
    Linguistics, Vancouver, Canada (Aug 2017).
    https://doi.org/10.18653/v1/S17-2004, https://www.aclweb.org/anthology/S17-2004
18. Purandare, A., Litman, D.: Humor: Prosody analysis and automatic recognition for
    F*R*I*E*N*D*S*. In: Proceedings of the 2006 Conference on Empirical Methods
    in Natural Language Processing. pp. 208–215 (2006)
19. Ritchie, G.: Developing the incongruity-resolution theory. Tech. rep. (1999)
20. Taylor, J.M., Mazlack, L.J.: Computationally Recognizing Wordplay in Jokes The-
    ories of Humor (1991) (2000)
21. Taylor, J.M., Mazlack, L.J.: Computationally recognizing wordplay in jokes. In:
    Proceedings of the Annual Meeting of the Cognitive Science Society. vol. 26 (2004)