Identifying Ironic Content Spreaders on Twitter using Psychometrics, Contextual and Ironic Features with Gradient Boosting Classifier

Notebook for PAN at CLEF 2022

Ehsan Tavan*1, Maryam Najafi*1 and Reza Moradi*1
1 NLP Department, Part AI Research Center, Tehran, Iran

Abstract
The study of irony detection on social networks has gained much attention in recent years. As part of this task, a collection of users' tweets is provided, and the goal is to determine whether these users are spreaders of irony or not. Hypothesizing that user-generated content is affected by the author's psychometric characteristics, contextual information, and irony features in the text, we investigated the effects of incorporating this information to identify ironic spreaders. Using emotion and personality detection modules, we extracted the author's psychometric features. A pre-trained language model based on SBERT and a T5-based architecture were employed to extract context-based features. Our paper describes a framework that feeds the author's psychometric, contextual, and ironic features into a Gradient Boosting classifier. Experimental results in this paper demonstrate the importance of this combination in identifying ironic spreaders. As a result, we achieved accuracies of 95.00% and 93.81% with 5-fold and 10-fold cross-validation, respectively, on the IROSTEREO training dataset. On the official PAN test set, our system attained an accuracy of 88.89%.

Keywords
Stereotype, Author Profiling, Irony Detection, Sarcasm Detection, Language Psychology, Deep Learning

1. Introduction

Social media networks have become a platform for people with different intellectual, ideological, and psychological characteristics to share their thoughts, opinions, and interests. Given the importance and comprehensiveness of the information published on these networks, automatic tools for processing it are urgently needed.
Furthermore, users often write informal text with the usual linguistic complexities such as slang, idioms, typos, and grammatical errors. Moreover, people often convey their meaning in complex ways. Using Figurative Language (FL) is one way to emphasize one's point of view. FL relies on devices such as sarcasm and irony, which are common in user-generated content on platforms including Facebook and Twitter. Despite some attempts to provide a good definition of irony, there is still no consensus in the research community, but almost all researchers agree on two ironic features. First, it is unclear exactly what the author means by his words; in fact, irony uses words contrary to their original meaning. Second, irony arises from opinions, emotions, feelings, and thoughts [1, 2, 3, 4]. Irony and sarcasm are two related concepts that are sometimes used interchangeably in the literature, but there are subtle differences between them [1, 5]. Sarcasm is challenging above all because it is rare, hard to detect, and ambiguous in meaning: a positive word used sarcastically can imply a negative meaning. For example, in "If voting by mail is good enough for Trump, it is good enough for me.", the polarity is negative despite the use of positive words, due to the ironic content.

* These authors contributed equally to this work
CLEF 2022 – Conference and Labs of the Evaluation Forum, September 5–8, 2022, Bologna, Italy
$ ehsan.tavan@partdp.ai (E. Tavan*); maryam.najafi@partdp.ai (M. Najafi*); rezymo@gmail.com (R. Moradi*)
0000-0003-1262-8172 (E. Tavan*); 0000-0002-0877-7063 (M. Najafi*); 0000-0003-0372-6993 (R. Moradi*)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
Recently, the research community has become more interested in identifying users who consistently publish certain types of textual content, such as Fake News (FN) and Hate Speech (HS). PAN introduced hate speech and fake news spreader profiling tasks in 2020 and 2021, demonstrating the importance of Author Profiling (AP) in social media networks [6, 7]. These tasks focused on identifying spreaders as an AP task, rather than classifying the published text, in order to prevent the publication of hateful content and fake news. Now, in the Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) task at PAN@2022 [8, 9], the goal is to identify spreaders of ironic content on Twitter, especially those prone to use irony to spread stereotypes against women or the LGBT community1. The ultimate goal is to identify and profile spreaders of ironic content in order to prevent the spread of this type of content. As far as we know, this task has not been addressed before. In the IROSTEREO task, the main challenge is to obtain a comprehensive representation of a user's tweets. User tweets contain meaningful information that indicates the user's personality type, opinions, and thoughts. Condensing this wide range of information into a user-level representation is challenging. Therefore, in this research, we propose a parallel framework for extracting a user representation based on contextual, psychometric, and ironic features. The proposed framework includes three modules: 1) a contextualized embedding module, 2) a psychometric embedding module, and 3) an irony embedding module. Sentence transformers (SBERT) are used in the contextualized embedding module to extract context features from the user's tweets. In the psychometric and irony embedding modules, we use deep learning architectures based on the Text-To-Text Transfer Transformer (T5) language model. To prepare a user-level representation with these modules, we first extract a representation for each of the user's tweets.
By averaging these tweet-level representation embeddings, the final user-level representation embedding is obtained. After extracting the user-level embedding vector, a Gradient Boosting classifier is used to classify users into ironic and non-ironic categories. By concatenating the contextualized, irony, and psychometric embedding vectors, the proposed framework achieved 88.89% accuracy on the official PAN test set. Our code is available on GitHub2 for researchers. The rest of the paper is organized as follows: in Section 2 we present related work; in Section 3 we describe the characteristics of the dataset; in Section 4 we present our proposed method; the experiments and results are presented in Section 5; and finally, our conclusions and future work are in Section 6.

1 https://pan.webis.de/clef22/pan22-web/author-profiling.html
2 https://github.com/MarSanTeam/Author_Profiling

2. Related Works

IROSTEREO is a challenging interdisciplinary NLP sub-task that sits between irony detection and AP. Therefore, both irony detection and AP are reviewed in the following sections.

2.1. Irony Detection

Due to the high semantic similarity between irony and sarcasm, as well as for the comprehensiveness of our survey, research on both tasks is considered. In [10], a hybrid method called CASCADE is proposed to use both context- and content-based information. It uses user embeddings and discourse embeddings to capture the stylometric and personality traits of users, as well as topical information. In addition, it uses content-based features with a CNN-based network to capture local patterns. Some research has shown the remarkable effect of sentiment analysis on irony detection. For instance, [2] used transfer learning from sentiment resources; they found that sentiment knowledge carries valuable information for irony detection, helping to resolve its implicit incongruity.
The authors of [11] used a multi-head attention bidirectional LSTM-based network to recognize sarcastic comments. Recently, attention has focused on using pre-trained language models. [12] proposed an end-to-end architecture that uses a pre-trained transformer, namely RCNN-RoBERTa. Moreover, in [13] a RoBERTa-based model is proposed to demonstrate the importance of using contextual information. [14] proposed a BERT-based model using a weighted loss technique to overcome class imbalance, and ensemble learning for better generalization. Additionally, [15] introduced a pre-trained transformer model and aspect-based sentiment analysis to identify whether response comments in a dialogue are sarcastic or not.

2.2. Author Profiling

Due to the significant increase in the research community's attention to AP, much research has been done in recent years. Some of these works use deep learning methods, while others investigate hybrid methods. In the following, these methods are described in more detail.

Deep-learning-based methods: In recent years, the two-step method has become the most popular in this field. These methods extract user profiles in two steps: in the first step a tweet-level representation is extracted, and in the second a user-level representation. [16] proposed such a two-step method: in the first step, they extract a contextualized vector for each tweet to obtain a tweet-level representation, and in the second, they aggregate these by averaging to obtain a user-level representation. For the tweet-level representation, they examined different variants of the BERT model and found that the best result was achieved by a BERT-base model fine-tuned on an external Twitter corpus for sentiment analysis [17]. Finally, they used the user-level representation to identify haters.
In [18], two methods were proposed for the Hate Speech (HS) spreader task at PAN 2021. In the first method, tweets were classified based on hate labels at the tweet level; then, users were classified based on the average label of their tweets. In the other method, they used the approach proposed in [16] with slight modifications. They found that the user-level method with the BERT model outperforms the other baselines. Likewise, [19] introduced a method similar to [18] that uses BERTweet instead of the BERT model. CNN-based models also appear in the literature: in [20], the authors proposed a CNN-based model for hate speech profiling in English and Spanish at PAN@2020, which won the competition. It was based on a 1D-convolutional layer together with local and global average pooling.

Hybrid methods: The main purpose of these methods is to use techniques from other NLP sub-tasks and sciences, such as cognitive science. In the study conducted by [21], a hybrid approach was proposed that combines a contextualized representation (CR) with character-based statistical embeddings, named Char-LDSE. For the CR, they used extractive summarization and the RoBERTa language model to extract the semantically important tweets; these representations were then used to build an ensemble. In [22], an interesting idea was proposed based on using conceptual and coarse-grained representations to handle the FN problem. They used a variant of ConceptNet [23], named ConceptNet Numberbatch, to build the conceptual representation; n-gram features with TF-IDF weighting were used for the coarse-grained representation. These two representations were combined and fed to a CNN-based network. Despite these methods' abilities and the promising results they have achieved, they are still far from competing with state-of-the-art methods.

3.
Dataset

The IROSTEREO task dataset consists of 420 anonymous authors writing in English. An XML file is provided for each author containing 200 tweets, and each author is labeled as either an ironic or a non-ironic spreader. The two categories are equally distributed, with 210 authors each. Additionally, it is important to note that all URLs, hashtags, and mentioned or retweeted users were masked with standard tokens.

4. Proposed Method

Our methodology for IROSTEREO uses the dataset provided at PAN@2022 and includes the following modules: a Contextualized Embedding (CE) module, a Psychometric Features Embedding (PFE) module, and an Irony Embedding (IRE) module. The proposed framework is shown in Figure 1. CE, PFE, and IRE were developed using deep learning methods and language models. Each module aims to build a user-level representation from its own perspective, derived by averaging tweet-level representations. An overall user-level representation is obtained by concatenating the representations from CE, PFE, and IRE, and a Gradient Boosting classifier is applied to it to separate ironic from non-ironic users. Details of each component are given below.

Figure 1: The architecture of the proposed method (the contextualized, psychometric (personality and emotion), and irony representations are concatenated into a user representation and passed to the classifier, which outputs "Irony" or "Not Irony").

4.1. Contextualized Embedding Module

Creating a contextualized representation for each tweet is the main goal of this module. Contextual information is critical here because irony detection is largely context-dependent. Transformer-based language models excel at capturing contextualized embeddings and have performed very well in various NLP tasks.
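One common way to turn a transformer's per-token embeddings into a single vector per tweet is mean pooling over the non-padding positions. The toy sketch below is our illustration of that operation only; the embedding dimension and values are placeholders, not taken from the paper.

```python
import numpy as np

def mean_pooled_embedding(token_embeddings, attention_mask):
    """Mean pooling: average the token embeddings of a tweet,
    ignoring padding positions (mask == 0)."""
    mask = np.asarray(attention_mask, dtype=float)[:, None]  # (tokens, 1)
    emb = np.asarray(token_embeddings, dtype=float)
    return (emb * mask).sum(axis=0) / mask.sum()

# One tweet with 4 token positions (the last is padding) in a toy 2-d space.
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]])
sentence_emb = mean_pooled_embedding(tokens, [1, 1, 1, 0])  # -> [3.0, 4.0]
```

The padding mask matters: without it, the pad token's embedding would contaminate the tweet representation.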
Thus, Sentence-BERT (SBERT) [24], a variant of BERT developed to determine sentence semantic similarity, was used in our work. SBERT computes semantically meaningful sentence embeddings using a Siamese architecture; we derive sentence embeddings from word embeddings by performing mean pooling.

4.2. Psychometric Features Embedding Module

According to language psychology, the words that people use in everyday life are influenced by their psychological processes, including thoughts, feelings, emotional states, and personality [25]. Understanding an author's psychological state can be enhanced by identifying personality and emotional features in the text, and [26] shows that identifying these states improves the accuracy of irony recognition models. Therefore, to identify the author's psychological state, we considered representations of emotion and personality.

4.2.1. Personality Embedding Module

The personality embedding module is trained to recognize the author's personality type on the Myers-Briggs Type Indicator (MBTI)3 dataset. In the MBTI dataset, texts are labeled with 16 distinct personality types along 4 axes. The dataset contains 8675 samples, each consisting of an author's MBTI type and their last 50 posts. We prepare a personality representation for each tweet using a deep learning model based on the Text-to-Text Transfer Transformer (T5) language model [27], which has shown high performance in various NLP applications. The proposed architecture for this module is shown in Figure 2.

Figure 2: The architecture of the proposed model for the Personality Embedding, Emotion Embedding, and Irony Embedding modules (tweet tokens T1…Tn pass through the T5 encoder, a max pooling layer, a fully connected layer, and a softmax layer). The input of this model is a single tweet.
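The architecture in Figure 2 (T5 encoder, max pooling, fully connected layer, softmax) can be sketched in PyTorch roughly as follows. This is our reconstruction under stated assumptions: the encoder below is a tiny randomly initialized T5 rather than the fine-tuned pre-trained model the paper uses, and the dimensions are placeholders.

```python
import torch
import torch.nn as nn
from transformers import T5Config, T5EncoderModel

class TweetEmbeddingModel(nn.Module):
    """T5 encoder -> max pooling -> fully connected -> softmax (cf. Figure 2).
    The fully connected layer's output is reused as the tweet-level embedding."""

    def __init__(self, encoder, embed_dim=16, num_labels=16):
        super().__init__()
        self.encoder = encoder
        self.fc = nn.Linear(encoder.config.d_model, embed_dim)
        self.out = nn.Linear(embed_dim, num_labels)

    def forward(self, input_ids):
        hidden = self.encoder(input_ids=input_ids).last_hidden_state
        pooled = hidden.max(dim=1).values      # max pooling over token positions
        embedding = self.fc(pooled)            # tweet-level representation
        return torch.softmax(self.out(embedding), dim=-1), embedding

# Tiny randomly initialised encoder for illustration only; the paper fine-tunes
# a pre-trained T5, and these dimensions are placeholders, not its settings.
cfg = T5Config(vocab_size=100, d_model=32, d_ff=64, num_layers=2, num_heads=2)
model = TweetEmbeddingModel(T5EncoderModel(cfg), embed_dim=16, num_labels=16)
probs, emb = model(torch.randint(0, 100, (1, 12)))  # one tokenized tweet
```

The same head is reused for the personality, emotion, and irony modules, with only `num_labels` and the training data changing.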
To train the personality embedding module, we use the last hidden states of the T5 encoder to obtain word representation vectors. The most salient features are then extracted using max pooling and a fully connected layer, and a softmax layer predicts the output label. After training this module, the personality representation for each tweet is taken from the output of the fully connected layer; the average of all tweet-level representations for a user is then used as that user's personality embedding.

4.2.2. Emotion Embedding Module

Building user-level emotion embeddings is the aim of this module. Studies have shown the strong influence of emotional features on irony detection models [28, 29, 30], and the use of emotion features has improved accuracy on the irony detection task in many studies. Thus, we used the method described in Section 4.2.1 with the CrowdFlower4 dataset to build user-level emotion representations. CrowdFlower is an emotion detection dataset from Kaggle that was used to train this module; it has 13 different emotion labels and 4000 samples.

3 https://www.kaggle.com/datasets/datasnaek/mbti-type
4 https://www.kaggle.com/datasets/pashupatigupta/emotion-detection-from-text

4.3. Irony Embedding Module

As described in Section 3, the IROSTEREO dataset is not labeled at the tweet level. However, under the hypothesis that irony-spreading users publish more ironic content, we examined the impact of ironic features on IROSTEREO performance. To this end, we used a model with the same architecture as shown in Figure 2 at the tweet level and trained it on the WLV5 irony detection dataset [31]. Following the method described in Section 4.2.1, we calculated the user-level irony representation by averaging the tweet-level irony representations.

4.4. User Embedding Module

To create the final user-level embedding, we concatenate the contextualized, personality, emotion, and irony user-level embeddings.
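The two-step construction (average each module's tweet-level embeddings into a user vector, then concatenate the modules' user vectors) can be sketched as follows. The per-module dimensions here are illustrative placeholders, not the paper's actual sizes.

```python
import numpy as np

def user_embedding(per_module_tweet_embeddings):
    """Average each module's tweet-level embeddings (rows = tweets) into a
    user-level vector, then concatenate the module vectors."""
    return np.concatenate(
        [np.asarray(m).mean(axis=0) for m in per_module_tweet_embeddings])

# 200 tweets per user; illustrative dimensions for the contextualized,
# personality, emotion, and irony modules (the paper's final vector is 4096-d).
rng = np.random.default_rng(0)
modules = [rng.normal(size=(200, d)) for d in (768, 128, 128, 128)]
vec = user_embedding(modules)  # shape: (1152,)
```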
Eventually, the final user-level embedding has 4096 dimensions and, in addition to context-based and irony-based features, includes the emotion-based and personality-based features that we call psychometric features.

4.5. Classifier

This study uses the Gradient Boosting classifier to identify ironic-content spreaders on Twitter using the introduced features. The idea behind gradient boosting is to start from a weak hypothesis or weak learner and iteratively add corrections that strengthen it. We used the log loss function [32] as the loss function and the Friedman MSE [33] as the split criterion in the classifier.

5. Experiments and Results

In this section, we review the results of various experiments. We use two metrics, accuracy and the confidence interval (CI) at 95% confidence, to evaluate the different classifiers and features. The main evaluation metric in the IROSTEREO task is accuracy, and systems are ranked by it. The CI is an interval estimate of a parameter, here computed for accuracy: a range of values into which the estimate is expected to fall if the experiment were repeated. In our experiments, we evaluate the performance of our proposed framework using context-based, personality-based, emotion-based, and irony-based features, as well as combinations of them, on the TIRA platform [34]. Two classifiers, a Support Vector Machine (SVM) and Gradient Boosting, are used and their performance is compared. Table 1 shows the results of the experiments using 5-fold and 10-fold cross-validation. Based on these results, context-based, personality-based, emotion-based, and irony-based features are all suitable for the IROSTEREO task.
Using the Gradient Boosting classifier, which in our experiments performs much better than the SVM, the model's 5-fold cross-validation accuracy with the personality-based, irony-based, emotion-based, and context-based features is 91.43%, 91.43%, 88.1%, and 92.14%, respectively. The accuracy with the context-based features extracted by SBERT is better than with the other features. This may be due to SBERT's training on a large amount of data and its power to extract context-based features. The performance of the model with the psychometric-based and irony-based features is also very acceptable, given the small amount of data available to train the models that extract these features. Different experiments were performed to find the best-performing concatenation of the four introduced features; Table 1 shows the results.

5 https://github.com/omidrohanian/irony_detection

Table 1: Results of experiments with the four introduced features using the Gradient Boosting and SVM (SVC) classifiers, with 5-fold and 10-fold cross-validation. ACC = accuracy (%), CI = 95% confidence interval (%).

                                          Gradient Boosting              SVM (SVC)
                                          5-fold        10-fold         5-fold        10-fold
Feature                                   ACC    CI     ACC    CI       ACC    CI     ACC    CI
Personality                               91.43  5.89   91.19  8.35     61.19  10.42  62.14  14.61
Irony                                     91.43  5.89   90.71  8.49     61.19  10.42  62.14  14.61
Emotion                                   88.1   6.88   88.1   9.43     76.19  9      76.67  12.6
Context                                   92.14  5.62   90.48  8.44     92.86  5.35   93.33  6.8
Context ⊕ Personality                     92.38  5.60   93.10  7.31     92.38  10.42  92.41  14.61
Context ⊕ Irony                           92.62  5.55   92.86  7.45     92.36  5.52   92.28  7.28
Context ⊕ Emotion                         94.14  4.89   93.57  7.04     61.19  5.49   62.14  7.31
Personality ⊕ Irony                       91.67  5.84   91.43  8.21     60.22  5.53   62.43  7.33
Personality ⊕ Emotion                     91.67  5.82   91.90  8.03     76.67  8.98   77.62  12.16
Emotion ⊕ Irony                           91.67  5.82   92.14  7.95     76.67  8.98   77.86  9.53
Context ⊕ Irony ⊕ Personality             92.38  5.62   92.38  7.74     92.14  5.46   92.14  7.78
Context ⊕ Irony ⊕ Emotion                 94.76  4.59   93.57  7.11     90.95  6.06   90.71  8.6
Context ⊕ Personality ⊕ Emotion           94.52  4.67   93.33  7.25     90.95  6.06   90.71  8.6
Personality ⊕ Irony ⊕ Emotion             91.9   5.7    92.14  7.95     76.43  9.03   77.14  12.3
Context ⊕ Irony ⊕ Emotion ⊕ Personality   95.00  4.5    93.81  6.93     90.95  6.06   91.19  8.35

Because these four features help the model capture different aspects of a user's tweets, concatenating them is expected to improve the model's performance in recognizing ironic authors. We therefore concatenated the features in different combinations. Among pairwise concatenations, the highest accuracy comes from combining the context-based and emotion-based features: with the Gradient Boosting classifier in 5-fold cross-validation, this combination reaches 94.14%, higher than any other pair. Although the emotion-based feature has lower accuracy than the other features when used alone, concatenating it with the context-based feature greatly improves the model, indicating that this combination can be used to improve performance. Next, we concatenated the features in triples. Based on Table 1, adding the irony-based and personality-based features to the context-based and emotion-based combination helped increase the 5-fold cross-validation accuracy: with the concatenation of context-based, irony-based, and emotion-based features, the model reaches 94.76%. Finally, we concatenated all four introduced features, and with Gradient Boosting in 5-fold cross-validation the accuracy is 95.00%. Based on these results, we conclude that the concatenation of context-based, irony-based, and psychometric features can be very helpful for identifying ironic authors. One of the metrics examined in all these experiments is the CI.
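One common way to report such an interval from cross-validation fold scores is a normal approximation around the mean fold accuracy. The paper does not state the exact formula behind its reported CI values, so the sketch below, with invented fold scores, is only illustrative of the general construction.

```python
import math
import statistics

def cv_confidence_interval(fold_accuracies, z=1.96):
    """Mean fold accuracy and a normal-approximation half-width at 95%
    confidence (z = 1.96). A common construction; the paper does not
    specify the exact formula behind its reported CI values."""
    mean = statistics.mean(fold_accuracies)
    half = z * statistics.stdev(fold_accuracies) / math.sqrt(len(fold_accuracies))
    return mean, half

# Hypothetical 5-fold accuracies, for illustration only.
mean_acc, half_width = cv_confidence_interval([0.94, 0.96, 0.95, 0.93, 0.97])
```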
This metric indicates the range within which the model's accuracy is expected to lie if the experiment were performed on other data. For the concatenation of all four introduced features in 5-fold cross-validation, this value is 4.5%, so we can claim that the true classification accuracy of the model is likely between 90.5% and 99.5%. In fact, if we repeated this experiment several times with different data each time, we would find that for approximately 95% of the experiments (we use this confidence level throughout), the calculated interval would contain the true accuracy.

6. Conclusions and Future Works

In this paper, we proposed a framework for the Profiling Irony and Stereotype Spreaders on Twitter task at PAN 2022. As part of our approach, in the first step we extracted features at the tweet level, and in the second step we built a user-level representation of each user. The novelty of this work lies in emphasizing psychometric, ironic, and context-based features. We demonstrated how context-based, irony-related, and psychometric features affect system performance for distinguishing ironic from non-ironic Twitter authors. Finally, we achieved an average accuracy of 93.81% in 10-fold cross-validation on the training set and 88.89% on the official test set published by the organizers. In future work, we would like to examine different language models and other architectures within the proposed framework to achieve its best performance. We also plan to investigate additional features, such as sentiment, to enrich the current user-level representation.

References

[1] R. Ortega-Bueno, F. Rangel, D. Hernández Farías, P. Rosso, M. Montes-y Gómez, J. E.
Medina Pagola, Overview of the task on irony detection in Spanish variants, in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with the 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019), CEUR-WS.org, volume 2421, 2019, pp. 229–256. [2] S. Zhang, X. Zhang, J. Chan, P. Rosso, Irony detection via sentiment-based transfer learning, Information Processing & Management 56 (2019) 1633–1644. [3] K. Buschmeier, P. Cimiano, R. Klinger, An impact analysis of features in a classification approach to irony detection in product reviews, in: Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2014, pp. 42–49. [4] C. Van Hee, Can machines sense irony?: exploring automatic irony detection on social media, Ph.D. thesis, Ghent University, 2017. [5] E. Filatova, Irony and sarcasm: Corpus generation and analysis using crowdsourcing, in: LREC, 2012, pp. 392–398. [6] F. Rangel, G. Sarracén, B. Chulvi, E. Fersini, P. Rosso, Profiling hate speech spreaders on Twitter task at PAN 2021, in: CLEF, 2021. [7] F. Rangel, A. Giachanou, B. H. H. Ghanem, P. Rosso, Overview of the 8th author profiling task at PAN 2020: Profiling fake news spreaders on Twitter, in: CEUR Workshop Proceedings, volume 2696, Sun SITE Central Europe, 2020, pp. 1–18. [8] J. Bevendorff, B. Chulvi, E. Fersini, A. Heini, M. Kestemont, K. Kredens, M. Mayerl, R. Ortega-Bueno, P. Pezik, M. Potthast, F. Rangel, P. Rosso, E. Stamatatos, B. Stein, M. Wiegmann, M. Wolska, E. Zangerle, Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, and Style Change Detection, in: A. Barrón-Cedeño, G. Da San Martino, et al. (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction.
Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022), volume 13390 of Lecture Notes in Computer Science, Springer, 2022. [9] R. Ortega-Bueno, B. Chulvi, F. Rangel, P. Rosso, E. Fersini, Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) at PAN 2022, in: CLEF 2022 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2022. [10] D. Hazarika, S. Poria, S. Gorantla, E. Cambria, R. Zimmermann, R. Mihalcea, CASCADE: Contextual sarcasm detection in online discussion forums, arXiv preprint arXiv:1805.06413 (2018). [11] A. Kumar, V. T. Narapareddy, V. A. Srikanth, A. Malapati, L. B. M. Neti, Sarcasm detection using multi-head attention based bidirectional LSTM, IEEE Access 8 (2020) 6388–6397. [12] R. A. Potamias, G. Siolas, A.-G. Stafylopatis, A transformer-based approach to irony and sarcasm detection, Neural Computing and Applications 32 (2020) 17309–17320. [13] T. Dadu, K. Pant, Sarcasm detection using context separators in online discourse, in: Proceedings of the Second Workshop on Figurative Language Processing, 2020, pp. 51–55. [14] S. Jiang, C. Chen, N. Lin, Z. Chen, J. Chen, Irony detection in the Portuguese language using BERT, in: CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073), 2021. [15] S. Javdan, B. Minaei-Bidgoli, et al., Applying transformers and aspect-based sentiment analysis approaches on sarcasm detection, in: Proceedings of the Second Workshop on Figurative Language Processing, 2020, pp. 67–71. [16] E. Finogeev, M. Kaprielova, A. Chashchin, K. Grashchenkov, G. Gorbachev, O. Bakhteev, Hate speech spreader detection using contextualized word embeddings, in: CLEF, 2021. [17] A. Go, R. Bhayani, L. Huang, Twitter sentiment classification using distant supervision, CS224N project report, Stanford, 2009. [18] T. Anwar, Identify hate speech spreaders on Twitter using transformer embeddings features and AutoML classifiers, 2021. [19] R. L. Tamayo, D. C. Castro, R. O.
Bueno, Deep modeling of latent representations for Twitter profiles on hate speech spreaders identification task, 2021. [20] M. Siino, E. Di Nuovo, I. Tinnirello, M. La Cascia, Detection of hate speech spreaders using convolutional neural networks, in: CLEF, 2021. [21] H. B. Giglou, T. Rahgooy, J. Razmara, M. Rahgouy, Z. Rahgooy, Profiling haters on Twitter using statistical and contextualized embeddings, in: CLEF, 2021. [22] H. B. Giglou, J. Razmara, M. Rahgouy, M. Sanaei, Lsaconet: A combination of lexical and conceptual features for analysis of fake news spreaders on Twitter, in: CLEF (Working Notes), 2020. [23] R. Speer, J. Chin, C. Havasi, ConceptNet 5.5: An open multilingual graph of general knowledge, in: Thirty-First AAAI Conference on Artificial Intelligence, 2017. [24] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks, arXiv preprint arXiv:1908.10084 (2019). [25] S. Argamon, S. Dhawle, M. Koppel, J. W. Pennebaker, Lexical predictors of personality type, in: Proceedings of the 2005 Joint Annual Meeting of the Interface and the Classification Society of North America, 2005, pp. 1–16. [26] S. Poria, E. Cambria, D. Hazarika, P. Vij, A deeper look into sarcastic tweets using deep convolutional neural networks, arXiv preprint arXiv:1610.08815 (2016). [27] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, arXiv preprint arXiv:1910.10683 (2019). [28] S. Frenda, V. Patti, Computational models for irony detection in three Spanish variants, in: 2019 Iberian Languages Evaluation Forum, IberLEF 2019, volume 2421, CEUR-WS, 2019, pp. 297–309. [29] D. I. H. Farías, V. Patti, P. Rosso, Irony detection in Twitter: The role of affective content, ACM Transactions on Internet Technology (TOIT) 16 (2016) 1–24. [30] A. Reyes, P. Rosso, D.
Buscaldi, From humor recognition to irony detection: The figurative language of social media, Data & Knowledge Engineering 74 (2012) 1–12. [31] O. Rohanian, S. Taslimipoor, R. Evans, R. Mitkov, WLV at SemEval-2018 Task 3: Dissecting tweets in search of irony, Association for Computational Linguistics, 2018. [32] V. Vovk, The fundamental nature of the log loss function, in: Fields of Logic and Computation II, Springer, 2015, pp. 307–318. [33] J. H. Friedman, Greedy function approximation: a gradient boosting machine, Annals of Statistics (2001) 1189–1232. [34] M. Potthast, T. Gollub, M. Wiegmann, B. Stein, TIRA Integrated Research Architecture, in: N. Ferro, C. Peters (Eds.), Information Retrieval Evaluation in a Changing World, The Information Retrieval Series, Springer, Berlin Heidelberg New York, 2019. doi:10.1007/978-3-030-22948-1_5.