Profiling Irony and Stereotype Spreaders on Twitter with BERT

Yifan Xu1, Hui Ning1,*
1 Harbin Engineering University, Harbin, China

Abstract
This paper summarises our participation in the "Profiling Irony and Stereotype Spreaders on Twitter" shared task at PAN at CLEF 2022 and proposes a method that detects irony and stereotype spreaders automatically. We detect whether a user is an irony and stereotype spreader rather than classifying individual posts. We use BERT embeddings together with AutoGluon, which automates classic machine learning methods, to train a classifier, and we uploaded the prediction results to the TIRA[1] platform. Using our method, an accuracy of 94.3 % is achieved on the English training set; on the English test set, our system achieved an accuracy of 94.4 %.

Keywords
Irony and stereotype, Twitter, AutoGluon, BERT

1. Introduction
With the development of the Internet, social media has become an important medium through which people communicate, and information spreads widely and quickly on it. However, with the popularity of social media, some problems have gradually emerged; irony and stereotype spreading on Twitter is one of them. With irony, language is employed in a figurative and subtle way to mean the opposite of what is literally stated. In the case of sarcasm, a more aggressive type of irony, the intent is to mock or scorn a victim, without excluding the possibility of causing hurt. Stereotypes are often used, especially in discussions about controversial issues such as immigration, sexism, and misogyny. In this paper, we perform irony and stereotype spreader identification on Twitter data[2] provided by the organizers of PAN'22. At PAN'22[3], we focus on profiling ironic authors on Twitter, with special emphasis on those authors who employ irony to spread stereotypes, for instance towards women or the LGBT community.
The goal is to classify authors as ironic or not depending on the number of their tweets with ironic content. In Section 2 we present related work on profiling irony and stereotype spreaders. In Section 3 we describe the proposed method. In Section 4 we present the experimental results. Finally, in Section 5, we present the conclusions and future work.

∗ Corresponding author
CLEF 2022: Conference and Labs of the Evaluation Forum, September 5–8, 2022, Bologna, Italy
$ xvyifan@hrbeu.edu.cn (Y. Xu); ninghui@hrbeu.edu.cn (H. Ning)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073

2. Related Work
There is little research on profiling irony and stereotype spreaders, but there is a large body of research on hate speech detection, and the two tasks have much in common: although many ironic sentences contain no rude words, they can hurt people deeply, and irony is another way to attack others. Hate speech detection has been a popular research topic in recent years and can be treated as a text classification task. Researchers usually extract different types of features and exploit them in combination with machine learning techniques. Various kinds of classifiers have been used for this: Naive Bayes in combination with a Bag-of-Words approach[4]; Support Vector Machines, again applied to Bag-of-Words features[5]; and Logistic Regression trained, for instance, on N-grams[6]. Besides these methods, Deep Learning techniques have also been applied to this problem; many studies use models such as Recurrent Neural Networks (RNN)[7] and Convolutional Neural Networks (CNN)[8]. These days, the transformer[9] has become the most popular Deep Learning technique in Natural Language Processing (NLP)[10] tasks.
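To make the classical pipeline cited above concrete, the following is a toy, self-contained sketch of Bag-of-Words features with a multinomial Naive Bayes classifier. It is an illustration only, not the method of any cited work; the function names and the tiny training data are ours.

```python
# Toy Bag-of-Words + multinomial Naive Bayes (Laplace smoothing).
# Illustrative only; not the implementation used in the cited papers.
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Return (log priors, per-class log likelihoods, vocabulary)."""
    vocab = {w for d in docs for w in d.split()}
    counts = defaultdict(Counter)               # class -> word counts
    for d, y in zip(docs, labels):
        counts[y].update(d.split())
    prior = Counter(labels)
    log_prior = {y: math.log(n / len(labels)) for y, n in prior.items()}
    log_like = {}
    for y, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        log_like[y] = {w: math.log((c[w] + alpha) / total) for w in vocab}
    return log_prior, log_like, vocab

def predict_nb(model, doc):
    """Pick the class maximizing log prior + summed word log likelihoods."""
    log_prior, log_like, vocab = model
    scores = {}
    for y in log_prior:
        s = log_prior[y]
        for w in doc.split():
            if w in vocab:                      # ignore out-of-vocabulary words
                s += log_like[y][w]
        scores[y] = s
    return max(scores, key=scores.get)
```

With four toy training documents labeled "I"/"NI", `predict_nb` assigns an unseen document to the class whose word statistics it matches best; real systems replace the whitespace tokenizer and toy corpus with proper feature extraction.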
Using transformers such as BERT[11], researchers have achieved good results in classification tasks such as hate speech profiling and fake news detection. In the PAN 2021[12] shared task Profiling Hate Speech Spreaders on Twitter[13], a variety of methods were used for classification, preprocessing, and feature selection, such as SVM[14], LSTM[15], Naive Bayes, BERT, and RoBERTa[16]. Drawing on this research on hate speech detection, we apply similar methods in our experiments on profiling irony and stereotype spreaders.

3. Method
This section introduces the datasets, the data preprocessing, and the system for identifying irony and stereotype spreaders.

3.1. Datasets
The datasets for this task are given in English. There are 420 authors in the train set and 180 authors in the test set. Half of the authors in each set are labeled "I", indicating that the author spreads irony and stereotypes, while the other half are labeled "NI", indicating that the author does not. The data are provided by the organizers of PAN'22 and were collected from Twitter. Each author has 200 unique tweets, for a total of 120,000 tweets across both sets.

3.2. Preprocessing
Data preprocessing is used to remove noise. Links, user mentions, hashtags, and retweet markers were removed, as were all punctuation and numbers. After that, we converted the tweets to lower case. Stop-words were retained. Each tweet inherits the label of its author.

Table 1
Data after Preprocessing
Text                                          Label
billion tshirt ngotta be some in the...       I
The simple answer is usd just like...         I
I honestly boggle at the very existence...    I
Why would it ath means nothing...             I

3.3. Training
We did not fine-tune the BERT model on the data; instead, we used the embeddings produced by the pre-trained BERT model.
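The preprocessing of Section 3.2 and the feature-extraction and training steps of Section 3.3 can be sketched as follows. This is a minimal sketch, not the authors' released code: we assume the Hugging Face `transformers` and `autogluon.tabular` packages, and we assume [CLS]-token pooling for each tweet embedding, which the paper does not specify.

```python
# Sketch of the Section 3.2/3.3 pipeline. Assumptions (not stated in the
# paper): Hugging Face `transformers` for BERT, `autogluon.tabular` for
# AutoGluon, [CLS]-token pooling per tweet.
import re

def clean_tweet(text: str) -> str:
    """Remove retweet markers, links, mentions, hashtags, punctuation
    and numbers; lowercase; keep stop-words (Section 3.2)."""
    text = re.sub(r"\bRT\b", " ", text)         # retweet markers
    text = re.sub(r"https?://\S+", " ", text)   # links
    text = re.sub(r"[@#]\w+", " ", text)        # user mentions, hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # punctuation, numbers
    return re.sub(r"\s+", " ", text).strip().lower()

def author_vector(tweets, model_name="bert-base-uncased", last_k=1):
    """Average per-tweet BERT embeddings (mean of the last `last_k`
    hidden layers) into one author feature vector."""
    import torch
    from transformers import AutoModel, AutoTokenizer
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()
    vecs = []
    with torch.no_grad():
        for t in tweets:
            enc = tok(clean_tweet(t), return_tensors="pt",
                      truncation=True, max_length=128)
            hidden = model(**enc).hidden_states       # tuple of layer outputs
            layers = torch.stack(hidden[-last_k:])    # last k hidden layers
            vecs.append(layers.mean(dim=0)[0, 0])     # avg layers, take [CLS]
    return torch.stack(vecs).mean(dim=0).numpy()      # tweet avg -> author

def fit_autogluon(train_df):
    """Fit AutoGluon on author vectors; the `label` column holds I / NI.
    `num_bag_folds=5` approximates the paper's 5-fold cross-validation."""
    from autogluon.tabular import TabularPredictor
    return TabularPredictor(label="label").fit(train_df, num_bag_folds=5)
```

Setting `last_k=1` corresponds to the last-hidden-layer configuration and `last_k=4` to the last-four-layers configuration compared in Section 4.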
We used the embeddings extracted from the last hidden layer and from the last four hidden layers as features, so that we could determine which choice achieves the better result. After extracting tweet features, we average them to obtain author features. We then feed the author features to AutoGluon[17], which automates machine learning tasks and can easily achieve strong predictive performance in applications; using it, we can find the best model for the classification. For training, we used 5-fold cross-validation.

Figure 1: Irony and stereotype classification

4. Results
Following the requirements of PAN'22, we use accuracy as the evaluation metric. Table 2 shows the results. On the training set, the accuracy is 94.3 % using embeddings from the last hidden layer of BERT, and 94.0 % using the last four hidden layers. On the test set, the corresponding results are 93.3 % and 94.4 %.

Table 2
Results
Embeddings       Accuracy (Train Set)    Accuracy (Test Set)
The last         94.3 %                  93.3 %
The last four    94.0 %                  94.4 %

5. Conclusion
The emergence of social networking sites has epoch-making significance for the whole Internet industry, because it brings the real world into the virtual network world, profoundly changes the way people interact, and greatly improves the efficiency of spreading information. However, social media also triggers adverse issues such as irony and stereotype spreading. In this paper, we propose a method that detects irony and stereotype spreaders automatically. Instead of analyzing individual posts, the aim is to detect users who tend to publish posts that fall into the category of "irony and stereotype". We used BERT embeddings and AutoGluon, which automates classic machine learning methods, to classify irony and stereotype spreaders on Twitter. The best accuracy we achieved is 94.3 % on the training set and 94.4 % on the test set.

References
[1] M. Potthast, T. Gollub, M. Wiegmann, B.
Stein, TIRA Integrated Research Architecture, in: N. Ferro, C. Peters (Eds.), Information Retrieval Evaluation in a Changing World, The Information Retrieval Series, Springer, Berlin Heidelberg New York, 2019. doi:10.1007/978-3-030-22948-1_5.
[2] R. Ortega-Bueno, B. Chulvi, F. Rangel, P. Rosso, E. Fersini, Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) at PAN 2022, in: CLEF 2022 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2022.
[3] J. Bevendorff, B. Chulvi, E. Fersini, A. Heini, M. Kestemont, K. Kredens, M. Mayerl, R. Ortega-Bueno, P. Pezik, M. Potthast, F. Rangel, P. Rosso, E. Stamatatos, B. Stein, M. Wiegmann, M. Wolska, E. Zangerle, Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, and Style Change Detection, in: A. Barrón-Cedeño, G. Da San Martino, M. Degli Esposti, F. Sebastiani, C. Macdonald, G. Pasi, A. Hanbury, M. Potthast, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022), volume 13390 of Lecture Notes in Computer Science, Springer, 2022.
[4] I. Kwok, Y. Wang, Locate the hate: Detecting tweets against blacks, in: National Conference on Artificial Intelligence, 2013.
[5] E. Greevy, A. F. Smeaton, Classifying racist texts using a support vector machine, ACM (2004).
[6] Z. Waseem, D. Hovy, Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter, SRW@HLT-NAACL (2016) 88–93.
[7] F. Del Vigna, A. Cimino, F. Dell'Orletta, M. Petrocchi, M. Tesconi, Hate me, hate me not: Hate speech detection on Facebook, ITASEC (2017) 86–95.
[8] P. Badjatiya, S. Gupta, M. Gupta, V. Varma, Deep learning for hate speech detection in tweets, WWW (Companion Volume) (2017).
[9] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (NIPS 2017) (2017) 5998–6008.
[10] C. D. Manning, H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
[11] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, North American Chapter of the Association for Computational Linguistics (2019).
[12] J. Bevendorff, B. Chulvi, G. L. De la Peña Sarracén, M. Kestemont, E. Manjavacas, I. Markov, M. Mayerl, M. Potthast, F. Rangel, P. Rosso, E. Stamatatos, B. Stein, M. Wiegmann, M. Wolska, E. Zangerle, Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection, CLEF (2021) 419–431.
[13] F. Rangel, G. L. De la Peña Sarracén, B. Chulvi, E. Fersini, P. Rosso, Profiling Hate Speech Spreaders on Twitter Task at PAN 2021, CLEF (2021) 1772–1789.
[14] I. Vogel, M. Meghana, Profiling Hate Speech Spreaders on Twitter: SVM vs. Bi-LSTM—Notebook for PAN at CLEF 2021, in: G. Faggioli, N. Ferro, A. Joly, M. Maistro, F. Piroi (Eds.), CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-2936/paper-196.pdf.
[15] M. Uzan, Y. HaCohen-Kerner, Detecting Hate Speech Spreaders on Twitter using LSTM and BERT in English and Spanish—Notebook for PAN at CLEF 2021, in: G. Faggioli, N. Ferro, A. Joly, M. Maistro, F. Piroi (Eds.), CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-2936/paper-194.pdf.
[16] T. Anwar, Identify Hate Speech Spreaders on Twitter using Transformer Embeddings Features and AutoML Classifiers—Notebook for PAN at CLEF 2021, in: G. Faggioli, N. Ferro, A. Joly, M. Maistro, F. Piroi (Eds.), CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-2936/paper-153.pdf.
[17] N. Erickson, J. Mueller, A. Shirkov, H. Zhang, P. Larroy, M. Li, A.
Smola, AutoGluon-Tabular: Robust and accurate AutoML for structured data, arXiv preprint arXiv:2003.06505 (2020).