The role of sarcasm in hate speech. A multilingual perspective
La función del sarcasmo en los discursos de odio. Una perspectiva multilingüe

Simona Frenda (1, 2)
(1) PRHLT Research Center, Universitat Politècnica de València, Spain
(2) Dipartimento di Informatica, Università degli Studi di Torino, Italy
simona.frenda@unito.it

Abstract: Detecting aggressiveness in social media is important because negative behavior online provokes real violence. For this reason, hate speech online is a real problem in modern society, and controlling user-generated content has become a priority for governments, social media platforms and Internet companies. Current methodologies are far from solving this problem. Indeed, several aggressive comments are disguised as sarcastic. From this perspective, this research proposal investigates the role played by creative linguistic devices, especially sarcasm, in hate speech in a multilingual context.

Keywords: Hate speech, social media, aggressiveness, misogyny, sarcasm
Lloret, E.; Saquete, E.; Martínez-Barco, P.; Moreno, I. (eds.) Proceedings of the Doctoral Symposium of the XXXIV International Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), p. 13–17. Sevilla, Spain, September 19th 2018. Copyright © 2018 by the paper's authors. Copying permitted for private and academic purposes.

1 Introduction

The web amplifies the resonance of hate speech, inciting racism, misogyny or xenophobia in the real world as well. Indeed, misbehaviour online commonly translates into physical attacks, such as rape or bullying. For instance, Fulper et al. (2014) demonstrated a correlation between the number of rapes and the number of misogynistic tweets per state in the USA, suggesting that social media can be used as a social sensor of violence.

In addition, the persistence and diffusion of misogynistic or offensive content can hurt and psychologically distress the victims, sometimes leading to suicide, as in the case of the teenager Amanda Todd in 2012 (https://www.theguardian.com/commentisfree/2012/oct/26/amanda-todd-suicide-social-media-sexualisation). To counter these hate events at their origin and to monitor the uncontrolled flow of user texts, several initiatives have been taken in recent years. One example is the No Hate Speech Movement campaign (https://www.coe.int/en/web/no-hate-campaign) of the Council of Europe for human rights online.

The growing interest of the NLP (Natural Language Processing) research community is shown by national and international workshops, such as ALW 2018 (https://sites.google.com/view/alw2018), and by evaluation campaigns that foster research on this issue in various languages, such as Evalita 2018 (http://www.evalita.it/2018), IberEval 2018 (https://sites.google.com/view/ibereval-2018) and SemEval 2019 (http://alt.qcri.org/semeval2019/index.php?id=tasks). These initiatives make it possible to share information and results on the different topics related to hate speech online. In addition, the organizers of these competitions provide resources, such as annotated datasets, that are very costly to obtain.

The fact that most data are collected from Twitter or Facebook supports the analysis of computer-mediated communication. Moreover, the short-text setting encourages the creativity of authors, who use figurative devices to express their opinions. One of the figures of speech most often used to express negative opinions is sarcasm. In fact, it is used to disguise and, at the same time, to reinforce negative thinking, as in:

i) Un pensiero di ringraziamento ogni mattina va sempre ai comunisti che ce li hanno portati fino a casa musulmani rom e delinquenti grazie.
   (English: Every morning, I would like to thank the communists who brought Muslims, Roma and delinquents to our homes, thanks. Tweet from the IronITA corpus.)

The ironic sharpness of sarcasm seems well suited to expressing contempt and to offending individuals subtly. To study this correlation between sarcasm and hate speech, we proposed the shared task IronITA (http://di.unito.it/ironita18) at Evalita 2018, which asks participants to recognize ironic and sarcastic tweets in a dataset that also contains offensive messages addressed especially to immigrants (Sanguinetti et al., 2018).

Moreover, we participated in two tasks on hate speech proposed at IberEval 2018: aggressiveness detection in Mexican Spanish tweets (MEX-A3T, https://mexa3t.wixsite.com/home/aggressive-detection-track), organized by Álvarez-Carmona et al. (2018), and identification of misogynistic English and Spanish tweets (AMI, https://amiibereval2018.wordpress.com/), organized by Fersini, Anzovino, and Rosso (2018). Confirming our intuition, the systems proposed for these tasks have difficulty classifying sarcastic abusive tweets. Indeed, sarcasm, regardless of the differences between languages, disguises the real intention of the message, which machines recognize only with difficulty. In line with these early experiments, IronITA could be a good step of analysis.

The rest of the paper is structured as follows. Section 2 introduces the literature that inspired our investigation. Section 3 describes our participation in the IberEval tasks, with the approach used and the results obtained. In Section 4 we analyze the presence of sarcasm in the aggressive and offensive texts we examined. Finally, in Sections 5 and 6 we outline our research proposal and future work.

2 Related work

The literature on hate speech detection covers different issues, such as cyberbullying, misogyny, nastiness and aggressiveness. Most commercial methods currently rely on blacklists. However, filtering messages in this way is not a sufficient remedy, because it falls short when the meaning is more subtle or altered by sarcasm. Indeed, some authors, such as Justo et al. (2014) and Nobata et al. (2016), underline that sarcasm makes the message difficult to interpret, generally requiring world knowledge. Smokey, one of the first systems, implemented by Spertus (1997), also uses syntactic and semantic rules with lexicons to recognize flames.

In this context, research aims at a deeper investigation of the language using classical (Samghabadi et al., 2017) and deep learning methods (Del Vigna et al., 2017). Unlike Mehdad and Tetreault (2016) and Gambäck and Sikdar (2017), for the MEX-A3T task we applied, in Frenda and Banerjee (2018), an experimental technique that combines linguistic features with a Convolutional Neural Network (CNN). Anzovino, Fersini, and Rosso (2018) were the first to propose a classical machine learning approach to identify misogyny in English, comparing different classifiers. Taking into account this previous work and the psychological studies on sexism (Ford and Boxer, 2011), in Frenda and Ghanem (2018) we combined sentiment and stylistic information with specific lexicons covering several aspects of misogyny online.

In the following section we report how we addressed the identification of aggressiveness and misogyny on Twitter, the experiments carried out and the results obtained.

3 Hate speech, aggressiveness and misogyny

Given our motivations, our early experiments focus mainly on hate speech detection. For this purpose, we participated in two tasks at IberEval 2018, on aggressiveness detection and misogyny detection respectively.

3.1 Aggressiveness detection

The first task aims to classify tweets in Mexican Spanish as aggressive or non-aggressive. We applied a deep learning approach that incorporates into a CNN architecture a set of linguistic features (DL+FE) concerning: characteristics proper to tweets, such as emoticons, abbreviations and slang words; stylistic information, such as the length of tweets and the use of punctuation and uppercase characters; bags of words weighted with tf-idf; emotive traits of aggressiveness; and derogatory adjectives and vulgar expressions typical of Mexican culture.

By means of Information Gain, we noticed that anger and disgust are the principal emotions that incite aggressive behaviour. We compared this system with a simple CNN architecture (DL) in order to understand the contribution of the features to the deep learning approach. The measure used for the competition is the F-score for the positive class (i.e. the aggressive class). Despite the novel approach, the results obtained are low and the features do not seem to help deep learning, as shown in Table 1.

           Prec.   Rec.   F-pos   Rank
  DL       0.34    0.34   0.34    9
  DL+FE    0.27    0.38   0.31    10

Table 1: Results for aggressiveness detection

Therefore, to understand the difficulties of DL+FE, we carried out an error analysis. We mainly noticed that several humorous cases, especially sarcastic ones (see Section 4), are misclassified.

3.2 Automatic misogyny identification

The second task proposes to identify misogyny in two collections of English and Spanish tweets. When a tweet is classified as misogynistic (Task A), we need to distinguish (Task B) whether the target is an individual or not (Tar.) and to identify the type of misogyny according to the following classes (Cat.): stereotype and objectification, dominance, derailing, sexual harassment and threats of violence, and discredit. This subdivision allows us to explore the different aspects of misogyny and compare them in two different languages. Moreover, the data are not geolocalized. Therefore, to capture linguistic variations and consider the various traits of misogyny, we proposed an approach based on stylistic features captured by means of character n-grams, on sentiment and affective information, and on a set of lexicons concerning sexuality, profanity, femininity, the human body and stereotypes. In addition, we considered slang, abbreviations and hashtags.

By means of Information Gain, we discovered some differences between the two languages: sexual language is used more in English misogynistic tweets, whereas profanities or vulgarities are used more in Spanish ones. For this task, we applied a Support Vector Machine (SVM) and a majority voting technique. To evaluate Task A the organizers used the Accuracy measure, and for Task B the average Macro-F1 measure. In Table 2 and Table 3 we report the promising results obtained with our best runs for both languages.

        Approach   Acc    Rank
  En    Ensemble   0.87   2
  Sp    Ensemble   0.81   3

Table 2: Results for Task A of misogyny detection

        Approach   F1     Cat.   Tar.   Rank
  En    SVM        0.44   0.29   0.59   1
  Sp    Ensemble   0.44   0.33   0.55   2

Table 3: Results for Task B of misogyny detection

4 Sarcasm

In Traité des tropes (1729), Dumarsais defined sarcasm as an "ironie faite avec aigreur et emportement" ("type of irony done with sharpness and a fit of anger"), that is, a kind of aggressive and sharp irony addressed to a target in order to hurt or criticize them, without excluding the possibility of amusing. This statement is corroborated by our analyses of English, Spanish, Mexican Spanish and Italian hate speech corpora. As said above, we carried out an error analysis for both tasks.

In the first competition we noticed that our approach fails to classify sarcastic aggressive utterances, such as:

ii) @USUARIO #LOS40MeetAndGreet 9 . Por q es una mamá luchona que cuida a su bendiciòn.
    (English: @User #LOS40MeetAndGreet 9 . Because she is a fighter mother who takes care of her kid.)

Sarcasm is a type of figurative device that modifies the perception of the message, hindering the correct detection of hate speech by automatic systems. In fact, we found the same difficulty in recognizing misogynistic tweets in both languages, such as:

iii) ¿Cuál es la peor desgracia para una mujer? Parir un varón, porque después de tener un cerebro dentro durante 9 meses, van y se lo sacan;
     (English: What's the worst disgrace for a woman? Giving birth to a boy, because after she has had a brain inside her for 9 months, it is taken out.)

iv) What's the difference between a blonde and a washing machine? A washing machine won't follow you around all day after you drop a load in it.

In virtual as in real life, sexist jokes are very common. In general, most people consider them innocent. However, Ford and Boxer (2011) reveal that women experience sexist jokes as sexual harassment as well as offences. Moreover, Ford, Wentzel, and Lorion (2001) investigate the effects of exposure to sexist jokes and underline that continued exposure can also change the perception of sexism, so that it is seen as a norm and not as misbehavior.

5 Research Proposal

These early observations suggest the need to address the use of figures of speech such as sarcasm, in order to make the automated methods for flagging abusive language more accurate from a multilingual perspective.

For this purpose, we propose an accurate analysis of different kinds of hate speech online, especially in Italian, English and Spanish, also taking into account geographical linguistic variations. We focus in particular on short texts such as tweets, posts or comments, exploring informal language.

Considering the previous observations, we propose approaching hate speech detection by taking into account the figurative dimension of language, and especially of abusive language. Moreover, it is necessary to examine the appropriateness of various computational techniques for solving this problem. In this line, we want to examine the contribution of linguistic features to deep learning approaches by comparison with the performance of classical techniques. Finally, the multilingual context makes it possible to discover the typical aspects of hate speech in order to recognize it independently of the language. Indeed, the scope of this investigation is to propose a methodology for correctly monitoring user-generated content, allowing the system to work as a sensor of violence in the real world as well.

6 Future work

Our research aims to explore the several dimensions of hate speech, considering above all the use of figurative devices that hinder automatic recognition. To investigate the observations made in these first experiments, as future work we would like to participate in HaSpeeDe (http://www.di.unito.it/~tutreeb/haspeede-evalita18/index.html#) and AMI (https://amievalita2018.wordpress.com/) at Evalita 2018 for Italian.

In addition, similar tasks are proposed at SemEval 2019 concerning multilingual hate speech against immigrants and women (HatEval, https://competitions.codalab.org/competitions/19935) and the identification and categorization of offensive language in social media (OffensEval, https://competitions.codalab.org/competitions/20011). Analyzing different kinds of abusive language makes it possible to understand the boundaries between them and their distinctive aspects. Finally, the multilingual context gives us the opportunity to delineate the differences and analogies between the various languages, inferring general characteristics of hate speech online.

References

Álvarez-Carmona, M. Á., E. Guzmán-Falcón, M. Montes-y Gómez, H. J. Escalante, L. Villaseñor-Pineda, V. Reyes-Meza, and A. Rico-Sulayes. 2018. Overview of MEX-A3T at IberEval: Authorship and aggressiveness analysis in Mexican Spanish tweets. In Notebook Papers of 3rd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL), Seville, Spain, September.

Anzovino, M., E. Fersini, and P. Rosso. 2018. Automatic identification and classification of misogynistic language on Twitter. In International Conference on Applications of Natural Language to Information Systems, pages 57–64. Springer.

Del Vigna, F., A. Cimino, F. Dell'Orletta, M. Petrocchi, and M. Tesconi. 2017. Hate me, hate me not: Hate speech detection on Facebook. In Proceedings of ITASEC17.

Fersini, E., M. Anzovino, and P. Rosso. 2018. Overview of the task on automatic misogyny identification at IberEval. In Notebook Papers of 3rd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL), Seville, Spain, September.

Ford, T. E. and C. F. Boxer. 2011. Sexist humor in the workplace: A case of subtle harassment. In Insidious Workplace Behavior. Routledge, pages 203–234.

Ford, T. E., E. R. Wentzel, and J. Lorion. 2001. Effects of exposure to sexist humor on perceptions of normative tolerance of sexism. European Journal of Social Psychology, 31(6):677–691.

Frenda, S. and S. Banerjee. 2018. Deep analysis in aggressive Mexican tweets. In Notebook Papers of 3rd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL), Seville, Spain, September.

Frenda, S. and B. Ghanem. 2018. Exploration of misogyny in Spanish and English tweets. In Notebook Papers of 3rd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL), Seville, Spain, September.

Fulper, R., G. L. Ciampaglia, E. Ferrara, Y. Ahn, A. Flammini, F. Menczer, B. Lewis, and K. Rowe. 2014. Misogynistic language on Twitter and sexual violence. In Proceedings of the ACM Web Science Workshop on ChASM.

Gambäck, B. and U. K. Sikdar. 2017. Using convolutional neural networks to classify hate-speech. In Proceedings of the First Workshop on Abusive Language Online, pages 85–90.

Justo, R., T. Corcoran, S. M. Lukin, M. Walker, and M. I. Torres. 2014. Extracting relevant knowledge for the detection of sarcasm and nastiness in the social web. Knowledge-Based Systems, 69:124–133.

Mehdad, Y. and J. Tetreault. 2016. Do characters abuse more than words? In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 299–303.

Nobata, C., J. Tetreault, A. Thomas, Y. Mehdad, and Y. Chang. 2016. Abusive language detection in online user content. In Proceedings of the 25th International Conference on World Wide Web, pages 145–153.

Samghabadi, N. S., S. Maharjan, A. Sprague, R. Diaz-Sprague, and T. Solorio. 2017. Detecting nastiness in social media. In Proceedings of the First Workshop on Abusive Language Online, pages 63–72.

Sanguinetti, M., F. Poletto, C. Bosco, V. Patti, and M. Stranisci. 2018. An Italian Twitter corpus of hate speech against immigrants. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation. LREC.

Spertus, E. 1997. Smokey: Automatic recognition of hostile messages. In AAAI/IAAI, pages 1058–1065.
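The evaluation measures named in Section 3 (the F-score for the positive class used for aggressiveness detection, Accuracy for Task A and the average Macro-F1 for Task B of misogyny identification) can be illustrated with a minimal Python sketch. This is not the organizers' official scorer: the function names and the toy labels are hypothetical, and reading "average Macro-F1" as the arithmetic mean of the category (Cat.) and target (Tar.) Macro-F1 scores is an assumption, although one that is consistent with the figures reported in Table 3.

```python
def precision_recall_f1(gold, pred, label):
    """Precision, recall and F1 for a single class (e.g. the aggressive class)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(gold, pred):
    """Share of tweets whose predicted label matches the gold label."""
    return sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 over the classes found in gold."""
    labels = sorted(set(gold))
    return sum(precision_recall_f1(gold, pred, lab)[2] for lab in labels) / len(labels)


def average_macro_f1(category_f1, target_f1):
    """ASSUMPTION: Task B score read as the mean of the Cat. and Tar. Macro-F1."""
    return (category_f1 + target_f1) / 2


if __name__ == "__main__":
    # Hypothetical toy labels, not taken from any of the shared-task datasets.
    gold = ["agg", "agg", "agg", "non", "non", "non", "non", "non"]
    pred = ["agg", "non", "non", "agg", "non", "non", "non", "non"]
    print(precision_recall_f1(gold, pred, "agg"))
    print(accuracy(gold, pred), macro_f1(gold, pred))
    # Cat. and Tar. values of the English run in Table 3.
    print(round(average_macro_f1(0.29, 0.59), 2))
```

Scoring only the positive class, as in the aggressiveness task, keeps a system from looking good merely by predicting the majority non-aggressive label, whereas Macro-F1 gives every class the same weight regardless of its frequency.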