=Paper=
{{Paper
|id=Vol-2006/paper024
|storemode=property
|title=Hate Speech Annotation: Analysis of an Italian Twitter Corpus
|pdfUrl=https://ceur-ws.org/Vol-2006/paper024.pdf
|volume=Vol-2006
|authors=Fabio Poletto,Marco Stranisci,Manuela Sanguinetti,Viviana Patti,Cristina Bosco
|dblpUrl=https://dblp.org/rec/conf/clic-it/PolettoSSPB17
}}
==Hate Speech Annotation: Analysis of an Italian Twitter Corpus==
Fabio Poletto (Dipartimento di StudiUm, University of Turin, f.poletto91@gmail.com), Marco Stranisci (Acmos, marco.stranisci@acmos.net), Manuela Sanguinetti, Viviana Patti, Cristina Bosco (Dipartimento di Informatica, University of Turin, {msanguin,patti,bosco}@di.unito.it)

'''Abstract (English).''' The paper describes the development of a corpus from social media, built with the aim of representing and analysing hate speech against some minority groups in Italy. The issues related to data collection and annotation are introduced, focusing on the challenges we addressed in designing a multifaceted set of labels with which the main features of verbal hate expressions may be modelled. Moreover, an analysis of the disagreement among the annotators is presented, in order to carry out a preliminary evaluation of the data set and the scheme.

'''Abstract (Italian, translated).''' The article describes a corpus of texts extracted from social media, built with the main aim of representing and analysing the phenomenon of hate speech against migrants in Italy. The significant aspects of data collection and annotation are introduced, drawing attention to the challenges faced in designing a set of labels that reflects the many facets needed to capture and model the features of hate expressions. An analysis of the disagreement among the annotators is also presented, as an attempt at a preliminary evaluation of the corpus and of the annotation scheme itself.

===1 Introduction===

Hate is all but a new phenomenon, yet the global spread of the Internet and of social network services has provided it with new means and forms of dissemination. Online hateful content, or Hate Speech (HS), is characterised by some key aspects (such as virality, or presumed anonymity) which distinguish it from offline communication and make it potentially more dangerous and hurtful (Ziccardi, 2016). What is more, HS is a complex and multifaceted phenomenon, also because of the variety of approaches employed in attempting to draw the line between HS and free speech (Yong, 2011). Therefore, despite multiple efforts, there is as yet no universally accepted definition of HS.

From a juridical perspective, two contrasting approaches can be recognised: while US law is oriented, quite uniquely, towards granting freedom of speech above all, even when potentially hurtful or threatening, legislation in Europe and in the rest of the world tends to protect the dignity and rights of minority groups against any form of expression that might violate or endanger them. Several European treaties and conventions ban HS: to mention but one, the Council of the European Union condemns publicly inciting violence or hatred towards persons or groups defined by reference to race, colour, religion, descent or national or ethnic origin. The No Hate Speech Movement (https://www.nohatespeechmovement.org), promoted by the Council of Europe, is also worth mentioning for its efforts in endorsing responsible behaviours and preventing HS among European citizens.

The main aim of this paper is to introduce a novel resource that can be useful for the investigation of HS from a sentiment analysis perspective (Schmidt and Wiegand, 2017). Given that, among the minority groups targeted by HS, the present socio-political context shows that some are especially vulnerable and garner constant, often negative, attention from public opinion, i.e. immigrants (Bosco et al., 2017), Roma and Muslims, we decided to focus our work on HS against such groups. Furthermore, given the spread of HS in social media and their current relevance in communication, we focused on texts from Twitter, whose peculiar structure and conventions make it particularly suitable for data gathering and analysis.

===2 Related Work===

One of the earlier attempts to develop a corpus-based model for automated detection of HS on the Web is found in Warner and Hirschberg (2012): the authors collect and label a set of sentences from various websites, and test a classifier for detecting anti-Semitic hatred. They observe that HS against different groups is characterised by a small set of high-frequency stereotypical words, also stressing the importance of distinguishing HS from merely offensive content. The same distinction is at the core of Davidson et al. (2017), where a classifier is trained to recognise whether a tweet is hateful or just offensive, observing that for some categories this difference is less clear than for others. An exhaustive list of the targets of online hate is found in Silva et al. (2016), where HS on two social networks (Twitter and Whisper) is detected through a sentence structure-based model.

One of the core issues of manually labelling HS is the reliability of annotations and the inter-annotator agreement. The issue is confronted by Waseem (2016) and Ross et al. (2017), who find that more precise results are obtained by relying on expert rather than amateur annotations, and that the overall reliability remains low. The authors suggest that HS should not be considered as a binary "yes/no" value and that finer-grained labels may help increase the agreement rate. An alternative to lexicon-based approaches is suggested in Saleem et al. (2016), where the limits and biases of manual annotation and keyword-based techniques are pointed out, and a method based on the language used within self-defined hateful web communities is presented. The method, suitable for various platforms, bypasses the need to define HS and the inevitably poor reliability of manual annotation.

While most of the available works are based on the English language, Del Vigna et al. (2017) is the first work on a manually annotated Italian HS corpus: the authors apply a traditional annotation procedure to a corpus crawled from Facebook, developing two classifiers for automated detection of HS.

===3 Dataset Collection===

The dataset creation phase was divided into three main stages. We first collected all the tweets written in Italian and posted from 1st October 2016 to 25th April 2017. Then we discussed in order to establish a) which minority groups should be identified as possible HS targets, and b) the set of keywords associated with each target, in order to filter the data collected in the previous step.

As for the first aspect, we identified three targets that we deemed particularly relevant in the present Italian scenario; based also on the terminology used in European Union reports (see the 2015 Eurobarometer Survey on discrimination in the EU: http://ec.europa.eu/justice/fundamental-rights/files/factsheet_eurobarometer_fundamental_rights_2015.pdf), the targets selected for our corpus were immigrants (class: ethnic origin), Muslims (class: religion), and Roma.

As regards the second aspect, we are aware of the limits of a keyword-based method for HS identification (Saleem et al., 2016), especially regarding the amount of noisy data (e.g. off-topic tweets) that may result from such a method; on the other hand, the choice to adopt a list of explicitly hateful words, such as the ones extracted for the Italian HS map (Musto et al., 2016; http://www.voxdiritti.it/ecco-la-nuova-edizione-della-mappa-dellintolleranza/), may prevent us from finding subtler forms of HS, or even just tweets where a hateful message is expressed without using a hate-related lexicon. With this in mind, we then filtered the data by retaining a small set of neutral keywords associated with each target. The selected keywords are summarised below:

{| class="wikitable"
! ethnic group !! religion !! Roma
|-
| immigrat* (immigrant*) || terrorismo (terrorism) || rom (Roma)
|-
| immigrazione (immigration) || terrorist* (terrorist*) || nomad* (nomad*)
|-
| migrant* || islam ||
|-
| stranier* (foreigner*) || mussulman* (Muslim*) ||
|-
| profug* (refugee*) || corano (Koran) ||
|}

The dataset thus retrieved consisted of 370,252 tweets about ethnic origins, 176,290 about religion and 31,990 about Roma.

The last stage consisted in the creation of the corpus to be annotated. In order to obtain a balanced resource, we randomly selected from the previous dataset 700 tweets for each target, with a total amount of 2,100 tweets. However, a large number of tweets were further removed from the corpus during the annotation stage (because of duplicates and off-topic content). Despite the size reduction, though, the distribution of the targets in the corpus remained largely unchanged, resulting in a balanced resource in this respect. At present, the amount of annotated data consists of 1,828 tweets. In the next section, we describe the whole annotation process and the scheme adopted for this purpose.

===4 Data Annotation: Designing and Applying the Schema===

Since HS is a complex and multi-layered concept, and since the task of its annotation is quite difficult and prone to subjectivity, we undertook some preliminary steps in order to make sure that all annotators share a common ground of basic concepts, starting from the very definition of HS. When determining what can, or cannot, be considered HS (thus in a yes-no fashion), and based on the juridical literature and the observations reported above in Section 1, we considered two different factors:

* the target involved, i.e. the tweet should be addressed to, or just refer to, one of the minority groups identified as HS targets in the previous stage (see Section 3), or even to an individual considered for their membership in that category (and not for their individual characteristics);
* the action, or more precisely the illocutionary force of the utterance, in that it is capable of spreading, inciting, promoting or justifying violence against a target.

Whenever both factors co-occur in the same tweet, we consider it a HS case, as in the example below:

{| class="wikitable"
! target !! tweet
|-
| religion || Ci vuole la guerra per salvare l'Italia dai criminali filo islamici ("We need a war to save Italy from pro-Islamic criminals")
|}

If even just one of these conditions is not detected, HS is assumed not to occur.

In line with this definition, we also attempted to extend the scheme to other annotation categories that seemed to significantly co-occur with HS, in order to better represent the (perceived) meaning of the tweet, and to help the annotator in the task by providing a richer and finer-grained tagset (the whole scheme description, along with the detailed guidelines, is available at https://github.com/msang/hate-speech-corpus). The newly introduced categories are described below.

'''Aggressiveness''' (labels: no / weak / strong): it focuses on the user's intention to be aggressive, harmful, or even to incite, in various forms, violent acts against a given target. If the message reflects an overtly hostile attitude, or whenever the target group is portrayed as a threat to social stability, the tweet is considered weakly aggressive, while if there is a reference, whether explicit or just implied, to violent actions of any kind, the tweet is strongly aggressive.

{| class="wikitable"
! tweet !! aggressiveness
|-
| nuova invasione di migranti in Europa ("A new migrant invasion in Europe") || weak
|-
| Cacciamo i rom dall'Italia ("Let's kick Roma out of Italy") || strong
|}

'''Offensiveness''' (labels: no / weak / strong): conversely to aggressiveness, it focuses on the potentially hurtful effect of the tweet content on a given target. A tweet is considered weakly offensive in a large number of cases, among them: the given target is associated with typical human flaws (laziness in particular), the status of disadvantaged or discriminated minority is questioned, or the members of the target group are described as unpleasant people. On the other hand, if overtly insulting language is used, or the target is addressed by means of outrageous or degrading expressions, the tweet is expected to be considered strongly offensive.

{| class="wikitable"
! tweet !! offensiveness
|-
| I migranti sanno solo ostentare l'ozio ("Migrants can only show off their laziness") || weak
|-
| Zingari di merda ("You fucking Roma") || strong
|}

'''Irony''' (labels: no / yes): it determines whether the tweet is ironic or sarcastic rather than based on the literal meaning of words. The introduction of this category into the scheme was motivated by preliminary observations of the data, which highlighted that irony is a fairly common linguistic expedient used to mitigate or indirectly convey hateful content.

{| class="wikitable"
! tweet !! irony
|-
| ora tutti questi falsi profughi li mandiamo a casa di Renzi ??! ("shall we send all these fake refugees to Renzi's house??!") || yes
|}

'''Stereotype''' (labels: no / yes): it determines whether the tweet contains any implicit or explicit reference to (mostly untrue) beliefs about a given target. There is a whole host of stereotypes and prejudices associated with the target groups selected for our research; from an exploratory observation of the data in the corpus, the following cases were identified: the members of a given target are referred to as invaders, freeloaders, criminals, filthy (or having filthy habits), sexist/misogynist, undemocratic, or violent people. Furthermore, we also take into account the role that conventional media may have in spreading stereotypes and prejudices while reporting news on refugees, migrants, and minorities in general. Based on what is suggested in the Italian journalists' Code of Conduct, known as "Carta di Roma" (see https://www.cartadiroma.org/), in order to ensure correct and responsible reporting on these topics, we also applied this criterion to any tweet containing a news headline that implicitly endorses, or contributes to the spread of, such stereotypical portrayals (see the example below).

{| class="wikitable"
! tweet !! stereotype
|-
| Roma in bancarotta ma regala 12 milioni ai rom ("Rome is bankrupt but still gives 12 millions to Roma") || yes
|}

'''Annotation process.''' The annotation task consisted in a multiple-step process, and it was carried out by four independent annotators after a preliminary step where the guidelines were discussed and partially revised. The corpus was split in two, and each part was annotated by two annotators. The annotator pairs then switched to the other part, in order to provide a third (possibly tie-breaking) annotation for all those tweets where at least one category was labelled differently by the previous two annotators. A further subset of around 130 tweets still received different labels from the different annotators (namely for aggressiveness and offensiveness). In order to solve these remaining cases, a fifth independent annotator was finally involved. As a result, the final corpus only contains tweets that were fully revised.

Regarding the results of the annotation in terms of label distribution, we found that 16% of all tweets were considered to contain HS, of which 23% against immigrants, 38% against Muslims and 39% against Roma. When considered alone, aggressiveness occurs in 14%, offensiveness in 10%, irony in 11% and stereotype in 29% of tweets. However, the labels that most frequently co-occur with hate speech are those indicating the presence of aggressiveness (81%), stereotypes (81%), and offensiveness (56%); overall, they co-occur altogether 52% of the time, while irony is labelled in 11% of HS tweets. Within the whole corpus, 57% of cases are tweets with a "neutral" content, which means that none of the categories was annotated.

====4.1 Agreement Analysis====

The development phase related to the inter-annotator agreement (IAA) is not only a necessary step for validating the corpus and evaluating the schema adopted, but also a tool that provides more details about the trends and biases of individual annotators with respect to specific annotation categories.

In this study, we measured the IAA right after the first annotation step was completed, i.e. the one where just two annotators were involved (see Section 4). In line with related cases (see, among others, Del Vigna et al. (2017), Gitari et al. (2015), Kwok and Wang (2013), Ross et al. (2017) and Waseem (2016)), our data showed very low agreement: in 47% of cases, the annotator pair annotated at least one of the five categories using different labels. In fact, the disagreement mostly involved one (40%) or two (16%) categories, while just 4 tweets received a completely different annotation from the annotator pair. More specifically, we measured the agreement coefficient, using Cohen's kappa (Carletta, 1996), for each individual category. The results, also reported in Table 1, show that the category with the highest agreement is precisely the one related to the presence of hate speech (abbreviated to 'hs' in the table), followed by irony ('iro.') and stereotype ('ster.').

{| class="wikitable"
! !! hs !! aggr. !! off. !! iro. !! ster.
|-
| before merge || 0.54 || 0.18 || 0.32 || 0.44 || 0.43
|-
| after merge || 0.54 || 0.43 || 0.37 || 0.44 || 0.43
|}

''Table 1: Agreement (Cohen's k) for each annotation category before and after merging the labels for aggressiveness and offensiveness.''

Considering that the lowest agreement was found for aggressiveness ('aggr.') and offensiveness ('off.'), the only categories where three labels were used instead of two, the agreement was recalculated by merging the weak and strong labels; it thus increased considerably (especially for aggressiveness), though still remaining far below an acceptable threshold. The low agreement with regard to the degree of offensiveness can be attributed to the absence of clear indications on this point within the annotation guidelines.

Finally, among the annotation criteria established in the preliminary stage, one in particular proved to be quite misleading: whenever a clearly hateful tweet did not actually refer to the target identified by one of the selected keywords, HS and stereotype were assumed not to occur, while the remaining categories were to be annotated accordingly. This principle was conceived in order to provide annotated data that could be considered a true reflection of HS towards the targets identified in our study, while still "preserving" the meaning and the intent of the tweet in itself, regardless of the target involved. This, together with other points of the guidelines, will be further discussed and clarified in the next project phase.

===5 Conclusion and Future Work===

We introduced in this paper the collection and annotation of an Italian Twitter corpus representing HS towards some selected targets. Our main aim is to produce a corpus to be used for training and testing sentiment analysis systems, but some effort must still be applied to achieve this goal. The current contribution lies mainly in designing and trying out a novel schema for HS, but the relatively low agreement shows that modelling this phenomenon is a very challenging task, and a further refinement of the guidelines and of the scheme must be applied, together with the application to larger data sets.

===Acknowledgments===

The work of Cristina Bosco, Viviana Patti and Manuela Sanguinetti was partially funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, project S1618_L2_BOSC_01) and partially funded by Fondazione CRT (Hate Speech and Social Media, project n. 2016.0688).

===References===

* Cristina Bosco, Viviana Patti, Marcello Bogetti, Michelangelo Conoscenti, Giancarlo Ruffo, Rossano Schifanella, and Marco Stranisci. 2017. Tools and resources for detecting hate and prejudice against immigrants in social media. In Proceedings of the First Symposium on Social Interactions in Complex Intelligent Systems (SICIS), AISB Convention 2017, AI and Society, Bath, UK.
* Jean Carletta. 1996. Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22(2):249–254.
* Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated hate speech detection and the problem of offensive language. In International AAAI Conference on Web and Social Media.
* Fabio Del Vigna, Andrea Cimino, Felice Dell'Orletta, Marinella Petrocchi, and Maurizio Tesconi. 2017. Hate me, hate me not: Hate speech detection on Facebook. In Proceedings of the First Italian Conference on Cybersecurity (ITASEC17), Venice, Italy, pages 86–95.
* Njagi Dennis Gitari, Zhang Zuping, Hanyurwimfura Damien, and Jun Long. 2015. A lexicon-based approach for hate speech detection. International Journal of Multimedia and Ubiquitous Engineering, 10(4):215–230.
* Irene Kwok and Yuzhou Wang. 2013. Locate the hate: Detecting tweets against blacks. In Marie desJardins and Michael L. Littman, editors, AAAI. AAAI Press.
* Cataldo Musto, Giovanni Semeraro, Marco de Gemmis, and Pasquale Lops. 2016. Modeling community behavior through semantic analysis of social data: The Italian hate map experience. In Proceedings of the 2016 Conference on User Modeling, Adaptation and Personalization, UMAP 2016, Halifax, NS, Canada, pages 307–308.
* Björn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, and Michael Wojatzki. 2017. Measuring the reliability of hate speech annotations: The case of the European refugee crisis. CoRR, abs/1701.08118.
* Haji Mohammad Saleem, Kelly P. Dillon, Susan Benesch, and Derek Ruths. 2016. A web of hate: Tackling hateful speech in online social spaces. In Proceedings of the First Workshop on Text Analytics for Cybersecurity and Online Safety (TA-COS 2016), Portorož, Slovenia.
* Anna Schmidt and Michael Wiegand. 2017. A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pages 1–10, Valencia, Spain. Association for Computational Linguistics.
* Leandro Silva, Mainack Mondal, Denzil Correa, Fabrício Benevenuto, and Ingmar Weber. 2016. Analyzing the targets of hate in online social media. In Proceedings of the 10th International Conference on Web and Social Media, ICWSM 2016, pages 687–690. AAAI Press.
* William Warner and Julia Hirschberg. 2012. Detecting hate speech on the world wide web. In Proceedings of the Second Workshop on Language in Social Media, LSM '12, pages 19–26, Stroudsburg, PA, USA. Association for Computational Linguistics.
* Zeerak Waseem. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science, pages 138–142, Austin, Texas. Association for Computational Linguistics.
* Caleb Yong. 2011. Does freedom of speech include hate speech? Res Publica, 17(4):385.
* Giovanni Ziccardi. 2016. L'odio online. Violenza verbale e ossessioni in rete. Raffaello Cortina.
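As an illustrative aside to the keyword-based filtering described in Section 3, the per-target selection step can be sketched in a few lines. This is a minimal sketch under our own assumptions, not the authors' pipeline: function and variable names are invented, and the paper's "*" wildcard notation is rendered here as a prefix regex.

```python
import re

# Neutral keywords from Section 3; a trailing "*" reproduces the paper's
# wildcard notation. Names and the regex rendering are illustrative assumptions.
KEYWORDS = {
    "ethnic origin": ["immigrat*", "immigrazione", "migrant*", "stranier*", "profug*"],
    "religion": ["terrorismo", "terrorist*", "islam", "mussulman*", "corano"],
    "Roma": ["rom", "nomad*"],
}

def build_pattern(keywords):
    # "foo*" becomes the prefix pattern foo\w*; plain keywords match whole words.
    alts = [re.escape(k[:-1]) + r"\w*" if k.endswith("*") else re.escape(k)
            for k in keywords]
    return re.compile(r"\b(?:" + "|".join(alts) + r")\b", re.IGNORECASE)

PATTERNS = {target: build_pattern(kws) for target, kws in KEYWORDS.items()}

def match_targets(tweet_text):
    """Return the set of targets whose keyword list matches the tweet."""
    return {t for t, pat in PATTERNS.items() if pat.search(tweet_text)}
```

As the paper itself acknowledges, citing Saleem et al. (2016), such a filter retains off-topic noise (an exact keyword like ''rom'' also fires on unrelated uses of the word) while missing hateful tweets that use none of the listed terms.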
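The agreement analysis of Section 4.1 (Cohen's kappa per category, then recomputation after collapsing the weak and strong labels into one) can be mirrored with a short standard-library sketch. The implementation and the toy label sequences below are ours, not the authors' data or code.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa (Carletta, 1996) for two annotators' label sequences."""
    assert len(a) == len(b) and a
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # chance agreement from each annotator's label distribution
    p_exp = sum(ca[lab] * cb[lab] for lab in ca.keys() | cb.keys()) / (n * n)
    return 1.0 if p_exp == 1 else (p_obs - p_exp) / (1 - p_exp)

def merge_weak_strong(labels):
    """Collapse the three-way no/weak/strong scale to binary, as for Table 1."""
    return ["yes" if lab in ("weak", "strong") else lab for lab in labels]

# Toy example: two annotators who agree on presence but not on degree.
ann1 = ["no", "weak", "strong", "no"]
ann2 = ["no", "strong", "weak", "no"]
kappa_before = cohen_kappa(ann1, ann2)
kappa_after = cohen_kappa(merge_weak_strong(ann1), merge_weak_strong(ann2))
```

On such data the merge raises the coefficient (here from 0.2 to 1.0), mirroring the jump in the aggressiveness column of Table 1 (0.18 to 0.43): much of the disagreement concerned the degree, not the presence, of the phenomenon.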