<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Progress in the Modeling of Violent Messages in Spanish Social Networks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Beatriz</forename><surname>Botella-Gil</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Dept. of Software and Computing Systems</orgName>
								<orgName type="institution">University of Alicante</orgName>
								<address>
									<addrLine>Apdo. de Correos 99</addrLine>
									<postCode>E-03080</postCode>
									<settlement>Alicante</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Progress in the Modeling of Violent Messages in Spanish Social Networks</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">2849DD822903031ACFD68864B1D0663F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:05+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Natural Language Processing</term>
					<term>Annotation Guideline</term>
					<term>Dataset Annotation</term>
					<term>Detection of Violent Messages</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Society advances with new and highly accessible knowledge that is published in the virtual world. It is a reality that ICTs have brought many benefits to our lives, but we also see how, year after year, the use of violence on these platforms increases. The application of NLP is fundamental in this type of research given the large volume of existing data, which facilitates a breakthrough in the detection of this type of message. This doctoral work focuses on the detection of violent messages on the social network Twitter from different perspectives.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The Internet has become an indispensable part of our lives and is used in practically all of society's daily activities. Nowadays it is possible to have immediate contact with anyone in the world through an electronic device. Society is moving forward with new and very accessible knowledge that is published in the virtual world. Personal relationships have also been affected, not only in the private sphere but also in the workplace.</p><p>According to We Are Social <ref type="bibr" target="#b0">[1]</ref>, almost 44 million people in Spain spend more than 6 hours a day on the Internet and around 41 million Spaniards are users of social networks. It is a reality that ICTs have brought many benefits to our lives, but the possibility of remaining anonymous, and the absence of seeing face to face the damage our words can cause, also create problems that are still to be solved <ref type="bibr" target="#b1">[2]</ref>. In particular, many researchers refer to this type of violent action as hate speech: offensive behavior through language towards people or groups, whose detection is a challenge for researchers, since violence may not appear explicitly in a discourse but as a single word, or even implicitly through the use of emoticons <ref type="bibr" target="#b2">[3]</ref>, humor, irony, sarcasm <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref> or stereotypes <ref type="bibr" target="#b5">[6]</ref>.</p><p>Given the number of users present on social networks, it is impossible to manually control the comments that are posted and their intention, creating impunity for people who use these networks to do harm. The identification of violent messages and the control of hate speech on the Internet have been approached from different points of view, the use of Natural Language Processing (NLP) being essential to develop computational systems that help to interpret and process human language quickly and effectively.</p><p>Doctoral Symposium on Natural Language Processing from the Proyecto ILENIA, 28 September 2023, Jaén, Spain.</p><p>beatriz.botella@dlsi.ua.es (B. Botella-Gil)</p><p>One barrier we encountered right at the start of the study is the collection of messages from social networks, since, as Bruns points out <ref type="bibr" target="#b6">[7]</ref>, the restriction of access to social network data makes it difficult to analyze issues of great importance such as abusive language, harassment, hate speech or disinformation campaigns. That is why the present research uses the social network Twitter, where, as Ott <ref type="bibr" target="#b7">[8]</ref> puts it: "Twitter discourse is disrespectful because its register is informal, and because it depersonalizes social interactions". This research aims to provide solutions to the existing problems in the detection of violent messages in social networks in a fast, automatic and effective way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Background and Related Work</head><p>Many studies have analyzed violent messages in social networks and media. In particular, much research focuses on discovering the characteristics of human behavior that promote the emission of such messages, as well as on discovering the characteristics of the messages themselves through NLP techniques.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Language and behavior study</head><p>There is a wealth of research on human behavior in the face of violent messages and the language used. As McMenamin said <ref type="bibr" target="#b8">[9]</ref>, "hate speech is studied according to how it is defined, how it is interpreted, and what are the best practices to deal with it". That is why we find works such as Salado <ref type="bibr" target="#b9">[10]</ref>, who based their research on a syntactic analysis of language and discovered different linguistic elements present in the forms of violent speech, such as the linguistic category, the lexicon used or how the words are placed. Plaza-Del-Arco et al. <ref type="bibr" target="#b10">[11]</ref> carried out a study of the implicit and explicit linguistic phenomena of offensive language. Others, such as Gitari <ref type="bibr" target="#b11">[12]</ref>, focused on something as specific as creating a list of verbs that can be indicators of violent messages. On the other hand, there are works that focus on the roles present in these acts; Nielsen, for example, through interviews and a study of the participants, observed the consequences for the victim, the harm suffered, and whether the messages could constitute a crime.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">NLP applied to the detection of violent messages</head><p>The application of NLP is fundamental in this type of research given the large volume of existing data, which facilitates a breakthrough in the detection of this type of message, thanks to the following techniques:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>• Keyword-based classifiers</head><p>Part of the research in this field has focused on developing lists of insults to help automatic detection. In this sense, lexicons and dictionaries have been developed to observe whether the presence of these terms determines the violence of a message <ref type="bibr" target="#b12">[13]</ref>. Although such lists have aided detection, they fall short as the sole tool for determining violence: violent language is constantly evolving and varies from place to place, and terms that are insults in some geographic areas are not in others <ref type="bibr" target="#b13">[14]</ref>.</p><p>• Machine learning Most of the work related to the detection of violent messages addresses this problem using classical machine learning (ML) algorithms. Works such as Xu et al. <ref type="bibr" target="#b14">[15]</ref> and Dadvar et al. <ref type="bibr" target="#b15">[16]</ref> used support vector machines (SVM) in their research and obtained satisfactory results, proving very effective with large training samples. SVM is not the only classical algorithm used in this field; works such as <ref type="bibr" target="#b16">[17]</ref> used other algorithms, finding that logistic regression offered the best performance, followed by Naive Bayes and SVM.</p><p>Most ML-based classifiers use traditional text representations such as bag-of-words (BOW), n-grams and term frequency (TF), among others. Burnap and Williams <ref type="bibr" target="#b17">[18]</ref> use all of the above techniques, comparing the results obtained individually by each classifier with those of an ensemble that integrates them all, demonstrating greater accuracy for the latter. Sentiment analysis is another widely used tool in this field. With it we can extract the polarity of a message and use this indicator, along with other features, to determine more accurately whether we are facing a violent or non-violent message <ref type="bibr" target="#b18">[19]</ref>.</p><p>Corpus development plays an important role in offensive-language research when ML techniques are applied. In recent years we have observed a large volume of work by NLP researchers to generate these resources <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b21">22,</ref><ref type="bibr" target="#b22">23,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25]</ref>. These authors created English-language resources, with SOLID <ref type="bibr" target="#b24">[25]</ref> containing more than nine million English-language tweets labeled in a semi-supervised manner.</p><p>On the other hand, HurtLex <ref type="bibr" target="#b25">[26]</ref> is a multilingual lexicon of hate speech spanning multiple languages, and Hatebase<ref type="foot" target="#foot_0">1</ref> is a collaborative, also multilingual, repository of hate speech. The main drawback of these resources is their paucity of Spanish terms, and those that are present have been compiled using semi-automatic translation from another language, neglecting the importance of cultural and linguistic factors in each country. Thus, despite the fact that Spanish is one of the most widely spoken languages in the world, we found a shortage of resources in this language for the detection task. There are resources in Spanish for offensive words, such as Plaza-Del-Arco et al. <ref type="bibr" target="#b10">[11]</ref> for misogynistic and xenophobic terms, and Share <ref type="bibr" target="#b26">[27]</ref>, which labels terms as offensive or not. After studying the literature, we consider it necessary to elaborate another corpus that collects more of the characteristics present in violent messages, which can help with the explainability and detail of the detection.</p></div>
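As a minimal illustration of the keyword-based approach described above, the following sketch flags a message when any lexicon term appears in it. The lexicon terms here are hypothetical placeholders, not drawn from HurtLex, Hatebase or any cited resource; a real system would need far richer, region-aware lexicons, which is precisely the limitation noted above.

```python
# Minimal keyword-based violence detector (hypothetical lexicon).
import re

VIOLENT_LEXICON = {"idiota", "imbecil", "odio"}  # placeholder Spanish terms

def tokenize(text: str) -> list[str]:
    # Lowercase and keep only alphabetic runs (including Spanish accents).
    return re.findall(r"[a-záéíóúüñ]+", text.lower())

def is_violent(text: str) -> bool:
    """Flag a message if any lexicon term appears in it."""
    return any(tok in VIOLENT_LEXICON for tok in tokenize(text))

print(is_violent("Te odio, idiota"))     # lexicon hit
print(is_violent("Buenos días a todos")) # no hit
```

The obvious weakness is visible immediately: any insult absent from the list, or any implicit or ironic message, goes undetected.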
<div xmlns="http://www.tei-c.org/ns/1.0"><head>• Deep learning</head><p>Within AI there are other, more complex techniques that have also been applied to this task. We refer to deep learning (DL), as in the research by Arcila-Calderón et al. [17], in which, after using both ML tools and neural networks, the latter improved the evaluation metrics compared with the models generated with traditional ML algorithms. To the same end, Badjatiya et al. <ref type="bibr" target="#b27">[28]</ref> use DL models to train different word embeddings, validating that these representations obtain better results than traditional ones such as term frequency-inverse document frequency (TF-IDF) or BOW. Models based on the transformer architecture, such as BERT, RoBERTa and ALBERT, show the best state-of-the-art results for the detection of violent messages in recognized shared tasks such as OffensEval or HatEval <ref type="bibr" target="#b28">[29]</ref>. In Sarkar et al. <ref type="bibr" target="#b28">[29]</ref>, BERT is fine-tuned using SOLID, the largest English offensive language identification corpus, improving the results obtained with BERT on the tasks mentioned above. In Song et al. <ref type="bibr" target="#b29">[30]</ref>, an ensemble of classifiers based on RoBERTa and BERT obtains the best results in the shared task "SemEval-2021 Task 7: HaHackathon, Detecting and Rating Humor and Offense"<ref type="foot" target="#foot_1">2</ref>, which includes a subtask of detecting offensive messages. That work fine-tunes these models as individual classifiers and combines them into a stacking-based ensemble.</p></div>
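The TF-IDF representation contrasted with embeddings above can be sketched in a few lines. This is the generic textbook computation over a toy corpus, not the setup of any cited work, and the toy documents are invented for illustration.

```python
# Minimal TF-IDF over a toy tokenized corpus. Real pipelines use library
# implementations (e.g. scikit-learn's TfidfVectorizer), which add smoothing.
import math
from collections import Counter

docs = [
    ["mensaje", "violento", "odio"],
    ["mensaje", "amable"],
    ["odio", "insulto"],
]

def tf_idf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    tf = Counter(doc)[term] / len(doc)        # term frequency in this document
    df = sum(1 for d in corpus if term in d)  # number of documents with the term
    idf = math.log(len(corpus) / df)          # inverse document frequency
    return tf * idf

# "mensaje" appears in 2 of 3 documents, so its weight is low;
# "violento" appears in only 1, so it is more discriminative.
print(tf_idf("mensaje", docs[0], docs))
print(tf_idf("violento", docs[0], docs))
```

The contrast with embeddings is that this weighting treats each surface form independently, which is one reason dense representations outperform it on implicit violence.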
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Main Hypothesis and Objectives</head><p>After reviewing the literature, we identified room to improve the detection of violent messages in social networks, focusing not only on the language itself but also on determining the existence of patterns of behavior and user profiles. This problem is studied taking as references data such as the user role, type of violence, message form (implicit/explicit), isolated message vs. thread of messages, number of followers and activity on the social network (number of hashtags or mentions of the user).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Objectives</head><p>In our research the following objectives will be pursued:</p><p>1. Expand the VIL corpus <ref type="bibr" target="#b30">[31]</ref>.</p><p>2. Define behavioral patterns for violence identification.</p><p>3. Achieve a system able to give a user a violence score, both to know whether the user is violent and whether they are receiving violence.</p><p>4. Identify the phases of the violence process (beginning, in progress, completed).</p><p>5. Create a program for the automatic detection of violence in social networks.</p></div>
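One way objective 3 (a per-user violence score covering both emitted and received violence) could be realized is as the fraction of a user's messages flagged violent in each direction. The data model and the upstream message-level classifier below are assumptions made for illustration, not the system's actual design.

```python
# Hypothetical sketch of a per-user violence score: the fraction of the
# user's messages flagged violent, computed separately for messages the
# user emits and messages the user receives.
from dataclasses import dataclass

@dataclass
class Message:
    author: str
    addressee: str
    violent: bool  # assumed output of an upstream message-level classifier

def violence_scores(user: str, messages: list[Message]) -> dict[str, float]:
    def ratio(subset: list[Message]) -> float:
        return sum(m.violent for m in subset) / len(subset) if subset else 0.0
    sent = [m for m in messages if m.author == user]
    received = [m for m in messages if m.addressee == user]
    return {"emitted": ratio(sent), "received": ratio(received)}

msgs = [
    Message("ana", "luis", True),
    Message("ana", "luis", False),
    Message("luis", "ana", True),
]
print(violence_scores("ana", msgs))  # {'emitted': 0.5, 'received': 1.0}
```

Separating emitted and received scores matches the objective's two sides: detecting violent users and detecting users who are targets of violence.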
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Hypothesis</head><p>For the study of violence detection through messages on social networks with the help of NLP, the following hypotheses are pursued:</p><p>1. Is it possible to detect violence on social networks through language analysis? Certain words and phrases could serve as indicators that increase the likelihood of a message being considered violent. Therefore, it is relevant to investigate how language can be used offensively in online communication, using linguistic cues as potential markers of violence in messages.</p><p>2. Do patterns exist that aid in the detection of violence on social networks? The study of the characteristics surrounding violence on social networks, including user behavior and use of language, could provide essential insights that bring us closer to identifying effective techniques to address this problem.</p><p>3. Is it possible to detect violent messages using NLP tools? We believe that ML models trained with labeled data significantly enhance the ability to detect violent messages compared to rule-based approaches alone.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head><p>We have conducted three experiments during this thesis: the Fiero chatbot, an annotation guide and a corpus of violent messages collected from the social network Twitter.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Fiero</head><p>Fiero is a virtual assistant that maintains a conversation with the user through Telegram, encouraging them to express expletives. This kind of application is widely known among the population as a digital tool. The collected dialogue will be used to generate linguistic resources that can be used in automatic artificial intelligence systems to combat social problems such as cyberbullying or hate speech <ref type="bibr" target="#b31">[32]</ref>. Figure <ref type="figure" target="#fig_0">1</ref> shows what a conversation with the chatbot looks like when collecting insults from users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Annotation Guide</head><p>Pursuing the goal of creating a resource to aid in the detection of violent messages, we decided to generate a fine-grained annotation guide for messages with a certain degree of semantic complexity, which not only marks whether a message is violent but also certain important elements of its content. In this annotation guide, shown in Figure <ref type="figure" target="#fig_1">2</ref>, we collect information about: insults, violent vs. non-violent, level of violence, role and type of violence.</p></div>
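The labels listed for the annotation guide can be represented as a simple record per message. The field names mirror the categories named above, but the value sets (the level scale, the role names) are illustrative assumptions; the authoritative definitions are in the guide itself.

```python
# Sketch of one annotated message following the guide's categories:
# insults, violent vs. non-violent, level of violence, role, type of violence.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    text: str
    violent: bool                    # violent vs. non-violent
    insults: list[str] = field(default_factory=list)
    level: int = 0                   # level of violence (assumed numeric scale)
    role: str = ""                   # e.g. "aggressor" / "victim" (assumed names)
    violence_type: str = ""          # type of violence

ann = Annotation(text="ejemplo de tuit", violent=True,
                 insults=["idiota"], level=2, role="aggressor",
                 violence_type="verbal")
print(ann.violent, ann.level)
```

Capturing these fields per message is what later allows detail beyond binary detection, as the corpus section argues.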
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">VIL Corpus</head><p>Having studied the literature on the task and the importance of applying NLP, ML and DL techniques, and because these techniques are fed by training data, we concluded the need to create a resource in Spanish for the effective detection of violent messages, with a level of detail that goes beyond simple binary detection, marking features detailed in our annotation guide such as the degree, role or type of violence, since we consider that this level of detail can support the future explainability of the decisions taken <ref type="bibr" target="#b30">[31]</ref>.</p><p>Figure 3 shows an example of how the labeled tweets look in our corpus.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Research issues to discuss</head><p>Following our research and the creation of the annotation guide and corpus presented in the previous sections, we have questions that could be addressed in the future:</p><p>1. Is it possible to detect violent messages through machine learning in order to curb the use of violence in social networks? Due to the ambiguity and personal subjectivity in how users understand violence, we may encounter difficulties in reaching agreement on what constitutes a violent message and what does not.</p><p>2. Can different phases of violence be defined, in order to know what level of severity or what phase of violence we are dealing with? We could determine the severity level from the message content and take appropriate actions with the violent user, imposing different consequences based on severity.</p><p>3. Could an automatic alert be created to warn us that violence is being generated? A prompt response in a case of violence would benefit the virtual community.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Acknowledgments</head></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Fiero user interface.</figDesc><graphic coords="5,229.72,323.98,133.33,229.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Annotation guide</figDesc><graphic coords="6,150.55,267.80,291.68,218.76" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Examples of tweets annotated in the VIL corpus.</figDesc><graphic coords="7,108.88,188.54,375.02,173.37" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://hatebase.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://bit.ly/3J8uHOX</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This research work is part of the Spanish research project Cleartext «TED2021-130707B-I00». This work is funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. I would also like to thank my thesis directors Patricio Martinez-Barco and Estela Saquete Boro for their help and collaboration with the Coolang project (R&amp;D&amp;i project «PID2021-122263OB-C22», funded by MCIN/AEI/10.13039/501100011033/ and "FEDER Una manera de hacer Europa").</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">Hootsuite</forename><surname>Wearesocial</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Digital</forename><surname>Report España</surname></persName>
		</author>
		<ptr target="https://encr.pw/8avSe" />
		<imprint>
			<date type="published" when="2022">2022. 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Guía rápida para la prevención del acoso por medio de las nuevas tecnologías</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Flores</forename><surname>Fernandez</surname></persName>
		</author>
		<ptr target="https://www.pantallasamigas.net/ciberbullying-guia-rapida" />
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Alonso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">J</forename><surname>Vázquez</surname></persName>
		</author>
		<title level="m">Sobre la libertad de expresión y el discurso del odio: Textos críticos, Athenaica ediciones universitarias</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Killing me softly: Creative and cognitive aspects of implicitness in abusive language online</title>
		<author>
			<persName><forename type="first">S</forename><surname>Frenda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Natural Language Engineering</title>
		<imprint>
			<biblScope unit="page" from="1" to="22" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The unbearable hurtfulness of sarcasm</title>
		<author>
			<persName><forename type="first">S</forename><surname>Frenda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Cignarella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">193</biblScope>
			<biblScope unit="page">116398</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Masking and bert-based models for stereotype identication</title>
		<author>
			<persName><forename type="first">J</forename><surname>Sánchez-Junquera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Montes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procesamiento del Lenguaje Natural</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="page" from="83" to="94" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">After the &apos;apicalypse&apos;: Social media platforms and their fight against critical scholarly research</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bruns</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information, Communication &amp; Society</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="1544" to="1566" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The age of twitter: Donald j. trump and the politics of debasement</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">L</forename><surname>Ott</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Critical studies in media communication</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="59" to="68" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">R</forename><surname>Mcmenamin</surname></persName>
		</author>
		<title level="m">Introducción a la lingüística forense: un libro de curso</title>
				<meeting><address><addrLine>Fresno</addrLine></address></meeting>
		<imprint>
			<publisher>Press at</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
		<respStmt>
			<orgName>California State University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Análisis lingüístico del discurso de odio en redes sociales, VISUAL REVIEW</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Salado</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Visual Culture Review/Revista Internacional de Cultura Visual</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1" to="11" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Detecting misogyny and xenophobia in spanish tweets using language technologies</title>
		<author>
			<persName><forename type="first">F.-M</forename><surname>Plaza-Del-Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Molina-González</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ureña-López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Martín-Valdivia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Internet Technology (TOIT)</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="1" to="19" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A lexicon-based approach for hate speech detection</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">D</forename><surname>Gitari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zuping</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Damien</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Long</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Multimedia and Ubiquitous Engineering</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="215" to="230" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Automatic identification of personal insults on social news sites</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Sood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Churchill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Antin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">63</biblScope>
			<biblScope unit="page" from="270" to="285" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Abusive language detection in online user content</title>
		<author>
			<persName><forename type="first">C</forename><surname>Nobata</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tetreault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Thomas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mehdad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th international conference on world wide web</title>
				<meeting>the 25th international conference on world wide web</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="145" to="153" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Learning from bullying traces in social media</title>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-S</forename><surname>Jun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bellmore</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="656" to="666" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Improving cyberbullying detection with user context</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dadvar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Trieschnigg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ordelman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">D</forename><surname>Jong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Information Retrieval</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="693" to="696" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Using shallow and deep learning to automatically detect hate motivated by gender and sexual orientation on Twitter in Spanish</title>
		<author>
			<persName><forename type="first">C</forename><surname>Arcila-Calderón</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Amores</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sánchez-Holgado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Blanco-Herrero</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Multimodal technologies and interaction</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page">63</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m">Hate speech, machine classification and statistical modelling of information flows on Twitter: Interpretation and communication for policy decision making</title>
		<author>
			<persName><forename type="first">P</forename><surname>Burnap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Williams</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Hate speech classification in social media using emotional analysis</title>
		<author>
			<persName><forename type="first">R</forename><surname>Martins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gomes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Almeida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Novais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Henriques</surname></persName>
		</author>
		<idno type="DOI">10.1109/BRACIS.2018.00019</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Brazilian Conference on Intelligent Systems (BRACIS)</title>
				<meeting>2018 Brazilian Conference on Intelligent Systems (BRACIS)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="61" to="66" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Inducing a lexicon of abusive words - a feature-based approach</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ruppenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Greenberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1046" to="1056" />
		</imprint>
	</monogr>
	<note>Long Papers</note>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Qian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Elsherief</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Belding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1904.02418</idno>
		<title level="m">Learning to decipher hate symbols</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">The effect of extremist violence on hateful speech online</title>
		<author>
			<persName><forename type="first">A</forename><surname>Olteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Castillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Boy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Varshney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the international AAAI conference on web and social media</title>
				<meeting>the international AAAI conference on web and social media</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">12</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">A survey on automatic detection of hate speech in text</title>
		<author>
			<persName><forename type="first">P</forename><surname>Fortuna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nunes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Resources and benchmark corpora for hate speech detection: a systematic review</title>
		<author>
			<persName><forename type="first">F</forename><surname>Poletto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sanguinetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Patti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Language Resources and Evaluation</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page" from="477" to="523" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Rosenthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Atanasova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karadzhov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2004.14454</idno>
		<title level="m">A large-scale semi-supervised dataset for offensive language identification</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">HurtLex: A multilingual lexicon of words to hurt</title>
		<author>
			<persName><forename type="first">E</forename><surname>Bassignana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Patti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th Italian Conference on Computational Linguistics, CLiC-it 2018</title>
				<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">2253</biblScope>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">SHARE: A lexicon of harmful expressions by Spanish speakers</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Plaza-Del Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B P</forename><surname>Portillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">L</forename><surname>Úbeda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Gil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Martín-Valdivia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth Language Resources and Evaluation Conference</title>
				<meeting>the Thirteenth Language Resources and Evaluation Conference</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1307" to="1316" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Deep learning for hate speech detection in tweets</title>
		<author>
			<persName><forename type="first">P</forename><surname>Badjatiya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Varma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th international conference on World Wide Web companion</title>
				<meeting>the 26th international conference on World Wide Web companion</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="759" to="760" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Sarkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ororbia</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2109.05074</idno>
		<title level="m">fBERT: A neural transformer for identifying offensive content</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">DeepBlueAI at SemEval-2021 Task 7: Detecting and rating humor and offense with stacking diverse language model-based methods</title>
		<author>
			<persName><forename type="first">B</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Luo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)</title>
				<meeting>the 15th International Workshop on Semantic Evaluation (SemEval-2021)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1130" to="1134" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Violencia identificada en el lenguaje (VIL): Creación de recurso para mensajes violentos</title>
		<author>
			<persName><forename type="first">B</forename><surname>Botella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sepúlveda-Torres</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">M</forename><surname>Barco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Saquete</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procesamiento del Lenguaje Natural</title>
		<imprint>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="page" from="187" to="198" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">FIERO: Asistente virtual para la captación de insultos</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M P</forename><surname>Del Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B P</forename><surname>Portillo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gutiérrez</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/Vol-2968/paper8.pdf" />
	</analytic>
	<monogr>
		<title level="s">CEUR Workshop Proceedings</title>
		<imprint>
			<biblScope unit="volume">2968</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
