=Paper=
{{Paper
|id=Vol-2936/paper-155
|storemode=property
|title=Profiling Spreaders of Hate Speech with N-grams and RoBERTa
|pdfUrl=https://ceur-ws.org/Vol-2936/paper-155.pdf
|volume=Vol-2936
|authors=Christopher Bagdon
|dblpUrl=https://dblp.org/rec/conf/clef/Bagdon21
}}
==Profiling Spreaders of Hate Speech with N-grams and RoBERTa==
Notebook for PAN at CLEF 2021

Christopher Bagdon
Eberhard Karls Universität Tübingen, Fachschaft Sprachwissenschaft, Wilhelmstr. 19, 72074 Tübingen, Germany

Abstract

This paper outlines our approach to the 2021 CLEF Conference shared task, Profiling Hate Speech Spreaders on Twitter. Our approach uses the probability outputs of a logistic regression classifier and a RoBERTa-based classifier as features for a linear support vector classifier. In a final cross-validation analysis the Spanish meta-classifier performed better than any other single classifier. For English the meta-classifier performed slightly worse than the RoBERTa classifier. On the test set our system performed moderately well in comparison to other submissions, with 81% accuracy for Spanish and 67% for English. Overall our system placed 15th of 66 entries.

Keywords: N-grams, RoBERTa, SVM, TF-IDF, Transformer model

1. Introduction

Social media has taken a firm place in people's lives around the world. The number of people who use social media continues to grow year over year. While this has many possible benefits, such as allowing people to express themselves and connect with others, it also comes with drawbacks, such as the proliferation of hate speech. The combination of anonymity, echo chambers, and ease of access helps to circulate hate speech on different platforms [1]. These platforms need to be able to automatically detect hate speech and profile its spreaders.

This paper details our submission to the 2021 PAN Author Profiling shared task, Profiling Hate Speech Spreaders on Twitter. The task is to classify Twitter users as spreaders of hate speech or not, given a sample of 200 tweets per user. In previous Author Profiling shared tasks, Support Vector Machines (SVMs) and n-grams have proven very successful across tasks, while transformer-based approaches have only seen moderate success [2], despite showing strong results in direct fake news [3] and hate speech detection [4]. Our approach combines the results from an n-gram-based logistic regression classifier with a transformer model based on RoBERTa via an SVM meta-classifier.

The paper is structured as follows: Section 2 reviews related research. Section 3 describes our approach, covering the preprocessing, the logistic regression and RoBERTa models, and finally the meta-classifier. Section 4 covers the results from training and from the test set, and shares our conclusions.

2. Related Work

Hate speech detection has been a popular topic among natural language processing researchers in recent years. Academic events such as IberEval 2018 [5] and SemEval-2019 [6], to name a couple, show how strong interest in the topic is. The tasks from these events provide insight on successful approaches and common challenges when detecting hate speech.
In the IberEval 2018 task, Automatic Misogyny Identification, the most successful approach used an SVM with combinations of stylistic, structural and lexical features, while other strong approaches used SVMs with n-gram features. Deep learning approaches were not as successful [5]. The majority of submissions to SemEval-2019's task, Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter, used some form of deep learning model, including recurrent neural networks (RNNs) and large language models. Despite this, the highest-performing systems across both subtasks for the English and Spanish datasets employed more traditional machine learning models, mostly SVMs [6]. Interestingly, the top three systems labeled the same 14.6% of texts incorrectly on the Spanish dataset and 19.1% on the English dataset [6]. This could be caused by a common difficulty in identifying hate speech: slurs or words commonly associated with hate speech being used in a humorous context [6], or targeted communities reclaiming the words used to target them [7]. Machine learning systems often lack the context to determine whether these words are being used in a manner that constitutes hate speech [8]. There are numerous other pitfalls arising from this lack of contextual understanding, such as authors quoting historical texts or referring to specific instances of hate speech [9]. While a tweet or text might contain hate speech, that does not guarantee that the text as a whole is hate speech.

In the 2020 PAN shared task, Profiling Fake News Spreaders on Twitter, a variety of methods was used for classification, preprocessing, and feature selection [2]. The best-performing approaches used word and/or character n-grams with SVM and/or logistic regression classifiers, reaching accuracy scores of up to 0.75 and 0.82 on the English and Spanish datasets respectively. These approaches were also effective in the 2019 task, with results as high as 0.95 accuracy on bot detection and 0.82 accuracy on gender profiling [10]. The top three approaches from 2020 also proved effective on this year's task, even without any tuning, with only small losses in performance [11].

Recently, NLP tasks have seen success using transformer-based large language models and transfer learning [12][13]. Researchers have successfully used models such as Google's BERT in classification tasks such as hate speech detection [4], fake news detection [3], and authorship attribution [14]. Over the last couple of years variations of these models have been made available, such as RoBERTa [13] and DistilBERT [15], which have shown improvements in both performance and accessibility. As researchers continue to pour resources into building larger language models, their ability to perform downstream tasks via transfer learning continues to grow [16].

3. Methodology

The system is composed of three major parts. After preprocessing, the data is sent separately to a logistic regression classifier and to a RoBERTa classifier. The probability outputs from each are then used as features for the final meta-classifier, a linear SVM.

Figure 1: System architecture

3.1. Preprocessing

First, each author's tweets are concatenated into a single string per author, as this was found to be more effective than classifying each tweet separately in previous work [17]. The text is then lowercased and repeated characters are removed. For data going to the RoBERTa models, emojis are replaced with #EMOJI#, following the dataset's format for replacement tokens.
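A minimal sketch of this preprocessing step is shown below. The helper name, the character-collapsing rule, and the emoji pattern are illustrative assumptions, not the exact implementation used for the submission.

```python
import re

# Hypothetical helper sketching the preprocessing described in Section 3.1.
def preprocess_author(tweets, for_roberta=False):
    # Concatenate all of an author's tweets into one document and lowercase it.
    text = " ".join(tweets).lower()
    # Collapse runs of three or more identical characters ("sooo" -> "soo");
    # the exact collapsing rule is an assumption.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    if for_roberta:
        # Replace emoji with the dataset-style placeholder token; this simple
        # range-based pattern only approximates full emoji coverage.
        text = re.sub(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", "#EMOJI#", text)
    return text

# Example: an author's tweets become a single normalized string.
doc = preprocess_author(["Sooo happy today 😀", "ANOTHER tweet!!!"], for_roberta=True)
```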
3.2. SVM and Logistic Regression

To serve as a baseline, a linear SVM is used, based on the success found in previous shared tasks [11][18]. The classifier takes two feature sets: a Term Frequency-Inverse Document Frequency (TF-IDF) sparse matrix of character n-grams and a TF-IDF matrix of word n-grams. The word TF-IDF uses an n-gram range of (1, 2), with document frequency limited to a minimum of 0.05 and a maximum of 0.85. The character TF-IDF uses an n-gram range of (1, 6) with a minimum of 0.001 and no maximum. The model was optimized with a repeated stratified k-fold grid search, using 10 splits repeated 3 times. In order to provide probabilities rather than predictions for the meta-classifier, a logistic regression classifier is used in place of the SVM. It was optimized with the same methods as the SVM. The hyperparameters for both can be seen in Table 1.

Table 1: Hyperparameters found via grid search

Model               | Language | C      | Tol      | Class Weight | Intercept Scaling | Loss
SVM (baseline)      | EN       | 22000  | 0.1      | Balanced     | 0.877             | Hinge
Logistic Regression | EN       | 100000 | 1e-05    | Balanced     | 0.1               | –
Meta-Classifier SVM | EN       | 0.015  | 0.5      | None         | 5                 | Hinge
SVM (baseline)      | ES       | 22000  | 0.1      | Balanced     | 0.01              | Hinge
Logistic Regression | ES       | 100000 | 1.53e-04 | Balanced     | 0.1               | –
Meta-Classifier SVM | ES       | 1      | 5        | Balanced     | 5                 | Squared Hinge
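A sketch of how these features and the probability-producing classifier could be assembled with scikit-learn. The pipeline layout and the grid-search parameter grid are illustrative assumptions; the n-gram ranges, document-frequency limits, and the English logistic regression settings follow the text above and Table 1.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold
from sklearn.pipeline import FeatureUnion, Pipeline

# Word and character TF-IDF features with the n-gram ranges and
# document-frequency limits described above.
features = FeatureUnion([
    ("word_tfidf", TfidfVectorizer(analyzer="word", ngram_range=(1, 2),
                                   min_df=0.05, max_df=0.85)),
    ("char_tfidf", TfidfVectorizer(analyzer="char", ngram_range=(1, 6),
                                   min_df=0.001)),
])

# Logistic regression replaces the baseline SVM so that class probabilities
# are available for the meta-classifier; settings follow the English row of
# Table 1 (the liblinear solver is assumed so intercept_scaling takes effect).
classifier = Pipeline([
    ("features", features),
    ("logreg", LogisticRegression(C=100000, tol=1e-05, class_weight="balanced",
                                  intercept_scaling=0.1, solver="liblinear")),
])

# Hyperparameters were tuned with a repeated stratified k-fold grid search
# (10 splits, 3 repeats); the parameter grid below is only a placeholder.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3)
search = GridSearchCV(classifier, {"logreg__C": [1e3, 1e4, 1e5]},
                      cv=cv, scoring="accuracy")
# search.fit(author_docs, labels)  # author_docs: one preprocessed string per author
```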
3.3. RoBERTa

The RoBERTa models are built using the Simple Transformers library (https://simpletransformers.ai). The English model is built with the RoBERTa-base pretrained model and the Spanish model with the SpanBERTa-base-cased pretrained model (https://skimai.com/roberta-language-model-for-spanish). The models consist of 12 hidden layers, 12 attention heads, and a single dense classification layer, and use the Adam optimizer. Hyperparameters were found using the Sweeps function of Wandb (https://docs.wandb.ai/guides/sweeps) and can be seen in Table 2. Each model was trained on an 80/20 split of its respective dataset. The maximum token length is set to 128 due to a lack of computational power, so a sliding window with a 20% overlap is used to break up the long concatenated strings. The final classification output is a list containing a probability for each class for each window, per author. Each list is reduced to a single probability per class by taking the median value over all windows. Summing and averaging the values were also tested, but the median showed marginally better results.

Table 2: Hyperparameters for the RoBERTa and SpanBERTa models

Model     | Language | LR       | Epochs | Batch Size | Eval Batch Size | Weight Decay | Special Tokens
RoBERTa   | EN       | 2.84e-05 | 3      | 8          | 4               | 0.1          | [#EMOJI#, #HASHTAG#, #USER#, #URL#]
SpanBERTa | ES       | 2.86e-05 | 1      | 8          | 4               | 0.1          | [#EMOJI#, #HASHTAG#, #USER#, #URL#]

3.4. Meta-Classifier

A linear SVM is used as the meta-classifier. Four features are used as input: one probability per class from the RoBERTa classifier and one per class from the logistic regression classifier. The meta-classifier was trained using the same 80/20 training splits as the RoBERTa models. Hyperparameter optimization was done using grid searches, and the chosen parameters can be found in Table 1.
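The window aggregation and stacking steps could look roughly like the sketch below. The helper functions, the per-window softmax, and the names all_window_logits, all_logreg_probs, and labels are assumptions; the median aggregation, the four-feature input, and the SVM settings follow Sections 3.3 and 3.4 and Table 1.

```python
import numpy as np
from scipy.special import softmax
from sklearn.svm import LinearSVC

def author_probabilities(window_logits):
    # window_logits: array of shape (n_windows, 2) with the transformer's raw
    # outputs for one author's sliding windows. Applying a softmax per window
    # is an assumption about how scores become probabilities; the median over
    # windows is the aggregation described in Section 3.3.
    window_probs = softmax(np.asarray(window_logits, dtype=float), axis=1)
    return np.median(window_probs, axis=0)  # one probability per class

def meta_features(roberta_window_logits, logreg_probs):
    # Four meta-classifier features per author: two class probabilities from
    # the RoBERTa model and two from the logistic regression classifier.
    return np.concatenate([author_probabilities(roberta_window_logits),
                           np.asarray(logreg_probs, dtype=float)])

# Linear SVM meta-classifier with the English settings from Table 1.
meta_clf = LinearSVC(C=0.015, tol=0.5, intercept_scaling=5, loss="hinge")
# X_meta = np.vstack([meta_features(w, p)
#                     for w, p in zip(all_window_logits, all_logreg_probs)])
# meta_clf.fit(X_meta, labels)
```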
4. Results

To analyze the effectiveness of the system, each component was put through a 10-fold cross-validation using only the training set. The Spanish models performed far better than their English counterparts. The logistic regression and SpanBERTa models each performed better than the baseline SVM, and the SVM meta-classifier outperformed all of them. Unfortunately, the English models did not fare as well. Logistic regression saw a slight loss compared to the baseline. The RoBERTa model was the strongest, scoring a point higher than the meta-classifier, but failed to reach 0.70 accuracy.

Table 3: Cross-validation results

Model               | Language | Accuracy
SVM (baseline)      | EN       | 0.66
Logistic Regression | EN       | 0.640
RoBERTa             | EN       | 0.695
Meta-Classifier     | EN       | 0.682
SVM (baseline)      | ES       | 0.796
Logistic Regression | ES       | 0.825
SpanBERTa           | ES       | 0.817
Meta-Classifier     | ES       | 0.845

On the test set the system held up well compared to the training results. The meta-classifier only lost 1% accuracy on the English dataset and still outperformed the SVM and logistic regression components. For Spanish the meta-classifier lost 3% but maintained a better score than the SVM and roughly the same score as the SpanBERTa model.

Table 4: Final test set results

Model           | Language | Accuracy
Meta-Classifier | EN       | 67%
Meta-Classifier | ES       | 81%

Our system ranked 15th of 66 submissions. It did especially well on the Spanish dataset, coming in just 4% below the top-ranked submission. English was not far behind, scoring 7% under the leading system. Overall, it seems there could be a benefit to combining the results of a transformer-based model with simpler models such as logistic regression. In the future it will be interesting to try this again with different datasets and different transformer models, such as OpenAI's enormous GPT-3.

References

[1] M. Mondal, L. A. Silva, F. Benevenuto, A measurement study of hate speech in social media, in: Proceedings of the 28th ACM Conference on Hypertext and Social Media, HT '17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 85–94. URL: https://doi.org/10.1145/3078714.3078723. doi:10.1145/3078714.3078723.
[2] F. Rangel, A. Giachanou, B. Ghanem, P. Rosso, Overview of the 8th Author Profiling Task at PAN 2020: Profiling fake news spreaders on Twitter, in: CLEF, 2020.
[3] H. Jwa, D. Oh, K. Park, J. M. Kang, H. Lim, exBAKE: Automatic fake news detection model based on Bidirectional Encoder Representations from Transformers (BERT), Applied Sciences 9 (2019). URL: https://www.mdpi.com/2076-3417/9/19/4062. doi:10.3390/app9194062.
[4] M. Mozafari, R. Farahbakhsh, N. Crespi, A BERT-based transfer learning approach for hate speech detection in online social media, CoRR abs/1910.12574 (2019). URL: http://arxiv.org/abs/1910.12574. arXiv:1910.12574.
[5] E. Fersini, P. Rosso, M. Anzovino, Overview of the task on automatic misogyny identification at IberEval 2018, in: IberEval@SEPLN, 2018.
[6] V. Basile, C. Bosco, E. Fersini, D. Nozza, V. Patti, F. M. Rangel Pardo, P. Rosso, M. Sanguinetti, SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter, in: Proceedings of the 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics, Minneapolis, Minnesota, USA, 2019, pp. 54–63. URL: https://www.aclweb.org/anthology/S19-2007. doi:10.18653/v1/S19-2007.
[7] C. Bianchi, Slurs and appropriation: An echoic account, Journal of Pragmatics 66 (2014) 35–44. doi:10.1016/j.pragma.2014.02.009.
[8] T. Davidson, D. Warmsley, M. W. Macy, I. Weber, Automated hate speech detection and the problem of offensive language, CoRR abs/1703.04009 (2017). URL: http://arxiv.org/abs/1703.04009. arXiv:1703.04009.
[9] S. MacAvaney, H.-R. Yao, E. Yang, K. Russell, N. Goharian, O. Frieder, Hate speech detection: Challenges and solutions, PLOS ONE 14 (2019) 1–16. doi:10.1371/journal.pone.0221152.
[10] F. Rangel, P. Rosso, Overview of the 7th Author Profiling Task at PAN 2019: Bots and gender profiling, in: L. Cappellato, N. Ferro, D. Losada, H. Müller (Eds.), CLEF 2019 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2019. URL: http://ceur-ws.org/Vol-2380/.
[11] C. Bagdon, S. Grässel, Examining hate speech spreaders and fake news spreaders through PAN shared tasks (2021). URL: https://www.researchgate.net/publication/351881197_Examining_Hate_Speech_Spreaders_and_Fake_News_Spreaders_Through_PAN_Shared_Tasks?channel=doi&linkId=60ae7e01a6fdcc647ede8894&showFulltext=true. doi:10.13140/RG.2.2.12308.22404.
[12] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.
[13] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
[14] M. Fabien, E. Villatoro-Tello, P. Motlicek, S. Parida, BertAA: BERT fine-tuning for authorship attribution, in: Proceedings of the 17th International Conference on Natural Language Processing, 2020. URL: http://infoscience.epfl.ch/record/285045.
[15] V. Sanh, L. Debut, J. Chaumond, T. Wolf, DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter, CoRR abs/1910.01108 (2019). URL: http://arxiv.org/abs/1910.01108. arXiv:1910.01108.
[16] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, CoRR abs/2005.14165 (2020). URL: https://arxiv.org/abs/2005.14165. arXiv:2005.14165.
[17] A. Baruah, K. Das, F. Barbhuiya, K. Dey, Automatic detection of fake news spreaders using BERT—Notebook for PAN at CLEF 2020, in: L. Cappellato, C. Eickhoff, N. Ferro, A. Névéol (Eds.), CLEF 2020 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2020. URL: http://ceur-ws.org/Vol-2696/.
[18] J. Pizarro, Using n-grams to detect fake news spreaders on Twitter—Notebook for PAN at CLEF 2020, in: L. Cappellato, C. Eickhoff, N. Ferro, A. Névéol (Eds.), CLEF 2020 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2020. URL: http://ceur-ws.org/Vol-2696/.