=Paper=
{{Paper
|id=Vol-3159/T1-50
|storemode=property
|title=Feature Selection with Pretrained-BERT for Hate Speech and Offensive Content Identification in English and Hindi Languages
|pdfUrl=https://ceur-ws.org/Vol-3159/T1-50.pdf
|volume=Vol-3159
|authors=Surya Agustian,Reski Saputra,Aidil Fadhilah
|dblpUrl=https://dblp.org/rec/conf/fire/AgustianSF21
}}
==Feature Selection with Pretrained-BERT for Hate Speech and Offensive Content Identification in English and Hindi Languages==
Surya Agustian, Reski Saputra and Aidil Fadhilah

UIN Sultan Syarif Kasim, Jl. H.R. Soeberantas km 11.5 Panam, Pekanbaru, Riau, Indonesia

===Abstract===
The intensive use of social media has led people to use informal, spoken-style language in text posts when interacting with others on the internet. People often vent their annoyance without concern about using hate speech, profanity, and abusive language, even when it is meant to attack or oppress someone. HASOC 2021 is a shared task that aims to identify hate and abusive content in tweets. For this event, we proposed a transfer learning approach based on BERT (and FastText) to solve this classification problem. Our team, UINSUSKA, ranked 8th, 5th and 12th for English task 1A, English task 1B, and Hindi task 1A respectively. For Hindi task 1B, time constraints prevented us from developing experiments with BERT, and we ranked 18th with the FastText result.

Keywords: Hate speech, abusive content, profane words, BERT, FastText, transfer learning

===1. Introduction===
Differences of personal preference in political, religious, gender, social, cultural and economic backgrounds often become a source of contention on social media. Abusive and hateful expressions may be used to attack an interlocutor on social media such as Twitter, Facebook, YouTube comments and Instagram. Group bullying of a particular person can also occur, which is sometimes harmful to the person being attacked: he or she may become stressed and depressed, and some cases lead to suicide [1]. Hate speech, abusive language, profane words, and verbal violence that attack ethnicity, nation, religion, race, or gender are main factors that are very damaging to social life [2]. They lead to hostility and even severe bullying on social media [3].
Therefore, these harmful messages must be minimized, filtered and even blocked from social media posts. Detection of hate speech, profane words, and abusive language in social media has attracted the interest of many researchers around the world in recent years [2, 3, 4, 5, 6]. Various studies and shared tasks show significant progress in English and in other languages with similar language structures [7, 8, 9] in [10]. Some of the most promising detection methods are language models using word embeddings that can recognize word contexts, such as word2vec [11], GloVe [12], and FastText [13]. In recent years, language models previously trained on a very large corpus [14] have shown effective results for various NLP tasks, such as question answering, machine translation, automatic summarization, and text classification [15]. Several pre-trained language models are available, such as Universal Language Model Fine-tuning (ULMFiT) [16], Embeddings from Language Models (ELMo) [17], OpenAI Generative Pre-trained Transformer (GPT) [18], and Google BERT [15].

Forum for Information Retrieval Evaluation, December 13-17, 2021, India
surya.agustian@uin-suska.ac.id (S. Agustian); 11651101881@students.uin-suska.ac.id (R. Saputra); 11651103464@students.uin-suska.ac.id (A. Fadhilah)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

With transfer learning, a deep learning model can be reused and modified for different NLP tasks [19]. Transfer learning can produce good results without requiring training on a large corpus for the new task. It often works well with a small dataset, depending on the task at hand [20]. A number of studies have used pre-trained language models for classification tasks.
Among the models that have been studied, BERT and its variants have been reported to produce state-of-the-art performance [4, 10, 17, 18] and have been applied to various languages around the globe [20, 21, 22]. For this event, we developed a method that implements transfer learning with BERT [15] and uses the FastText language model [14] for classification/detection [23] of hate speech and offensive content (HASOC) in English and Hindi.

The next section of this paper describes the classification task in HASOC 2021 and the data provided by the organizers, followed by the method we developed to solve it. The fourth section discusses the results obtained and their analysis. The last section concludes the study, relating our results to those of other participants in HASOC 2021.

===2. HASOC Task Description===
HASOC 2021 offers two types of classification tasks. The first is the identification of hate, offensive, and profane content in English, Hindi and Marathi tweets. The second is the identification of hate and offensive content in tweet conversations in mixed language (English and Hindi). In this event, we focus only on task 1, which is further divided into two subtasks [24], as follows.

Subtask 1A: Identifying hate, offensive and profane content from the posts. Subtask 1A is to identify hate speech and offensive language in English, Hindi, and Marathi tweets. It is a coarse-grained binary classification that assigns tweets to one of two classes: Hate and Offensive (HOF) and Non-Hate and Offensive (NOT).

Subtask 1B: Discrimination between hate, profane and offensive posts. The goal of this subtask is a fine-grained identification in English and Hindi tweets. If a tweet is classified as HOF in subtask 1A, it is further classified into one of these three categories:
• (HATE) Hate speech: The posts under this class contain hate speech content.
• (OFFN) Offensive: The posts under this class contain offensive content.
• (PRFN) Profane: These posts contain profane words.

{| class="wikitable"
|+ Table 1. The label distribution of HASOC 2020 [4] and HASOC 2021 [24]
! Subtask !! Language (year) !! Label !! Train
|-
| 1A || English (2021) || NOT || 1342
|-
| 1A || English (2021) || HOF || 2501
|-
| 1A || English (2020) || NOT || 1852
|-
| 1A || English (2020) || HOF || 1856
|-
| 1A || Hindi (2021) || NOT || 3161
|-
| 1A || Hindi (2021) || HOF || 1433
|-
| 1A || Hindi (2020) || NOT || 2116
|-
| 1A || Hindi (2020) || HOF || 847
|-
| 1A || Marathi (2021) || NOT || 1205
|-
| 1A || Marathi (2021) || HOF || 669
|-
| 1B || English (2021) || HATE || 683
|-
| 1B || English (2021) || PRFN || 1196
|-
| 1B || English (2021) || OFFN || 622
|-
| 1B || English (2021) || NONE || 1342
|-
| 1B || English (2020) || HATE || 158
|-
| 1B || English (2020) || PRFN || 1377
|-
| 1B || English (2020) || OFFN || 321
|-
| 1B || English (2020) || NONE || 1852
|-
| 1B || Hindi (2021) || HATE || 566
|-
| 1B || Hindi (2021) || PRFN || 213
|-
| 1B || Hindi (2021) || OFFN || 654
|-
| 1B || Hindi (2021) || NONE || 3161
|-
| 1B || Hindi (2020) || HATE || 234
|-
| 1B || Hindi (2020) || PRFN || 148
|-
| 1B || Hindi (2020) || OFFN || 465
|-
| 1B || Hindi (2020) || NONE || 2116
|}

The data available for each subtask is shown in Table 1. In developing our system, we also considered the 2020 datasets and used them as training and validation data. We combined the 2020 and 2021 train sets for training and used the 2020 test set as validation data to select the best-performing model.

===3. System Method===
We developed two types of classification system based on word embeddings, namely BERT [15] and FastText [13]. We chose BERT because it has been widely reported to provide state-of-the-art results in various NLP tasks. Since we have no knowledge of Hindi, we also developed the FastText model, on the assumption that Hindi has some language structures that differ from English. For example, in German or Arabic many phrases (tokens) are constructed from several words joined together without a space as a separator.

In processing the tweet texts, we perform several stages of text preprocessing as follows:
1. Case folding: normalize all tweet texts into lower case
2. Mention handling: transform all mentions into the token “@USER”
3. Hyperlink removal: remove all hyperlinks in tweets
4. Emoticon conversion: transform some selected popular emoticons into their text descriptions (footnote 2)
5. Punctuation removal: remove all punctuation and special characters in texts
6. Number removal: remove all numbers in texts

====3.1. BERT-based method====
The architecture of the BERT-based method is shown in Figure 1 for binary classification (Task 1A) and in Figure 2 for multi-label classification (Task 1B), as adapted from [25]. For English, we use the pre-trained language model BERT-base-uncased [26] with a maximum input tweet length (N) of 150. For Hindi, we use the pre-trained RoBERTa-hindi-guj-san (footnote 3), which was trained on Wikipedia articles in Hindi, Sanskrit and Gujarati. Before becoming an input sequence to the BERT block shown in Figures 1 and 2, the text is preprocessed according to the experiment scenario. The output of BERT is then fed to a neural network whose number of input nodes corresponds to the BERT dimension, which is 768. A sigmoid activation function is applied to produce the output in binary classification. For Task 1B, the outputs of 4 neurons, each with a sigmoid activation function, are converted back into a single class with 4 label options, i.e., HATE, PRFN, OFFN, and NONE. The optimizer used for learning is Adam, with L1-norm regularization as the loss function. Since the combinations of text preprocessing steps act as the feature selection for the BERT input, the best model is the one with the highest classification accuracy on the validation dataset.

Footnote 2: https://unicode.org/emoji/charts/full-emoji-list.html#1f4aa
Footnote 3: https://huggingface.co/surajp/RoBERTa-hindi-guj-san

Figure 1. BERT-NN Architecture for Task 1A

Figure 2. BERT-NN Architecture for Task 1B

====3.2. FastText-based method====
An alternative model developed for this task is also based on deep learning, namely FastText [13]. It is combined with conventional machine learning methods: KNN, Logistic Regression and Random Forest. The classification process for tasks 1A and 1B is carried out in 2 phases: constructing the language model, and classification.
In phase 1, a separate training process generates 128-dimensional word embeddings from the sentences in the corpus. Each word must occur at least 3 times. FastText word embeddings for English and Hindi are trained with 1000 iterations and a window size of 4. The corpus used for this training merges the 2020 and 2021 HASOC train datasets. The rationale is that, given the larger corpus, the resulting word vectors should be better than those obtained from the 2021 dataset alone. For Hindi, we additionally apply stopword removal, using a stoplist collected from GitHub (footnote 4). Because the language model is trained under this condition, we also remove stopwords from input tweets during classification. Since we used Google Colab (footnote 5) for all computations, we did not use a pre-trained FastText model because of resource usage limits.

Phase 2 is the classification process, which is carried out by a conventional machine learning method that takes its input from the FastText language model. The vectorizer block generates a sentence embedding by calculating the resultant norm of the vectors of the tweet's words. For training the ML module, we use either the 2021 train set or the merged 2020 and 2021 train sets. For validation, we use the 2020 test dataset. The experimental diagram for the FastText-based method is shown in Figure 3.

Figure 3. FastText-based system architecture

====3.3. Experiment Setup====
To obtain the best-performing models for both methods and both tasks, we conducted experiments with the following scenarios:
1. Mention handling: retain all mentions or transform them into “@USER”
2. Emoticon conversion: transform emoticons into their text descriptions, or leave them to be removed by punctuation removal
3. Train set: 2021 only, or the merged 2020 and 2021 datasets
4. Stopword removal (for FastText in Hindi only)
5. Variation of machine learning methods in the FastText-based system
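
The preprocessing stages of Section 3 and the first three scenarios listed above can be sketched as a small function with toggles. This is only an illustrative sketch, not the authors' code: the emoticon map `EMOJI_TEXT`, the sentinel trick used to keep “@USER” through punctuation removal, and the helper names `preprocess`, `SCENARIOS` and `select_best` are all assumptions introduced here. Retained mentions lose their “@” during punctuation removal in this sketch, which is also an assumption.

```python
import itertools
import re
import unicodedata

# Hypothetical emoticon map: the paper converts "selected popular emoticons"
# but does not list them, so these two entries are placeholders.
EMOJI_TEXT = {"😂": "face with tears of joy", "💪": "flexed biceps"}

# Letters-only sentinel so that a masked mention survives punctuation removal.
_MENTION_SENTINEL = "mentionsentinel"


def _keep_letters(text):
    """Replace punctuation, symbols (incl. unconverted emoji) and digits by spaces.

    Unicode letters (L*) and combining marks (M*) are kept, so Devanagari
    vowel signs are not stripped from Hindi words.
    """
    text = text.replace("\ufe0f", "")  # drop the emoji variation selector
    kept = [ch if unicodedata.category(ch)[0] in "LM" else " " for ch in text]
    return " ".join("".join(kept).split())


def preprocess(text, mask_mentions=True, convert_emoji=True):
    text = text.lower()                                    # 1. case folding
    if mask_mentions:                                      # 2. mention handling
        text = re.sub(r"@\w+", f" {_MENTION_SENTINEL} ", text)
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)     # 3. hyperlink removal
    if convert_emoji:                                      # 4. emoticon conversion
        for emoji, description in EMOJI_TEXT.items():
            text = text.replace(emoji, f" {description} ")
    text = _keep_letters(text)                             # 5-6. punctuation and number removal
    return text.replace(_MENTION_SENTINEL, "@USER")


# Scenarios 1-3 as a grid: 2 x 2 x 2 = 8 combinations.
SCENARIOS = [
    {"mask_mentions": m, "convert_emoji": e, "train_set": t}
    for m, e, t in itertools.product((True, False), (True, False), ("2021", "2020+2021"))
]


def select_best(scenarios, validation_accuracy):
    """Return the scenario with the highest validation accuracy.

    `validation_accuracy` is a caller-supplied function (hypothetical here)
    that trains a model under one scenario and scores it on validation data.
    """
    return max(scenarios, key=validation_accuracy)
```

Treating each preprocessing choice as a keyword argument keeps the scenario grid and the text cleaning in one place, so the same function can serve both the BERT-based and the FastText-based pipelines.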
Footnote 4: https://github.com/stopwords-iso/stopwords-hi
Footnote 5: https://colab.research.google.com

===4. Results===
====4.1. Ranked Results====
Based on our submissions to the HASOC platform, the best BERT-based model generally performs better than the FastText-based method. For English tasks 1A and 1B and Hindi task 1A, we submitted the BERT-based classification results. For Hindi task 1B, we did not have enough time to complete the BERT training process, so we submitted only the best FastText model with a Logistic Regression classifier. For Marathi, we did not have time to run any experiments. The system performance and our team's rankings are shown in Table 2.

{| class="wikitable"
|+ Table 2. Results on the 2021 test dataset, compared to Rank #1
! Language !! Task !! System !! Macro F1 !! Macro Precision !! Rank !! Rank #1 Macro F1 !! Rank #1 Macro Precision
|-
| English || 1A || BERT+NN || 0.8024 || 0.8010 || 8 || 0.8305 || 0.8414
|-
| English || 1B || BERT+NN || 0.6417 || 0.6487 || 5 || 0.6657 || 0.6688
|-
| Hindi || 1A || BERT+NN || 0.7555 || 0.7784 || 12 || 0.7825 || 0.7862
|-
| Hindi || 1B || FastText+LR || 0.4257 || 0.4864 || 18 || 0.5603 || 0.5873
|}

From the results obtained, we suspect that the lower performance on the Hindi dataset is caused by the strong imbalance between the NOT (3161) and HOF (1433) labels. Moreover, using the entire dataset for training in Hindi task 1B makes the classification result worse, because the NONE label greatly outnumbers the HATE, PRFN, and OFFN labels, by about 5.6:1, 14.8:1, and 4.8:1 respectively. We expect that a balancing scheme should be applied to the training data before training, especially for Hindi task 1B. For English, there is also a large discrepancy between the numbers of NOT (1342) and HOF (2501) labels. This is the reverse of the situation in Hindi, where the NOT label is about twice as frequent as the HOF label.
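
The label ratios discussed above follow directly from the Table 1 counts, and the same counts can feed a simple balancing scheme. The sketch below is illustrative only: the paper does not say which balancing scheme the authors have in mind, so the inverse-frequency weighting shown here (total / (number of classes × class count)) is an assumption, and the function names are hypothetical.

```python
# Hindi subtask 1B (2021) train label counts, copied from Table 1.
HINDI_1B_2021 = {"NONE": 3161, "HATE": 566, "PRFN": 213, "OFFN": 654}


def imbalance_ratios(counts, majority="NONE"):
    """Ratio of the majority label count to each other label count."""
    return {
        label: counts[majority] / n
        for label, n in counts.items()
        if label != majority
    }


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count).

    Rare labels receive weights above 1 and frequent labels below 1, so each
    class contributes equally to a weighted loss. This is one possible
    balancing scheme, assumed here for illustration.
    """
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


if __name__ == "__main__":
    for label, ratio in imbalance_ratios(HINDI_1B_2021).items():
        print(f"NONE:{label} = {ratio:.1f}:1")
    print(balanced_class_weights(HINDI_1B_2021))
```

Weighting the loss leaves the training set untouched; resampling the minority labels would be an alternative way to reach a similar effect.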
Balancing should also improve the classification results for English task 1B, where the imbalance among the target labels is considerable: about 1:2 between the less frequent labels (HATE, OFFN) and the more frequent labels (PRFN, NONE).

In terms of language models, the pre-trained BERT provides a better text representation (as word vectors) than FastText. This is because its training uses a very large corpus and a larger embedding size (768), whereas FastText in this study is trained only on the HASOC dataset, with the dimension set to 128.

====4.2. Other Runs====
As each team is allowed 5 runs per subtask, we also submitted results based on Naïve Bayes for the English tasks, namely multinomial Naïve Bayes with a word count vectorizer. To find the best model, we experimented with variations in letter case (cased or uncased) and in stopwords and punctuation (kept or removed). We also considered token length, using words (space-separated tokens) of at least 2 characters. We did not explore Naïve Bayes for Hindi because we do not know whether Hindi words are space-separated as in English.

Table 3 shows the unranked runs in our team's submission compared to the closest ranks of other teams. These include BERT-based and FastText-based runs with certain “feature selection” scenarios. Beforehand, the best model on the validation data (test 2020) was explored for each method with its “feature variations”.

For English task 1A, we performed further “feature selection” on the BERT+NN v1 method (Run1, ranked 8) by transforming emoji into text descriptions and replacing mentioned users in tweets with “@USER”. The other settings of BERT+NN v1 remain unchanged: capital letters are lowercased, all hyperlinks and punctuation are removed, and only the 2021 train dataset is used for training. We observe that replacing emoticons with words can reduce the detection accuracy of the BERT-based method; in this work, the F1 score is only 0.7876 (Run4). FastText+KNN and Multinomial NB were submitted as Run2 and Run3, with F1 scores of 0.6634 and 0.7395 respectively.

{| class="wikitable"
|+ Table 3. Unranked runs, compared to the closest ranks
! Language !! Task !! System !! Macro F1 !! Upper Rank !! Upper Macro F1 !! Lower Rank !! Lower Macro F1
|-
| English || 1A || BERT+NN v2 || 0.7876 || 22 || 0.7894 || 23 || 0.7823
|-
| English || 1A || Multinomial NB || 0.7395 || 42 || 0.7413 || 43 || 0.7389
|-
| English || 1A || FastText+KNN || 0.6634 || 53 || 0.6813 || 54 || 0.5999
|-
| English || 1B || Multinomial NB v1 || 0.5378 || 30 || 0.5638 || 31 || 0.4969
|-
| English || 1B || Multinomial NB v2 || 0.5236 || 30 || 0.5638 || 31 || 0.4969
|-
| Hindi || 1A || FastText+LR || 0.6914 || 31 || 0.7181 || 32 || 0.6848
|-
| Hindi || 1A || FastText+RF || 0.6668 || 33 || 0.6762 || 34 || 0.6628
|-
| Hindi || 1A || FastText+KNN || 0.6435 || - || - || - || -
|-
| Hindi || 1B || FastText+LR v2 || 0.4237 || 18 || 0.4257 || 19 || 0.4077
|}

As FastText+KNN has a lower F1 score than Multinomial NB in task 1A, we left FastText unexplored for English task 1B. While the BERT+NN method for English task 1B was still training, we developed Multinomial NB with a one-versus-all scheme to solve the multiclass classification. The result using the merged 2020 and 2021 train datasets (v2) is lower than that using the 2021 train dataset only (v1), but not significantly.

For Hindi, we explored only word-embedding-based methods as the input features for the machine learning block. Our runs in Hindi task 1A show that the BERT-based embedding gives significantly higher results than FastText-based embeddings with several ML combinations (i.e., logistic regression, random forest and k-nearest neighbor). For FastText, we trained word embeddings only on the small tweet dataset. We are curious whether using pre-trained FastText for Hindi could yield results competitive with the BERT-based method. In our experiments, the combinations of FastText with LR, RF and KNN yield F1 scores of 0.6914, 0.6668, and 0.6455 respectively, lower than the 0.7555 of BERT+NN.
For task 1B, we submitted only the FastText-based method, in two runs. Both runs use the same method and differ only in the use of stopword removal in Run2 (v2), whose score is lower but not significantly.

===5. Conclusion===
This paper describes the systems we submitted to Hate Speech and Offensive Content Identification (HASOC) 2021. In general, the results show that BERT-based transfer learning is robust when applied to different languages, here English and Hindi. With the same architecture and almost the same text preprocessing as feature selection, the BERT-based method for binary classification (task 1A) produces good F1 scores of 0.8024 and 0.7555 for English and Hindi respectively. For the multi-label classification, the F1 score obtained for English task 1B is also quite good at 0.6417, a gap of about 0.02 from rank #1. The BERT-based method ranked 8 of 56 and 5 of 37 for English tasks 1A and 1B respectively, and 12 of 34 for Hindi task 1A.

===References===
[1] D. Luxton, J. D. June, J. M. Fairall (2012). Social media and suicide: a public health perspective. American Journal of Public Health, 102 Suppl 2, S195–S200. doi:10.2105/AJPH.2011.300608
[2] P. Nakov, V. Nayak, K. Dent, A. Bhatawdekar, S.M. Sarwar, M. Hardalov, Y. Dinkov, D. Zlatkova, G. Bouchard, I. Augenstein (2021). Detecting Abusive Language on Online Platforms: A Critical Analysis. arXiv:2103.00153
[3] R. Kumar, A.K. Ojha, S. Malmasi, M. Zampieri (2018). Benchmarking Aggression Identification in Social Media. In: Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), ACL 2018, pp. 1–11.
[4] T. Mandl, S. Modha, G.K. Shahi, A.K. Jaiswal, D. Nandini, D. Patel, P. Majumder, J. Schäfer (2020). Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages. arXiv:2108.05927v1, 2021
[5] S. Modha, T. Mandl, G.K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri (2021). Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech. In: FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December 2021, ACM 2021
[6] V. Basile, C. Bosco, E. Fersini, D. Nozza, V. Patti, F. M. Rangel Pardo, P. Rosso, M. Sanguinetti (2019). SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter. In: The 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics 2019. doi:10.18653/v1/S19-2007
[7] P. Fortuna, S. Nunes (2018). A Survey on Automatic Detection of Hate Speech in Text. ACM Computing Surveys, 51(4), 85:1–85:30. doi:10.1145/3232676
[8] T. Davidson, D. Warmsley, M.W. Macy, I. Weber (2017). Automated Hate Speech Detection and the Problem of Offensive Language. In: The International AAAI Conference on Web and Social Media (ICWSM) 2017
[9] S. Zimmerman, U. Kruschwitz, C. Fox (2018). Improving Hate Speech Detection with Deep Learning Ensembles. In: Language Resources and Evaluation Conference (LREC) 2018
[10] S. MacAvaney, H-R. Yao, E. Yang, K. Russell, N. Goharian, O. Frieder (2019). Hate speech detection: Challenges and solutions. PLoS ONE 14(8): e0221152. doi:10.1371/journal.pone.0221152
[11] T. Mikolov, K. Chen, G. Corrado, J. Dean (2013). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781
[12] J. Pennington, R. Socher, C. Manning (2014). GloVe: Global Vectors for Word Representation. In: The 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), ACL 2014. doi:10.3115/v1/D14-1162
[13] P. Bojanowski, E. Grave, A. Joulin, T. Mikolov (2016). Enriching Word Vectors with Subword Information. arXiv:1607.04606
[14] M. Mozafari, R. Farahbakhsh, N. Crespi (2020). A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media. Studies in Computational Intelligence, 881 SCI, 928–940. doi:10.1007/978-3-030-36687-2_77
[15] J. Devlin, M.W. Chang, K. Lee, K. Toutanova (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: The North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT) 2019
[16] J. Howard, S. Ruder (2018). Universal Language Model Fine-tuning for Text Classification. arXiv:1801.06146
[17] M.E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer (2018). Deep contextualized word representations. arXiv:1802.05365
[18] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever (2018). Improving language understanding by generative pre-training. Technical report, OpenAI
[19] R. Liu, Y. Shi, C. Ji, M. Jia (2019). A Survey of Sentiment Analysis Based on Transfer Learning. Advanced Optical Imaging for Extreme Environments Vol 7 Special Section, IEEE Access 2019. doi:10.1109/ACCESS.2019.2925059
[20] E. Bataa, J. Wu (2019). An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese. In: The 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019, Italy, July 28 - August 2, 2019
[21] B. Chan, S. Schweter, T. Moller (2020). German’s Next Language Model. In: The 28th International Conference on Computational Linguistics, Barcelona, Spain (Online), December 8-13, 2020
[22] I.A. Farha, W. Magdy (2021). Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection. In: The Sixth Arabic Natural Language Processing Workshop, Kyiv, Ukraine (Virtual), April 19, 2021
[23] A. Joulin, E. Grave, P. Bojanowski, T. Mikolov (2016). Bag of Tricks for Efficient Text Classification. arXiv:1607.01759
[24] T. Mandl, S. Modha, G.K. Shahi, H. Madhu, S. Satapara, P. Majumder, J. Schäfer, T. Ranasinghe, M. Zampieri, D. Nandini, A.K. Jaiswal (2021). Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages. In: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, CEUR, 2021. URL: http://ceur-ws.org/
[25] H.M. Zahera (2019). Fine-tuned BERT Model for Multi-Label Tweets Classification. In: The 28th Text REtrieval Conference (TREC) 2019
[26] J. Devlin, M-W. Chang, K. Lee, K. Toutanova (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805