Transformer Based Model For Offensive Content Recognition In Dravidian Languages

S Divya 1, N Sripriya 2
1,2 Department of Information Technology, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India

FIRE 2021, Forum for Information Retrieval Evaluation, December 13-17, 2021, India
EMAIL: divyas@ssn.edu.in (S. Divya); sripriyan@ssn.edu.in (N. Sripriya)

Abstract
This paper describes a model for spotting offensive content in comments collected from social media. Such comments contain expressions and emoticons and are mostly written in code-mixed language, which makes their classification difficult. The proposed system uses a multi-head attention model to extract features from the code-mixed Tamil input data. Various classification algorithms are applied to these extracted features to categorize offensive comments, and the generated labels are optimized by majority voting over the labels produced by the different algorithms. The system is validated on the validation set and evaluated on the Tamil CodeMix test data from the dataset published by the HASOC task (Task 2, subtask 1) at FIRE 2021. The evaluation yields an average weighted F1 score of 0.83, which is ranked 3rd in the official ranking.

Keywords
Offensive, emoticons, Code-mix, Natural Language Processing

1. Introduction

Universal internet access provides users with the liberty to share their thoughts and expressions in various media such as business pages and blogs. This allows interaction among people of different cultures and origins, and information can be shared among them within seconds. People openly express positive and negative opinions about the content available on social media. A negative opinion on a given piece of content may be expressed with profane words, abusive language, or displeased expressions. Such negative expression may be aggressive or may degrade the self-esteem of the person who views the comment. The increasing use of offensive words in social media accounts motivates the identification of abusive and offensive words, which is part of the anti-harassment policies of social media platforms [1]. Hate and offensive content posted on the web is a threat and a challenge to society. Many countries forbid offensive words on social media in order to avoid annoying viewers or triggering any misdeed. As negative opinions are mostly expressed with hate or offensive words, a variety of online forums define policies to detect and ignore abusive words that create a negative impact on the community.

Since there is no explicit definition of hate or offensive speech, and since it can vary depending on context, detecting such abusive words is challenging. Even in the absence of ubiquitous clarity on what constitutes abusive words, [2] defines hate or abusive speech as "any kind of communication that disparages a target group of people based on their specific characteristics such as culture, complexion, gender, nationality, religion or any other characteristics" [1].

Manual identification and elimination of abusive words in online content are complex due to the fast-growing number of online users. The task becomes even more complex due to the presence of unknown users in such forums who share their opinions in a language that is self-understandable and
may be code-mixed. This creates a demand for a model or technique that automates the prompt identification of abusive and insulting remarks in online content. This paper describes a model for identifying hate speech and offensive content, developed for the Hate Speech and Offensive Content Identification (HASOC) track in Indo-European languages at FIRE 2021. Related work is reviewed in Section 2. The transformer model used for embedding is explained in Section 3. The classification algorithms applied to separate offensive and non-offensive comments are detailed in Section 4. The task description, the proposed model, and the system evaluation are presented in Section 5. The concluding section summarizes the work with a few remarks and the future scope.

2. Related work

Even though describing and recognizing objectionable words is tedious, several studies in the opinion-extraction domain have attempted to automate the identification of abusive words in online content. The combination of Artificial Intelligence and Natural Language Processing has paved the way for a variety of approaches to this task. In most scenarios, determining the intensity of the expressed mood (positive/negative) can be an effective attribute for hate speech identification. Machine learning algorithms are used to classify content based on its essential or relevant words and phrases [5], [6]. Subjective and non-subjective features of the input are detected and used for its contextual classification. These features are obtained using methods such as part-of-speech tagging, bag of words, character n-grams, word and character frequency, and so on. Once the contextual features have been extracted, machine learning algorithms classify the text based on the component terms and expressions. The challenges in automating offensive word identification have been analyzed, and a multi-view Support Vector Machine (SVM) [4] model has been proposed that attains performance close to the state of the art. A multi-class classifier has been proposed to categorize tweets into classes such as abusive, either abusive or offensive, neither abusive nor offensive, etc. [3]. A combination of SVM and Logistic Regression (LR) [7] has been applied to extracted features to detect abusive or hate speech. These techniques perform better in languages that have regular grammar.

Most comments posted by social media users are written in informal language; code-mixed or code-switched text does not follow a proper grammatical format. A further challenge in processing code-mixed data is the lack of text resources in such languages. A corpus of Tamil-English code-mixed data [17] has been collected and annotated to support tasks on such data. This dataset comprises YouTube comments that have been annotated and benchmarked for the sentiment analysis task [15], [16]. Various deep learning models such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks have been applied to detect hate speech by considering the dependency on the preceding content in the input. Embedding models such as FastText, GloVe, and BERT [8] are used to map sparse, high-dimensional vectors into a comparatively low-dimensional space. These embeddings make it easier to apply machine learning algorithms to large volumes of data in which the terms are represented as vectors.
The representation of the text data is generated using Term Frequency-Inverse Document Frequency (TF-IDF) and is then used to train various traditional ML algorithms. A Kannada-English code-mixed corpus has been collected for multiple tasks such as sentiment analysis and offensive language identification [18], [19]. This dataset comprises 7,671 comments that are annotated and benchmarked using computational models; as baselines, various traditional ML algorithms are applied to the data and evaluated [8], [9], [10]. A collection of 1,200 Hindi and Marathi documents was built from comments generated on social media [23]. This dataset is applied to a model combining Naive Bayes (NB) and Support Vector Machines (SVM) with Radial Basis Function (RBF) and linear kernels [11], [12]. An accuracy of 90% is obtained on the Marathi dataset and a range of 70% to 80% on the Hindi dataset [13], [14]. A model based on contrastive learning with twin BiLSTM networks and a clustering-based method has been used to extract code-mixed transliterated words; depending on the configuration of the language pairs, accuracy in the range of 70% to 79% is obtained [20]. A sub-word LSTM architecture that learns sub-word-level representations for sentiment analysis has also been proposed [21]; it is able to learn sentiment information carried by important morphemes. Effective learning of representations from the input data [22] helps the required task perform better. The proposed method encodes the code-mixed input with a model that can understand multiple languages and then passes the representation to a classification model.

3. Embedding using Multilingual BERT

Comments shared online are often written in a code-mixed language, which makes it easy to express public opinion but challenging to automate the detection of moderate and abusive terms in the comments. For instance, the comments may be posted in English or in Tanglish (Tamil + English). Building a system that identifies the language, understands the comments, and detects harsh or offensive terms in such input is difficult. A representation for each sentence is created using a language-specific element that identifies the language of the sentence and specific features that capture its meaning. Under the assumption that each sentence is in a single language, the information conveyed in the sentence is extracted through this representation. Sentence representations are generated using Multilingual BERT, in which the BERT base model is additionally pre-trained on randomly selected text from 104 languages to make it applicable to classification in different languages. Five probing tasks have been used to estimate how well Multilingual BERT handles sentences from diverse languages as if they were in a single language:

• To identify the language, a classifier is added on top of the representation to predict the language of the sentence.
• Since sentences in the same language tend to have similar representations, whether same-language sentences cluster together is evaluated using the V-measure over hierarchical clustering [9].
• The distance between the representations is calculated for each sentence in a multi-parallel corpus, and the sentences with the smallest distance are retrieved. For each language, a linear regression is fitted with a minimal set of parallel sentences to project the representations of that language into the English representation space.
• Processing bilingual statements requires correspondence at the word level, even though sentence retrieval can be done with keyword recognition. Word alignment is determined as a minimum-weight bipartite graph matching: the tokens of sentences in different languages are connected, and the edges are weighted by the cosine distance between their representations.
• The quality of a machine translation system is determined without access to the reference translation. The evaluation metric is the correlation with the number of edits a human must make to reach the human-targeted translation, and the translation quality is estimated using the cosine distance between the representation of the source sentence and that of the machine translation.

This embedding technique is applied to the input data to generate representations for sentences in different languages. A classifier is placed on top of this representation to categorize the sentences according to the requirements of the task.
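For illustration, the sketch below shows one way such sentence representations could be obtained with the Hugging Face transformers library. The checkpoint name, the mean-pooling strategy, and the maximum sequence length are assumptions made for this example and are not prescribed by the task.

Listing 1: Generating sentence embeddings with Multilingual BERT (illustrative sketch)

# Minimal sketch, assuming the bert-base-multilingual-cased checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(sentences, max_length=128):
    """Return one fixed-size vector per sentence (mean-pooled last hidden states)."""
    batch = tokenizer(sentences, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1)       # zero out padding positions
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Example: embed a code-mixed Tamil comment from the dataset.
vectors = embed(["Yarayellam FDFS ppga ippove ready agitinga"])

Mean pooling over the token vectors is only one possible way to obtain a sentence vector; the [CLS] token representation could be used instead.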
4. Classification Algorithms

Three different machine learning procedures are applied to classify the data according to the offensive content present in the input. The classifiers are given labeled data from which they learn the characteristics that distinguish offensive from non-offensive content.

4.1. Support Vector Machine Classifier

A Support Vector Machine (SVM) classifier [10] categorizes data by projecting the data points into an n-dimensional space and finding an optimal hyperplane between the different categories of points. A binary classification problem can be solved by placing a hyperplane between the two categories of data points. Among the many possible hyperplanes between the two classes, the one with the maximum distance to the data points of both classes is selected. An increased margin between the two classes makes it easier to classify additional, unseen data points. The hyperplane acts as the decision boundary that supports classifying the data points: points that fall on either side of the plane are assigned to different categories. The dimensionality of the hyperplane depends on the number of input features. The data points that lie closest to the hyperplane are called support vectors, and they control the position and orientation of the hyperplane, which helps to maximize the margin between the two classes.

4.2. Extreme Gradient Boosting Classifier (XGBoost)

XGBoost [11] is a gradient boosting framework built on an ensemble of decision trees and extended with additional features that allow it to outperform many other frameworks. A decision tree produces feasible solutions as decisions on certain conditions and yields a graphical representation. Combining the solutions of multiple decision trees through voting is a meta-algorithm known as bootstrap aggregation, or bagging; Random Forest is a bagging-based technique in which only a selected subset of features is used to grow each tree in the ensemble. The effectiveness of such a model is further enhanced by constructing models successively, each one correcting the weaknesses of the previous model; this is known as boosting. Gradient descent is incorporated into the boosting technique to reduce the errors of the sequentially constructed models. XGBoost optimizes gradient boosting with additional features such as parallel processing, tree pruning, handling of missing values, and regularization to avoid bias and overfitting.
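As an illustration, the sketch below trains an XGBoost classifier on embedding vectors such as those produced in Listing 1. The placeholder data and the hyperparameter values are chosen only for the example; they are not tuned values from this work.

Listing 2: Training an XGBoost classifier on sentence-embedding features (illustrative sketch)

# Minimal sketch with placeholder data; in practice X_train holds the mBERT
# embeddings and y_train holds the Off/Not labels encoded as 1/0.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 768))      # placeholder 768-dimensional embeddings
y_train = rng.integers(0, 2, size=200)     # placeholder labels: 1 = Off, 0 = Not

clf = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
clf.fit(X_train, y_train)
predicted = clf.predict(X_train[:5])       # predicted class labels for a few samples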
4.3. Linear Discriminant Analysis (LDA)

For each class, the statistical properties of the data are estimated: for every input variable, the mean and the variance of the data in each class are determined. In the case of multiple variables, the data are assumed to follow a bell-shaped (Gaussian) distribution, and the means and the covariance matrix are estimated. These statistical properties are fed into the LDA equations to perform the classification. The input data must be prepared before being applied to LDA [12]: outliers must be removed and the input data must be standardized. The LDA model uses Bayes' theorem to evaluate the probability that an input sample belongs to each class, and the class with the highest probability is predicted as the output class.

5. Task Description and Proposed Model

The Hate Speech and Offensive Content Identification (HASOC) track in Indo-European languages focuses on identifying abusive content in code-mixed languages such as English, Malayalam, and Hindi. To identify the offensive content, the input data are gathered from posts and comments shared on Twitter and Facebook. The track comprises two sub-tasks. Task 1 is a message-level classification task in which a system is developed to categorize comments written in Tamil. Task 2 is a message-level classification task in which a system is built to automatically classify code-mixed Tamil and Malayalam tweets into offensive and non-offensive classes. In the proposed model, an automatic classification system is built to separate offensive and non-offensive content in code-mixed Tamil comments. A sample of the code-mixed Tamil data is shown below.

Table 1
Sample from the CodeMix dataset

ID     Text                                                                             Label
tl_1   Yarayellam FDFS ppga ippove ready agitinga                                       Off
tl_2   Ennada viswasam mersal sarkar madhri time la likes and views create pannalayae   Not

The proposed system for automatically classifying the offensive content in the given dataset is described below. The input data must be pre-processed before being converted into an n-dimensional vector and classified into offensive and non-offensive content. The pre-processing includes the removal of duplicate words, web links, emoticons, symbols, hashtags, numbers, names, etc. The architecture of the proposed model is shown in Figure 1.

Figure 1: Architecture diagram of the proposed model (code-mixed input → data cleaning → Multilingual BERT encoder representation → SVM / XGBoost / LDA → majority voting on the labels)

The pre-processed data are fed as input to the embedding system. Multilingual BERT is used to generate the embeddings: since the input data may be code-mixed, an embedding model that handles several languages is used. It projects each input sentence into an n-dimensional space to produce an n-dimensional feature vector. These vectors are fed as input to three different classifiers. Support Vector Machine, XGBoost, and Linear Discriminant Analysis are applied to these n-dimensional features to classify the data into two classes: one class holds the sentences that do not contain offensive words and the other holds the offensive sentences. The three classification algorithms generate three labels for each sentence, and the label produced by the majority becomes the final class label for the sentence.
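To make the voting step concrete, the sketch below combines the three classifiers described in Section 4 and keeps the label predicted by the majority. It assumes the scikit-learn and xgboost implementations of the classifiers, feature vectors such as those from Listing 1, and an encoding of Off as 1 and Not as 0; all of these are illustrative choices rather than specifications from the shared task.

Listing 3: Majority voting over the SVM, XGBoost, and LDA predictions (illustrative sketch)

# Minimal sketch: train three classifiers and keep the majority label per sample.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from xgboost import XGBClassifier

def majority_vote(X_train, y_train, X_test):
    """Fit SVM, XGBoost, and LDA, then return the majority label for each test sample."""
    models = [SVC(kernel="linear"),
              XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1),
              LinearDiscriminantAnalysis()]
    votes = np.stack([m.fit(X_train, y_train).predict(X_test) for m in models])
    # With three binary votes per sample, the majority label needs at least two votes.
    return (votes.sum(axis=0) >= 2).astype(int)    # 1 = Off, 0 = Not (assumed encoding)

# Placeholder data standing in for the mBERT embeddings and gold labels.
rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(200, 768)), rng.integers(0, 2, size=200)
X_te = rng.normal(size=(5, 768))
print(majority_vote(X_tr, y_tr, X_te))

With three classifiers and two classes, a tie cannot occur, so the majority label is always defined.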
(Off: offensive sentence, Not: non-offensive sentence.) An example of optimizing the class label for a sentence is given below.

Table 2
Optimizing the labels from different classification algorithms

Input        SVM   XGBoost   LDA   Final Label
Sentence 1   Off   Off       Not   Off
Sentence 2   Off   Not       Not   Not
Sentence 3   Off   Not       Off   Off

Thus, the final label separating offensive and non-offensive input sentences is obtained by majority voting over the labels generated by the different classification algorithms.

5.1. System Evaluation

In the proposed model, the sentence embeddings are produced by a system that understands and interprets multiple languages. The embedded vectors are classified using three distinct classification algorithms, and the optimized label is taken as the final label for each input sample. The dataset used for offensive language identification is the Tamil code-mixed training data published for the HASOC task (Task 2, subtask 1) at FIRE 2021. The dataset comprises 4,000 code-mixed training samples, 940 validation samples, and 1,001 test samples. The proposed model is trained on the 4,000 training samples and validated on the 940 validation samples. The labels generated for the validation dataset are evaluated using weighted-average classification metrics. The classification report is given below.

Table 3
Classification Report

              Precision   Recall   F1-Score   Support
Not           0.82        0.85     0.83       465
Off           0.84        0.81     0.83       475
Accuracy      -           -        0.83       940
Macro avg     0.83        0.83     0.83       940
Weighted avg  0.83        0.83     0.83       940

The table above reports the precision, recall, and F1-score of the "Not" and "Off" class labels. The "Support" column gives the number of samples available for evaluating the system in each class. The system is evaluated using the weighted-average F1-score: the F1-score is the harmonic mean of precision and recall, and the weighted average weights each class's F1-score by its support.

6. Conclusion and Future work

Automatic classification of offensive content in code-mixed Tamil data is performed by optimizing the labels generated by three different classification algorithms. The labels generated by the machine learning approaches produce promising results. As future work, offensive content identification should be carried out with deep learning approaches in order to identify abusive words and classify the data more precisely.

7. Acknowledgements

We sincerely thank the management of SSN Institutions for the infrastructure and lab facilities to carry out this research work.

8. References

[1] de Gibert, Ona, Naiara Perez, Aitor García-Pablos, and Montse Cuadros. "Hate speech dataset from a white supremacy forum." arXiv preprint arXiv:1809.04444 (2018).
[2] Nockleby, J. "Hate speech," in Encyclopedia of the American Constitution. Electronic Journal of Academic and Special Librarianship (2000).
[3] Davidson, Thomas, Dana Warmsley, Michael Macy, and Ingmar Weber. "Automated hate speech detection and the problem of offensive language." In Proceedings of the International AAAI Conference on Web and Social Media, vol. 11, no. 1. 2017.
[4] MacAvaney, Sean, Hao-Ren Yao, Eugene Yang, Katina Russell, Nazli Goharian, and Ophir Frieder. "Hate speech detection: Challenges and solutions." PLoS ONE 14, no. 8 (2019): e0221152.
[5] Bruce, Rebecca F., and Janyce M. Wiebe. "Recognizing subjectivity: a case study in manual tagging." Natural Language Engineering 5, no. 2 (1999): 187-205.
[6] Wiebe, Janyce, Rebecca Bruce, and Thomas P. O'Hara.
"Development and use of a gold- standard data set for subjectivity classifications." In Proceedings of the 37th annual meeting of the Association for Computational Linguistics, pp. 246-253. 1999. [7] Waseem, Zeerak. "Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter." In Proceedings of the first workshop on NLP and computational social science, pp. 138-142. 2016. [8] Devlin, J., Chang, M.W., Lee, K. and Toutanova, K., 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. [9] Rosenberg, Andrew, and Julia Hirschberg. "V-measure: A conditional entropy-based external cluster evaluation measure." In Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), pp. 410-420. 2007. [10] Tong, Simon, and Daphne Koller. "Support vector machine active learning with applications to text classification." Journal of machine learning research 2, no. Nov (2001): 45-66. [11] Qi, Zhang. "The text classification of theft crime based on TF-IDF and XGBoost model." In 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), pp. 1241-1246. IEEE, 2020. [12] Sharma, Alok, and Kuldip K. Paliwal. "Linear discriminant analysis for the small sample size problem: an overview." International Journal of Machine Learning and Cybernetics 6, no. 3 (2015): 443-454. [13] Banerjee, Shubhanker, Bharathi Raja Chakravarthi, and John P. McCrae. "Comparison of pretrained embeddings to identify hate speech in Indian code-mixed text." In 2020 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN), pp. 21-25. IEEE, 2020. [14] Mandl, Thomas, Sandip Modha, Anand Kumar M, and Bharathi Raja Chakravarthi. "Overview of the hasoc track at fire 2020: Hate speech and offensive language identification in tamil, malayalam, hindi, english and german." In Forum for Information Retrieval Evaluation, pp. 29- 32. 2020. [15] Hande, Adeep, Siddhanth U. Hegde, Ruba Priyadharshini, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Sajeetha Thavareesan, and Bharathi Raja Chakravarthi. "Benchmarking multi-task learning for sentiment analysis and offensive language identification in under-resourced dravidian languages." arXiv preprint arXiv:2108.03867 (2021). [16] Chakravarthi, Bharathi Raja, Navya Jose, Shardul Suryawanshi, Elizabeth Sherly, and John P. McCrae. "A sentiment analysis dataset for code-mixed Malayalam-English." arXiv preprint arXiv:2006.00210 (2020). [17] Chakravarthi, Bharathi Raja, Vigneshwaran Muralidaran, Ruba Priyadharshini, and John P. McCrae. "Corpus creation for sentiment analysis in code-mixed Tamil-English text." arXiv preprint arXiv:2006.00206 (2020). [18] Hande, Adeep, Ruba Priyadharshini, and Bharathi Raja Chakravarthi. "KanCMD: Kannada CodeMixed dataset for sentiment analysis and offensive language detection." In Proceedings of the Third Workshop on Computational Modeling of People's Opinions, Personality, and Emotion's in Social Media, pp. 54-63. 2020. [19] Hande, Adeep, Siddhanth U. Hegde, Ruba Priyadharshini, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Sajeetha Thavareesan, and Bharathi Raja Chakravarthi. "Benchmarking multi-task learning for sentiment analysis and offensive language identification in under-resourced dravidian languages." arXiv preprint arXiv:2108.03867 (2021). 
[20] Mandl, Thomas, Sandip Modha, Anand Kumar M, and Bharathi Raja Chakravarthi. "Overview of the HASOC track at FIRE 2020: Hate speech and offensive language identification in Tamil, Malayalam, Hindi, English and German." In Forum for Information Retrieval Evaluation, pp. 29-32. 2020.
[21] Ghanghor, Nikhil, Parameswari Krishnamurthy, Sajeetha Thavareesan, Ruba Priyadharshini, and Bharathi Raja Chakravarthi. "IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada." In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pp. 222-229. 2021.
[22] Banerjee, Shubhanker, Arun Jayapal, and Sajeetha Thavareesan. "NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment Analysis of Code-Mixed Dravidian text using XLNet." arXiv preprint arXiv:2010.07773 (2020).