<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sakshi</forename><surname>Kalra</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSIS</orgName>
								<orgName type="institution">BITS Pilani</orgName>
								<address>
									<postCode>333031</postCode>
									<region>Rajasthan</region>
									<country key="IN">INDIA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kushank</forename><surname>Maheshwari</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSIS</orgName>
								<orgName type="institution">BITS Pilani</orgName>
								<address>
									<postCode>333031</postCode>
									<region>Rajasthan</region>
									<country key="IN">INDIA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Saransh</forename><surname>Goel</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSIS</orgName>
								<orgName type="institution">BITS Pilani</orgName>
								<address>
									<postCode>333031</postCode>
									<region>Rajasthan</region>
									<country key="IN">INDIA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yashvardhan</forename><surname>Sharma</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSIS</orgName>
								<orgName type="institution">BITS Pilani</orgName>
								<address>
									<postCode>333031</postCode>
									<region>Rajasthan</region>
									<country key="IN">INDIA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<meeting>Forum for Information Retrieval Evaluation, December 9-13, 2022, India</meeting>
					</monogr>
					<idno type="MD5">7234BCDD9EA8BA0CE37244ED04BE2A0D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-06-19T14:45+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Cyber hate</term>
					<term>Social Media</term>
					<term>MuRIL</term>
					<term>HASOC</term>
					<term>BERT</term>
					<term>Distil-BERT</term>
					<term>Code Mixed</term>
					<term>Transformers model</term>
					<term>Text Classification</term>
					<term>Tokenizer</term>
					<term>TF-IDF</term>
					<term>Multilingual BERT</term>
					<term>Machine Learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>People now express their ideas on social media on a global scale. The sense of freedom that anonymity provides allows online attacks on others to be made without fear of repercussions, which eventually leads to the spread of hate speech. Current attempts to filter online content and stop the propagation of hatred are insufficient. Two factors that contribute to this are the popularity of regional languages on social media and the lack of hate speech detectors that work across multiple languages. This paper addresses two hate speech detection tasks: the first is the Identification of Conversational Hate-Speech in Code-Mixed Languages such as Hindi, English and German, and the second is Offensive Language Identification in Marathi. Our approach uses TF-IDF features combined with Machine Learning models, as well as transformer-based BERT models, to classify hate speech in each of the two subtasks. The MuRIL-BERT model produces the best results, with an accuracy of 73.1% and a Macro-F1 score of 0.727 on the code-mixed data and a Macro-F1 score of 0.8306 on the Marathi data, which is 6% higher than the previous year's result.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the past few years, academics have become more interested in the topic of hate speech. This is shown by the fact that the number of Web of Science (WOS)-indexed publications on the topic rose from 42 in 2013 to 162 in 2018 <ref type="bibr" target="#b0">[1]</ref>. According to the Encyclopedia of the American Constitution, "Hate speech is speech that attacks a person or group on the basis of attributes such as race, religion, ethnic origin, national origin, sex, disability, sexual orientation, or gender identity." <ref type="bibr" target="#b1">[2]</ref>. Hate speech on social media is becoming the new normal and is devastating for our society. It divides society and sometimes even leads to communal disharmony and violence. In recent years, the perpetrators of some hate-motivated terrorist attacks have been found to have a long history of hateful posts on social media, which led to radicalization <ref type="bibr" target="#b2">[3]</ref>. In some cases, social media plays an even more direct role, such as in the 2019 attack in Christchurch, New Zealand, and the recent shooting in a mall in the USA, where the suspect broadcast the shootings live on social media platforms <ref type="bibr" target="#b2">[3]</ref>. The only way to stop this spread of hatred is to identify hate speech quickly, which is impossible to do manually and must instead be done computationally.</p><p>By organizing shared tasks and workshops, online communities, social media businesses, and technology firms are making significant investments and promoting research in the field of Hate Speech Detection. FIRE is one such forum, and it has been organizing the HASOC shared tasks since 2019 <ref type="bibr" target="#b3">[4]</ref>. HASOC 2022 seeks technology that can detect offensive language and hate speech without human intervention. The competition is divided into two subtracks.</p><p>For the first task, the dataset contains code-mixed tweets in more than one language (Hinglish and German), along with comments and replies to those comments. When the text is code-mixed, it is difficult to tell what is hate speech. Code-mixed text uses the vocabulary and grammar of more than one language <ref type="bibr" target="#b4">[5]</ref>. For example, in the dataset used, the Hinglish data has Hindi written in both Roman and Devanagari script, which makes it harder to find hate speech in this data. The proposed approach uses two methods for text classification: one is a machine learning approach using TF-IDF feature extraction, and the other is a deep learning approach using different BERT variants, which are based on the transformer model; BERT has been shown to be highly effective at capturing both the left and right context in a text.</p><p>The second task, Hate Speech and Offensive Content Identification in the Marathi language, is a binary classification task: a tweet is classified as either offensive and hate or not offensive. An overview of the FIRE 2022 subtasks is presented in <ref type="bibr" target="#b5">[6]</ref> and <ref type="bibr" target="#b6">[7]</ref>. We approached the task using the transformer-based models MuRIL, Distil-BERT and Multilingual-BERT, which have displayed impressive results in NLP tasks such as text classification. A pre-trained transformer model from the HuggingFace library <ref type="foot" target="#foot_0">1</ref> is fine-tuned on the provided Marathi dataset. We demonstrate that transfer learning with pre-trained BERT models is preferable to conventional machine learning algorithms. The code is available from the GitHub repository<ref type="foot" target="#foot_1">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Various approaches have been used in the past for code-mixed languages. The authors of <ref type="bibr" target="#b7">[8]</ref> explain how features can be extracted from text data using TF-IDF. They examined the performance of their TF-IDF implementation using 1400 documents from the United Nations Parallel Text Corpus for LDCs, returning only the top 100 relevant texts. In a further study <ref type="bibr" target="#b8">[9]</ref>, researchers went into greater detail about TF-IDF feature extraction and compared character n-grams with word n-grams, concluding that character n-grams were more useful for detecting hate speech. Another paper <ref type="bibr" target="#b9">[10]</ref> describes how the BERT model can be used for text classification; it covers the architecture of the BERT model, which is pre-trained on a large corpus and takes tokenized text and an attention mask as input. The authors achieved a GLUE score of 80.5%, a MultiNLI accuracy of 86.7%, 93.2% on the SQuAD v1.1 question-answering test, and 83.1% on the SQuAD v2.0 test. The approach to classification with BERT in <ref type="bibr" target="#b9">[10]</ref> uses the output corresponding to the [CLS] token and adds a Feed Forward Network on top of it. Another study <ref type="bibr" target="#b10">[11]</ref> used a soft voting technique on three transformer-based architectures (urduhack, BERT, and XLM-RoBERTa) to achieve an accuracy of 93.6%. The authors of <ref type="bibr" target="#b11">[12]</ref> attempt to identify threatening posts using transformer-based deep learning models; they employed the pre-trained BERT model (RoBERTa) to classify text as threatening or non-threatening and obtained an F1 score of 53.46% and a ROC AUC of 81.99%.</p><p>Another paper <ref type="bibr" target="#b12">[13]</ref> fine-tuned monolingual and multilingual transformers on Urdu text and used ensembling techniques to combine the results of RoBERTa-urdu-small, XLM-RoBERTa, bert-based-multilingual-case, and Alberta-urdu-large, yielding an accuracy of 0.596 and an F1 score of 0.449. In another attempt, the authors of <ref type="bibr" target="#b13">[14]</ref> obtained their highest F1 score of 0.7993 by using pre-trained BERT models with a fine-tuning classification layer on top of them. They also used data augmentation to make the models generalise better and used both machine learning and deep learning techniques for the task of recognising hate and offensive speech. The effectiveness of several pre-trained multilingual BERT models in detecting threats and hate speech, which are also types of emotions, was discussed in <ref type="bibr" target="#b12">[13]</ref> and <ref type="bibr" target="#b13">[14]</ref>. The authors of <ref type="bibr" target="#b2">[3]</ref> used a variety of datasets, the majority of which were based on Twitter data, including TRAC, hatebase Twitter, Kaggle, etc., and proposed an SVM-based model called mSVM, which produced state-of-the-art results on the TRAC dataset with 80% accuracy and a 53.68% macro F1 score. They also employed the BERT model, which produced results that were 2 percent better but whose decisions were harder to interpret.</p><p>For the Marathi language, automated offensive and hate speech detection has been attempted using a variety of machine learning and deep learning techniques <ref type="bibr" target="#b14">[15]</ref>. Most conventional machine learning techniques extract features from the text, such as lexical and linguistic features, n-grams, and bags of words <ref type="bibr" target="#b15">[16]</ref>. Word embedding techniques have also recently been proposed for these tasks <ref type="bibr" target="#b16">[17]</ref>. However, these methods fall short of capturing the full context of the text. Deep learning methods <ref type="bibr" target="#b17">[18]</ref> are becoming increasingly popular in a variety of fields, including machine translation, sentiment analysis, text classification, and language modelling. Recurrent Neural Networks (RNNs) <ref type="bibr" target="#b18">[19]</ref>, Convolutional Neural Networks (CNNs) <ref type="bibr" target="#b19">[20]</ref>, Long Short-Term Memory networks (LSTMs) <ref type="bibr" target="#b20">[21]</ref>, and the more recent Bidirectional Encoder Representations from Transformers (BERT) <ref type="bibr" target="#b21">[22]</ref> are a few of these methods. A combination of machine learning models and transformer-based models is presented in <ref type="bibr" target="#b22">[23]</ref>.</p><p>In <ref type="bibr" target="#b10">[11]</ref>, both ML models and transformer-based models were applied to the Urdu language. Additionally, BERT models were applied to hate speech detection in Urdu at FIRE 2021 <ref type="bibr" target="#b11">[12]</ref>. Another study <ref type="bibr" target="#b23">[24]</ref> identified hate speech phrases on Twitter by combining a deep convolutional neural network model with GloVe embedding vectors in order to capture semantics. With an F1-score of 0.92, their model performed better than the other models considered. In <ref type="bibr" target="#b13">[14]</ref>, TF-IDF weightings as well as word embeddings are used as features, which are then fed into machine learning algorithms, namely random forest, logistic regression and support vector classifier.</p></div>
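<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 1</head><label>1</label><figDesc>Dataset Statistics</figDesc><table><row><cell>Data Type</cell><cell>HOF</cell><cell>NOT</cell><cell>Total Entries</cell></row><row><cell>Training Data</cell><cell>2612</cell><cell>2609</cell><cell>5221</cell></row></table></figure>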
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>Task A (Code-Mixed Language):</p><p>The dataset used in this task is taken from HASOC (2022) <ref type="foot" target="#foot_2">3</ref>, which is one of the subtracks of the Forum for Information Retrieval Evaluation (FIRE) <ref type="foot" target="#foot_3">4</ref> 2022. It is a collection of tweets; each instance of the dataset <ref type="bibr" target="#b24">[25]</ref>, <ref type="bibr" target="#b25">[26]</ref> includes a main tweet that is labelled "HOF" or "NOT". Additionally, each tweet may receive multiple comments, each of which is also labelled "HOF" or "NOT". Finally, each comment may receive multiple replies, each of which is also labelled "HOF" or "NOT". The dataset is distinctive in that whether a comment or a reply counts as hate speech depends on its context: the main tweet for a comment, and both the main tweet and the comment for a reply. For instance, a comment of "yes" is meaningless by itself, but if it is made in response to a main tweet that is hate speech, then it is considered hate speech, while a comment of "no" on the same tweet is not. Therefore, to prepare the data for the model, the text of the main tweet is left unchanged, the main tweet is prepended to the comment, and the main tweet and the comment are both prepended to the reply text (separated by a blank space). In this way, the model can capture the context of the comment and the reply:</p><p>• Main tweet: &lt;main tweet&gt; • Comment: &lt;main tweet&gt; &lt;comment&gt; • Reply: &lt;main tweet&gt; &lt;comment&gt; &lt;reply&gt;</p><p>Table <ref type="table">1</ref> shows the dataset statistics. The class distribution of the Hinglish+German Twitter dataset is shown graphically in Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
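<div xmlns="http://www.tei-c.org/ns/1.0"><p>The context-building step for Task A can be illustrated with a minimal sketch. The <code>Post</code> class, its field names and the example texts are hypothetical and are not taken from the HASOC data format; the sketch only assumes that each comment or reply knows its parent.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """One labelled text in a conversation thread (hypothetical structure)."""
    text: str
    label: str                       # "HOF" or "NOT"
    parent: Optional["Post"] = None  # None for a main tweet


def with_context(post: Post) -> str:
    """Prefix a post with all of its ancestors, oldest first, separated by spaces."""
    parts = []
    node = post
    while node is not None:
        parts.append(node.text)
        node = node.parent
    return " ".join(reversed(parts))


# Invented example thread: main tweet, one comment, one reply.
tweet = Post("main tweet", "HOF")
comment = Post("yes", "HOF", parent=tweet)
reply = Post("agreed", "HOF", parent=comment)

print(with_context(tweet))    # main tweet
print(with_context(comment))  # main tweet yes
print(with_context(reply))    # main tweet yes agreed
```

<p>Walking up the parent chain keeps the main tweet first in every input, which matches the three input formats listed above.</p></div>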
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Task B (Marathi Language):</head><p>The datasets for the tasks are provided by the organizers of HASOC'22 <ref type="foot" target="#foot_4">5</ref>. Subtask A of the HASOC challenge for the Marathi language is a binary classification task: the sentences in the Marathi dataset must be categorized as either offensive (OFF) or not offensive (NOT). The data statistics are given in Table <ref type="table" target="#tab_1">2</ref>, and the class distribution of the dataset is shown graphically in Figure <ref type="figure" target="#fig_1">2</ref>. Following Twitter's definition, the term "Offensive" refers to abusive remarks made to people or groups with the intention of intimidating them or silencing their voice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Handling the Class Imbalance Issue</head><p>For Task A the dataset was balanced, while for Task B it was imbalanced. One way to deal with data imbalance is to randomly resample the training dataset. The dataset can be resampled using two different techniques: undersampling, which removes examples from the majority class, and oversampling, which repeats samples from the minority class <ref type="bibr" target="#b13">[14]</ref>. We oversampled the dataset using the imblearn <ref type="bibr" target="#b26">[27]</ref> library, because the training instances are already rather few and removing examples from the majority class would reduce them further. Specifically, we used RandomOverSampler with a sampling strategy of 0.5, which makes the ratio of the minority to the majority class 0.5.</p></div>
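<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the effect of random oversampling concrete, the following is a minimal standard-library sketch of the idea. It is a stand-in for, not a copy of, the imblearn RandomOverSampler used in this work; the function name <code>random_oversample</code> and the toy data are assumptions made for illustration only.</p>

```python
import random
from collections import Counter


def random_oversample(texts, labels, ratio=0.5, seed=0):
    """Duplicate random minority-class rows until minority/majority reaches `ratio`."""
    counts = Counter(labels)
    (majority, n_major), (minority, n_minor) = counts.most_common(2)
    target = int(ratio * n_major)      # desired number of minority rows
    extra = max(0, target - n_minor)   # rows that still have to be added
    rng = random.Random(seed)
    pool = [i for i, y in enumerate(labels) if y == minority]
    picked = [rng.choice(pool) for _ in range(extra)]
    new_texts = list(texts) + [texts[i] for i in picked]
    new_labels = list(labels) + [labels[i] for i in picked]
    return new_texts, new_labels


# Toy data invented for illustration: 10 "NOT" rows and 2 "OFF" rows.
texts = [f"not_{i}" for i in range(10)] + ["off_0", "off_1"]
labels = ["NOT"] * 10 + ["OFF"] * 2

X, y = random_oversample(texts, labels, ratio=0.5)
print(Counter(y))
```

<p>With a ratio of 0.5 the toy minority class grows from 2 to 5 rows while the majority class is left untouched, so no training example is discarded.</p></div>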
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">TF-IDF for Text Classification</head><p>TF (term frequency) measures the importance of a word within a particular document <ref type="bibr" target="#b7">[8]</ref>:</p><formula notation="TeX">\mathrm{TF}(m) = \frac{\text{number of times term } m \text{ appears in the document}}{\text{total number of terms in the document}}</formula><p>IDF (inverse document frequency) describes the relevance of a word across the corpus. For instance, stopwords appear in almost every document, making them the least relevant for classification over the whole corpus. As a result, their IDF value will be low. On the other hand, a word's IDF value will be high if it appears in only a small number of documents:</p><formula notation="TeX">\mathrm{IDF}(m) = \log\left(\frac{\text{total number of documents}}{\text{number of documents containing term } m}\right)</formula><p>Finally, we combine TF and IDF to form TF-IDF:</p><formula notation="TeX">\text{TF-IDF}(m) = \mathrm{TF}(m) \times \mathrm{IDF}(m)</formula><p>A voting method is used to classify an input tweet. For each word in the input text, we calculate a &lt;HOF score&gt; and a &lt;NOT score&gt;. The code iterates over the entire training set tweet by tweet. For each word in the input text, if the word is present in the training tweet, the tweet's label is checked: if the label is 1, the TF-IDF value of the word for that tweet is added to its &lt;HOF score&gt;; otherwise it is added to its &lt;NOT score&gt;. The &lt;HOF score&gt; and &lt;NOT score&gt; values of all words are then summed over the full input text, and the label with the higher total score is the predicted label.</p></div>
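<div xmlns="http://www.tei-c.org/ns/1.0"><p>The voting procedure described in this section can be sketched in a few lines of Python. This is an illustrative reading of the description rather than the released code: the function names, the whitespace tokenisation, the tie-breaking towards NOT and the toy training tweets are all assumptions.</p>

```python
import math
from collections import Counter


def tf_idf_tables(train_docs):
    """Return one {word: TF-IDF} dictionary per tokenised training tweet."""
    n_docs = len(train_docs)
    doc_freq = Counter(word for doc in train_docs for word in set(doc))
    tables = []
    for doc in train_docs:
        counts = Counter(doc)
        tables.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return tables


def predict(text, train_docs, train_labels):
    """Vote with summed TF-IDF weights; label 1 means HOF, label 0 means NOT."""
    tables = tf_idf_tables(train_docs)
    hof_score = 0.0
    not_score = 0.0
    for word in text.split():
        for table, label in zip(tables, train_labels):
            if word in table:
                if label == 1:
                    hof_score += table[word]
                else:
                    not_score += table[word]
    # Ties fall back to NOT (an arbitrary choice made for this sketch).
    label = "HOF" if hof_score > not_score else "NOT"
    return label, hof_score, not_score


# Toy training set invented for illustration.
train_docs = [["you", "are", "stupid"], ["have", "a", "nice", "day"],
              ["stupid", "idiot"], ["nice", "work"]]
train_labels = [1, 0, 1, 0]

print(predict("you stupid idiot", train_docs, train_labels)[0])
print(predict("nice day", train_docs, train_labels)[0])
```

<p>Because every matching training tweet contributes its own TF-IDF weight, words that are frequent within a tweet but rare across the corpus dominate the vote.</p></div>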
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">BERT Model and its Variants for Text Classification</head><p>As described in <ref type="bibr" target="#b9">[10]</ref>, there are two main steps in using the BERT architecture for classification: pre-training and fine-tuning. Pre-training involves training the model on unlabeled data using several pre-training tasks. Just as an English teacher teaches a language to a child using "fill in the blanks" and "question and answer" exercises, the BERT model is pre-trained by giving it tokenized text in which some of the tokens are masked; the model's job is to predict the missing words. Another task used for pre-training BERT is next sentence prediction. Two sentences A and B are chosen such that 50% of the time B is the actual sentence following A and 50% of the time it is a random sentence from the corpus. This teaches the model to identify the relationship between two sentences, which helps in "question answering" tasks. The next step is fine-tuning on a downstream task, such as classification or question answering. For sentence-pair tasks, the two sentences are joined with a [SEP] token between them, whereas for single-sentence tasks only one sentence is passed as input. Fine-tuning requires some additional layers over the output of the BERT model; for classification, for example, the output corresponding to the [CLS] token is taken as input to a Feed Forward Network (FFN). This Feed Forward Network is called the fine-tuning layer, and during fine-tuning, the weights of this classification layer are trained without changing the weights inside the BERT model. The fine-tuning layer thus uses the knowledge of the BERT model to learn the classification task; in our case, there are two nodes in the output layer for binary classification. The BERT architecture is shown in Figure <ref type="figure" target="#fig_2">3</ref>. The following BERT variants are used in the proposed approach:</p><p>• MuRIL<ref type="foot" target="#foot_5">6</ref> - MuRIL <ref type="bibr" target="#b27">[28]</ref> is a BERT-based model trained on 17 Indian languages using Wikipedia data. • Multilingual-BERT <ref type="foot" target="#foot_6">7</ref> - M-BERT <ref type="bibr" target="#b28">[29]</ref> is pre-trained on 104 languages using large Wikipedia data. WordPiece is used to tokenize and lowercase the texts, and a shared vocabulary with a size of 110,000 is used. This model is case sensitive. • Distil-BERT <ref type="foot" target="#foot_7">8</ref> - DistilBERT <ref type="bibr" target="#b29">[30]</ref> is a small, cheap and fast transformer model that uses knowledge distillation during pre-training to reduce the size of BERT by 40%.</p></div>
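<div xmlns="http://www.tei-c.org/ns/1.0"><p>The role of the fine-tuning layer can be illustrated with a small sketch of a classification head with two output nodes applied to a [CLS] vector. All numbers below are hypothetical: the vector has four dimensions only to keep the example readable (BERT-base produces 768), and the weights are invented rather than learned.</p>

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(cls_vector, weights, bias):
    """Apply one linear layer with two output nodes to the [CLS] vector."""
    logits = [
        sum(w * x for w, x in zip(row, cls_vector)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    label = "HOF" if probs[1] > probs[0] else "NOT"
    return label, probs


# Hypothetical [CLS] output of a frozen encoder and hypothetical head parameters.
cls_vector = [0.2, -0.1, 0.7, 0.4]
weights = [[0.5, 0.1, -0.3, 0.0],   # output node for NOT
           [-0.2, 0.4, 0.6, 0.1]]   # output node for HOF
bias = [0.0, 0.1]

label, probs = classify(cls_vector, weights, bias)
print(label, [round(p, 3) for p in probs])
```

<p>During fine-tuning as described above, only <code>weights</code> and <code>bias</code> would be updated, while the encoder that produces <code>cls_vector</code> stays fixed.</p></div>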
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Proposed Techniques and Algorithms</head><p>Task A (Code-Mixed Language):</p><p>The proposed model, as shown in Figure <ref type="figure" target="#fig_4">4</ref>, first takes the multilingual and code-mixed text of the dataset presented in Figure <ref type="figure" target="#fig_0">1</ref> as input and preprocesses it by deleting stopwords. The sklearn library provides the lists of stopwords for the English and German languages and Kaggle provides the list for the Hindi language; a custom function removes the stopwords from the dataset one by one using these lists. Hyperlinks, emojis and hashtags are also removed, and the text is converted to lowercase to handle names in the text. Following that, the preprocessed data is used to train two different kinds of model, the TF-IDF feature extraction model and the BERT models (all models are trained independently). The HOF and NOT scores for the test data are determined using the TF-IDF feature extraction approach described in Section 5. The phases of text classification using TF-IDF are:</p><p>• Data Pre-Processing • Extracting TF-IDF features • Calculating TF-IDF score for classification</p><p>To determine whether a text input is HOF or NOT with the BERT models, a tokenizer is applied first, followed by a fine-tuning layer over the four pre-trained BERT models (Distil-BERT, Multilingual BERT, RoBERTa, and MuRIL BERT). Figure <ref type="figure" target="#fig_4">4</ref> shows the proposed architecture for hate speech classification. The four key phases of the process are:</p><p>• Data Pre-Processing • Tokenization • Using Pre-Trained BERT Model • Fine-Tuning Classifier for the Pre-Trained Model</p><p>Table <ref type="table" target="#tab_2">3</ref> lists the various hyperparameters used while training the proposed models.</p><p>To obtain the word embeddings, only the encoder component of the transformer architecture is used. To calculate the probabilities of the binary classes, an additional output layer is added. The different word embedding models that have been used are described above in the BERT section.</p><p>The flowchart in Figure <ref type="figure" target="#fig_6">5</ref> shows the complete approach. In brief, the four main steps of the process are:</p><p>• Data Pre-Processing • Tokenization • Using Pre-Trained BERT Model • Fine-Tuning Classifier for the Pre-Trained Model</p><p>The hyperparameters for training the model are listed in Table <ref type="table" target="#tab_3">4</ref>.</p></div>
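<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Task A preprocessing steps described in this section (lowercasing and removing hyperlinks, hashtags, emojis and stopwords) can be sketched as follows. The regular expressions, the emoji code-point ranges and the tiny stopword set are assumptions chosen for illustration; they are not the lists or patterns used in the experiments.</p>

```python
import re

# Tiny illustrative stopword set (assumption); the experiments use full external lists.
STOPWORDS = {"the", "is", "a", "und", "der", "hai", "ka"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough emoji ranges (assumption): pictographs, regional indicators, misc symbols.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U0001F1E6-\U0001F1FF\u2600-\u27BF]"
)


def preprocess(text):
    """Lowercase, then drop links, hashtags, emojis and stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


sample = "The match is GREAT \U0001F600 #cricket https://t.co/abc yeh sahi hai"
print(preprocess(sample))  # match great yeh sahi
```

<p>Lowercasing is applied first so that the stopword lookup and the later tokenizer both see a single normalised form of each word.</p></div>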
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Results and Evaluations</head><p>Task A (Code-Mixed Language):</p><p>The performance of each model is evaluated using several evaluation metrics. Table <ref type="table" target="#tab_4">5</ref> lists the accuracy, precision, recall, and F1-measure of the TF-IDF model. Table <ref type="table">6</ref> lists the accuracy, Micro-F1 and Macro-F1 of BERT and its variants. Of the three BERT variants, MuRIL produced the best results. Distil-BERT and Multilingual-BERT produced nearly identical results, but Multilingual-BERT performed slightly better. The code is available from the GitHub repository<ref type="foot" target="#foot_8">9</ref>.</p><p>Task B (Marathi Language): Accuracy and Macro F1 are used to evaluate each model's performance. MuRIL gave the best results among the three BERT models. MuRIL and Multilingual-BERT gave similar results, with MuRIL performing better, while Distil-BERT performed the worst. The test data provided by HASOC is used only for the following hyperparameters: Number of Epochs = 4, Batch Size = 2 and Learning Rate = 1.00742e-05. The results are shown in Tables 7, 8, 9 and 10. They show that MuRIL gives the best results in all scenarios. When the learning rate is decreased, the accuracy of all three models increases; when the learning rate is increased, the accuracy of MuRIL and mBERT decreases while the accuracy of Distil-BERT increases. The changes observed when varying the batch size are similar to those for the learning rate.</p></div>
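<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the accuracy and macro-averaged metrics reported in this section can be computed as in the following sketch, which assumes that Macro-F1 is the unweighted mean of the per-class F1 scores. The function name and the toy labels are invented for illustration and do not reproduce any result from the tables.</p>

```python
def macro_scores(y_true, y_pred, classes=("NOT", "OFF")):
    """Return accuracy plus macro-averaged precision, recall and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy labels invented for illustration.
y_true = ["OFF", "OFF", "OFF", "NOT", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT", "NOT", "OFF"]

scores = macro_scores(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

<p>Macro averaging gives the minority class the same weight as the majority class, which is why it is informative on the imbalanced Marathi data.</p></div>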
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Conclusion and Future Work</head><p>Task A (Code-Mixed Language):</p><p>The results demonstrate that the BERT models perform better than the TF-IDF feature extraction model. This is because BERT takes into account both the left and right context in the text, allowing it to detect hate speech more accurately from the context of each sentence. In addition, BERT uses subwords as tokens; for example, "playing" is broken into "play" and "ing", and a separate embedding is calculated for each token, which also helps the BERT model perform better. In this scenario, MuRIL-BERT outperforms Multilingual-BERT and Distil-BERT. A natural next step for hate speech detection is a multimodal approach. Some social-context-based features can also be investigated in future research. The TF-IDF feature extraction process could also be taken further by employing character and word n-grams for hate speech detection. There is also a need for a BERT model trained on a large dataset that performs better on code-mixed languages, particularly Hindi written in Roman script.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Task B (Marathi Language):</head><p>The results presented above show that pre-trained BERT models perform better and are better able to grasp the meaning of a given sentence, serving as better learned representations. Therefore, compared to conventional feature extraction approaches, the transfer learning strategy using pre-trained BERT models is better suited to identifying offensive and hate</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Training set distribution in the Hinglish+German Dataset</figDesc><graphic coords="5,154.66,84.19,283.48,190.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Training set distribution in the Marathi Dataset</figDesc><graphic coords="5,225.53,536.48,141.74,94.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: BERT Model</figDesc><graphic coords="8,154.66,84.19,283.46,283.46" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>Data Pre-Processing • Tokenization • Using Pre-Trained BERT Model • Fine-Tuning Classifier for the Pre-Trained ModelTable 3 lists the various hyperparameters used while training of the proposed models.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The Proposed Architecture</figDesc><graphic coords="9,126.31,84.19,340.15,254.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>•</head><label></label><figDesc>Data Pre-Processing • Tokenization • Using Pre-Trained BERT Model • Fine-Tuning Classifier for the Pre-Trained Model</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Flowchart of our methodology and techniques</figDesc><graphic coords="10,73.78,84.19,447.72,231.84" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Dataset Statistics on the basis of Binary Label Data</figDesc><table><row><cell>Data</cell><cell>NOT</cell><cell>OFF</cell><cell>Total Entries</cell></row><row><cell>Training Data</cell><cell>2034</cell><cell>1069</cell><cell>3103</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Various Hyperparameters and its Descriptions</figDesc><table><row><cell>Hyperparameter</cell><cell>Description</cell></row><row><cell>Learning Rate</cell><cell>1e-05</cell></row><row><cell>Number of Epochs</cell><cell>4</cell></row><row><cell>Batch Size</cell><cell>2</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Hyper-parameters used in Training</figDesc><table><row><cell>Hyper-parameter</cell><cell>Description</cell></row><row><cell>Learning Rate</cell><cell>1.00742e-05</cell></row><row><cell>Number of Epochs</cell><cell>4</cell></row><row><cell>Batch Size</cell><cell>2</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>Performance Evaluation using TF-IDF</figDesc><table><row><cell>Model</cell><cell>Accuracy</cell><cell>Precision</cell><cell>Recall</cell><cell>F1-Measure</cell></row><row><cell>TF-IDF</cell><cell>0.685</cell><cell>0.676</cell><cell>0.698</cell><cell>0.687</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 6</head><label>6</label><figDesc>Performance Evaluation using BERT Variants</figDesc><table><row><cell>BERT Variants</cell><cell>Accuracy</cell><cell>Micro-F1</cell><cell>Macro-F1</cell></row><row><cell>MuRIL</cell><cell>0.731</cell><cell>0.695</cell><cell>0.727</cell></row><row><cell>Multilingual-BERT</cell><cell>0.69</cell><cell>0.676</cell><cell>0.69</cell></row><row><cell>Distil-BERT</cell><cell>0.69</cell><cell>0.67</cell><cell>0.69</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 7</head><label>7</label><figDesc>Final Results for given Hyper-parameters</figDesc><table><row><cell cols="7">Epochs = 4, batch size = 2, Learning Rate = 1.00742e-05</cell></row><row><cell>Data</cell><cell cols="3">Training Data</cell><cell cols="3">Testing Data</cell></row><row><cell>Metrics</cell><cell>Accuracy</cell><cell>Macro-F1</cell><cell>Macro-Precision</cell><cell>Macro-F1</cell><cell>Macro-Precision</cell><cell>Macro-Recall</cell></row><row><cell>MuRIL</cell><cell>0.9198</cell><cell>0.9197</cell><cell>0.9202</cell><cell>0.9450</cell><cell>0.9464</cell><cell>0.9446</cell></row><row><cell>Multilingual BERT</cell><cell>0.9103</cell><cell>0.9103</cell><cell>0.9103</cell><cell>0.9291</cell><cell>0.9332</cell><cell>0.9285</cell></row><row><cell>Distil-Bert</cell><cell>0.7724</cell><cell>0.7712</cell><cell>0.7777</cell><cell>0.8015</cell><cell>0.8145</cell><cell>0.8021</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 8</head><label>8</label><figDesc>Final Results for given Hyper-parameters</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Epochs = 4, batch size = 4, Learning Rate = 1.00742e-05</head><label></label><figDesc></figDesc><table><row><cell>Data</cell><cell cols="3">Training Data</cell></row><row><cell>Metrics</cell><cell>Accuracy</cell><cell>Macro-F1</cell><cell>Macro-Precision</cell></row><row><cell>MuRIL</cell><cell>0.9198</cell><cell>0.9197</cell><cell>0.9205</cell></row><row><cell>Multilingual BERT</cell><cell>0.9021</cell><cell>0.9020</cell><cell>0.9031</cell></row><row><cell>Distil-Bert</cell><cell>0.8549</cell><cell>0.8549</cell><cell>0.8552</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 9</head><label>9</label><figDesc>Final Results for given Hyper-parameters</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Epochs = 4, batch size = 2, Learning Rate = 1.1e-05</head><label></label><figDesc></figDesc><table><row><cell>Data</cell><cell cols="3">Training Data</cell></row><row><cell>Metrics</cell><cell>Accuracy</cell><cell>Macro-F1</cell><cell>Macro-Precision</cell></row><row><cell>MuRIL</cell><cell>0.9186</cell><cell>0.9186</cell><cell>0.9186</cell></row><row><cell>Multilingual BERT</cell><cell>0.9009</cell><cell>0.9009</cell><cell>0.9010</cell></row><row><cell>Distil-Bert</cell><cell>0.8136</cell><cell>0.8128</cell><cell>0.8192</cell></row><row><cell cols="4">Table 10</cell></row><row><cell cols="4">Final Results for given Hyper-parameters</cell></row><row><cell cols="4">Epochs = 4, batch size = 4, Learning Rate = 1e-05</cell></row><row><cell>Data</cell><cell cols="3">Training Data</cell></row><row><cell>Metrics</cell><cell>Accuracy</cell><cell>Macro-F1</cell><cell>Macro-Precision</cell></row><row><cell>MuRIL</cell><cell>0.9233</cell><cell>0.9233</cell><cell>0.9238</cell></row><row><cell>Multilingual BERT</cell><cell>0.9127</cell><cell>0.9127</cell><cell>0.9130</cell></row><row><cell>Distil-Bert</cell><cell>0.7853</cell><cell>0.7853</cell><cell>0.7854</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://huggingface.co/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/Kushank24/Marathi_fakenews</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://hasocfire.github.io/hasoc/2022/index.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://fire.irsi.res.in/fire/2022/home</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://hasocfire.github.io/hasoc/2022/index.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://huggingface.co/google/MuRIL-base-cased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://huggingface.co/bert-base-multilingual-cased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">https://huggingface.co/distilroberta-base</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">https://github.com/saransh-goel/HASOC.git</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>speech. MuRIL performed the best among the three models. On the public leaderboard, we placed fourth. In future work, this hate speech detection problem could be approached from a multimodal perspective, using both images and text and extracting visual components for better feature extraction. Better word tokenization and Marathi-specific tokens could also improve performance. Models could be trained on a larger corpus to improve accuracy further, and experiments with deeper transformer architectures may be conducted.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Hate speech: A systematized review</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Paz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Montero-Díaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moreno-Delgado</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Sage Open</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">2158244020973022</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Nockleby</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">W</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">L</forename><surname>Karst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Mahoney</surname></persName>
		</author>
		<title level="m">Encyclopedia of the american constitution</title>
				<meeting><address><addrLine>Detroit, MI</addrLine></address></meeting>
		<imprint>
			<publisher>Macmillan Reference</publisher>
			<date type="published" when="2000">2000</date>
			<biblScope unit="volume">3</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Hate speech detection: Challenges and solutions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Macavaney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-R</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Russell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goharian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Frieder</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PloS one</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">e0221152</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Report on the fire 2020 evaluation initiative</title>
		<author>
			<persName><forename type="first">P</forename><surname>Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gangopadhyay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM SIGIR Forum</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page" from="1" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Choudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bindlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shrivastava</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1804.00806</idno>
		<title level="m">Sentiment analysis of code-mixed languages leveraging resource rich languages</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in english and indo-aryan languages and conversational hate speech</title>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Madhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Forum for Information Retrieval Evaluation</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="3" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Madhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nandini</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2112.09301</idno>
		<title level="m">Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in english and indo-aryan languages</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Using tf-idf to determine word relevance in document queries</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ramos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the first instructional conference on machine learning</title>
				<meeting>the first instructional conference on machine learning</meeting>
		<imprint>
			<publisher>Citeseer</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="volume">242</biblScope>
			<biblScope unit="page" from="29" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A survey on hate speech detection using natural language processing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the fifth international workshop on natural language processing for social media</title>
				<meeting>the fifth international workshop on natural language processing for social media</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Detection of abusive records by analyzing the tweets in urdu language exploring transformer based models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kalra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Detection of threat records by analyzing the tweets in urdu language exploring deep learning transformer-based models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kalra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Ensembling of various transformer based models for the fake news detection task in the urdu language</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kalra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Chauhan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Applying transfer learning using bert-based models for hate speech detection</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kalra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Inani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">S</forename><surname>Chauhan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Automated hate speech detection and the problem of offensive language</title>
		<author>
			<persName><forename type="first">T</forename><surname>Davidson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Warmsley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Macy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Weber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the international AAAI conference on web and social media</title>
				<meeting>the international AAAI conference on web and social media</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="512" to="515" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Gaydhani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Doma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kendre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bhagwat</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1809.08651</idno>
		<title level="m">Detecting hate speech and offensive language on twitter using machine learning: An n-gram and tfidf based approach</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Kshirsagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Cukuvac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Mckeown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mcgregor</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1809.10644</idno>
		<title level="m">Predictive embeddings for hate speech detection on twitter</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Deep learning for hate speech detection in tweets</title>
		<author>
			<persName><forename type="first">P</forename><surname>Badjatiya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Varma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th international conference on World Wide Web companion</title>
				<meeting>the 26th international conference on World Wide Web companion</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="759" to="760" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Effective hate-speech detection in twitter data using recurrent neural networks</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Pitsilis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ramampiaro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Langseth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Intelligence</title>
		<imprint>
			<biblScope unit="volume">48</biblScope>
			<biblScope unit="page" from="4730" to="4742" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Hate speech detection: A solved problem? the challenging case of long tail on twitter</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Luo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Semantic Web</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="925" to="945" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Detection of hate speech and offensive language in twitter data using lstm model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bisht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Bhadauria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Virmani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Recent trends in image and signal processing in computer vision</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="243" to="264" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Predicting the type and target of offensive social media posts in marathi</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chaudhari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gaikwad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Krishna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Paygude</surname></persName>
		</author>
		<idno type="DOI">10.1007/s13278-022-00906-8</idno>
		<ptr target="https://doi.org/10.1007/s13278-022-00906-8" />
	</analytic>
	<monogr>
		<title level="j">Social Network Analysis and Mining</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">77</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A framework for hate speech detection using deep convolutional neural network</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">K</forename><surname>Roy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Tripathy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X.-Z</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="204951" to="204962" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><forename type="first">Shrey</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Prasenjit</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Thomas</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sandip</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hiren</forename><surname>Madhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tharindu</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marcos</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kai</forename><surname>North</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Damith</forename><surname>Premasiri</surname></persName>
		</author>
		<title level="m">Overview of the HASOC Subtrack at FIRE 2022: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2022-12-13">9th-13th December 2022</date>
		</imprint>
	</monogr>
	<note>FIRE 2022: Forum for Information Retrieval Evaluation, Virtual Event</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Overview of the HASOC Subtrack at FIRE 2022: Identification of Conversational Hate-Speech in Hindi-English Code-Mixed and German Language</title>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Madhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation</title>
				<imprint>
			<publisher>CEUR</publisher>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Prabhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Misra</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.14289</idno>
		<title level="m">Multi-class text classification using bert-based active learning</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Khanuja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mehtani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khosla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Gopalan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">K</forename><surname>Margam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Aggarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Nagipogu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dave</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2103.10730</idno>
		<title level="m">Muril: Multilingual representations for indian languages</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">How multilingual is multilingual bert?</title>
		<author>
			<persName><forename type="first">T</forename><surname>Pires</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Schlinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Garrette</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1906.01502</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.01108</idno>
		<title level="m">Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
