<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Detect Hate and Offensive Content in English and Indo-Aryan Languages based on Transformer</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Yongyi</forename><surname>Kui</surname></persName>
							<email>3964438@qq.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Information Institute of Yunnan University</orgName>
								<address>
									<postCode>650504</postCode>
									<settlement>Yunnan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Forum for Information Retrieval Evaluation</orgName>
								<address>
									<addrLine>December 13-17</addrLine>
									<postCode>2021</postCode>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Detect Hate and Offensive Content in English and Indo-Aryan Languages based on Transformer</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">DEC8D0DA56DAAAF769512E340A90EF28</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:35+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Text Classification</term>
					<term>Hate and Offensive Content Analysis</term>
					<term>pre-trained model</term>
					<term>Transformers</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes my submission to Subtask 1A and Subtask 1B of the HASOC 2021 Hate Speech and Offensive Content Identification Challenge. In the experiments, I applied different pre-trained models and common neural network models to these tasks and integrated them. According to the official evaluation results, the proposed solution ranked fourteenth and fifteenth on English Subtask A and English Subtask B, sixth and fifth on Hindi Subtask A and Hindi Subtask B, respectively, and eleventh on Marathi Subtask A. The source code for the evaluated models in this paper is shared openly.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In recent years, offensive language on social media platforms has surged. Because the Internet affords a degree of anonymity, people are more likely to publish hate speech <ref type="bibr" target="#b0">[1]</ref> on online platforms than in real life.</p><p>Hate speech challenges social civilization and harmony; likewise, insulting and offensive speech radicalizes communication. It is therefore necessary to find appropriate ways to recognize such content automatically and improve the public-opinion environment of social media. Humans are sensitive to hate speech and offensive content, so people can identify such speech easily. A computer, however, can only detect whether a text is hateful or offensive after learning via unsupervised, self-supervised, or supervised methods trained on large amounts of data.</p><p>The HASOC 2019 <ref type="bibr" target="#b1">[2]</ref> and HASOC 2020 <ref type="bibr" target="#b2">[3]</ref> challenges included a task of identifying hate speech and offensive content in English and Hindi. In addition, SemEval 2019 <ref type="bibr" target="#b3">[4]</ref> included a task of identifying whether English tweets were offensive or not, which participants addressed with convolutional neural networks and the BERT model. SemEval proposed the OffensEval 2020 task <ref type="bibr" target="#b4">[5]</ref> in 2020 to identify offensive content in multiple languages, including English. To identify offensive content, Risch et al. <ref type="bibr" target="#b5">[6]</ref> used BERT models with distinct random seeds, while Subhanshu et al. <ref type="bibr" target="#b6">[7]</ref> fine-tuned BERT-based network models.
Many existing text-content recognition systems are based on Transformer <ref type="bibr" target="#b7">[8]</ref> models.</p><p>The rest of the paper is structured as follows: the second section gives an overview of the tasks and datasets of this challenge; the third section describes the models used in this challenge; the fourth section describes the experimental process of Subtask A and Subtask B; the fifth section lists the official evaluation results of these two tasks. The last section summarizes the evaluation results and the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Task and Data Description</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Subtasks</head><p>HASOC 2021 <ref type="bibr" target="#b8">[9]</ref> Subtask 1 includes two subtasks, Subtask A and Subtask B, whose main purpose is to detect hate speech and offensive content in text. The two subtasks are defined as follows:</p><p>Subtask A: Tweets are classified according to whether they contain hate, offensive, or profane content (HOF) or not (NOT). It is therefore a binary classification problem.</p><p>Subtask B: Tweets in the English and Hindi corpora predicted to be hate speech or offensive speech are further identified into three categories: Hate speech, Offensive, and Profane. Subtask B is thus a multiclass text classification task.</p><p>The evaluation metrics for the prediction results of Subtask A and Subtask B are Macro F1 and Macro Precision.</p></div>
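The Macro F1 metric above averages per-class F1 scores with equal weight per class, which matters for imbalanced labels like these. A minimal pure-Python sketch of the computation (illustrative only, not the official evaluation script):

```python
def macro_f1(y_true, y_pred):
    """Macro F1: compute F1 per class, then average with equal class weight."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```

Because each class contributes equally regardless of its frequency, a model that ignores a rare class is penalized more heavily than under plain accuracy.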
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Dataset</head><p>The data for this challenge comes from comments on the Twitter platform; the corpus comprises Marathi data <ref type="bibr" target="#b9">[10]</ref> as well as English and Hindi data. Subtask A and Subtask B of the HASOC 2021 Hate Speech and Offensive Content Identification Challenge <ref type="bibr" target="#b10">[11]</ref> provide training and testing datasets.</p><p>To obtain more data to train the model better, improve its generalization, and reduce the risk of overfitting, I collected the English and Hindi corpora from the HASOC 2019 and HASOC 2020 challenges. After integrating the collected data, the amount of training data for the English and Hindi corpora is 12,035 and 10,215 examples, respectively. Next, I used the shuffle function in the Sklearn package to shuffle the order of the data and finally divided the integrated data into a training dataset and a validation dataset at a 4:1 ratio. Table <ref type="table">1</ref> lists the data volume of the three languages in Subtask A and Subtask B.</p></div>
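The shuffle-and-split step can be sketched as follows. The paper uses Sklearn's shuffle function; this standard-library version (with an illustrative fixed seed for reproducibility) mirrors the same 4:1 division:

```python
import random

def shuffle_and_split(examples, train_ratio=0.8, seed=42):
    """Shuffle the merged examples, then split into train/validation at 4:1."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    data = list(examples)
    rng.shuffle(data)
    cut = int(len(data) * train_ratio)
    return data[:cut], data[cut:]
```

Shuffling before splitting matters here because the merged HASOC 2019/2020/2021 data would otherwise be ordered by source, biasing the validation set.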
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Data pre-process</head><p>The datasets of Subtask A and Subtask B are sampled from Twitter, and the format of the data is informal. Tweets can be long or short, the texts contain many emojis and URL links, and some words are misspelled.</p><p>A pre-processing step is applied to the data so that the model can better extract the information carried by the text, which helps improve the accuracy of the classifier. In this challenge, the methods used include deleting URL links, emoticons, and punctuation marks from the text.</p></div>
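A minimal sketch of such a cleaning step, assuming simple regular expressions for URLs and a rough, deliberately incomplete set of emoji codepoint ranges (the paper does not specify its exact patterns):

```python
import re
import string

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Illustrative emoji/pictograph ranges; real tweets may need a fuller list.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]"
)

def clean_tweet(text):
    """Drop URLs, emojis, and punctuation; collapse repeated whitespace."""
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())
```

For example, `clean_tweet("Check https://t.co/abc now!!")` yields `"Check now"`.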
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">System Description</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Pre-trained Model</head><p>In this hate speech and offensive content detection challenge, I tried six pre-trained models: BERT, ALBERT, multilingual BERT (mBERT), DeBERTa, XLNet, and SqueezeBERT.</p><p>Here is a brief introduction to each pre-trained model.</p><p>The BERT <ref type="bibr" target="#b11">[12]</ref> model is a deep bidirectional model built on the Transformer encoder structure. Its training is divided into a pre-training stage and a fine-tuning stage, and its pre-training tasks are Masked Language Modeling and Next Sentence Prediction.</p><p>ALBERT <ref type="bibr" target="#b12">[13]</ref> uses word-embedding parameter factorization and hidden-layer parameter sharing to reduce the number of model parameters, and replaces the Next Sentence Prediction loss with a Sentence Order Prediction loss. Compared with the BERT model, it therefore has significantly fewer parameters with only a tiny loss in performance.</p><p>mBERT is a cross-lingual <ref type="bibr" target="#b13">[14]</ref> model. It is pre-trained on data in 104 languages, including English, Hindi, and Marathi; texts in different languages share some common word pieces or vocabulary (such as numbers and links).</p><p>The DeBERTa <ref type="bibr" target="#b14">[15]</ref> model enhances BERT in two ways. The first is disentangled attention: each word is encoded with two vectors, for content and position respectively, and the attention weights between words are computed from matrices of their contents and relative positions. The second technique is introducing absolute positions in the decoding layer to predict masked tokens.</p><p>Compared with BERT's masked-language-model objective, XLNet <ref type="bibr" target="#b15">[16]</ref> introduces a new pre-training objective, the Permutation Language Model. In addition, XLNet adopts the Transformer-XL mechanism, so it has an advantage over the BERT model on tasks with long input texts.</p><p>The SqueezeBERT <ref type="bibr" target="#b16">[17]</ref> model applies experience from the computer-vision field to natural language processing tasks: it replaces several operations in the self-attention layers with grouped convolutions. This model achieves high accuracy on the GLUE benchmark.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Common neural network</head><p>This part gives a brief overview of several common neural networks used in the paper.</p><p>A Recurrent Neural Network (RNN) <ref type="bibr" target="#b17">[18]</ref> is a neural network for processing sequence data. The RNN model has a memory function, so it can remember important words in the text.</p><p>Long Short-Term Memory (LSTM) <ref type="bibr" target="#b18">[19]</ref> has the same memory function as an RNN, but it uses a gate mechanism, which mitigates the vanishing-gradient problem to a certain extent. In this paper, the Bidirectional LSTM (BiLSTM) model is used, which can extract contextual information from the text in both directions.</p><p>TextCNN <ref type="bibr" target="#b19">[20]</ref> is a text classification model based on convolutional networks. It passes word vectors through convolution and pooling operations and finally sends the output to a softmax function for classification. The structure of the TextCNN model is relatively simple, it has few parameters, and good results can be achieved by introducing pre-trained word vectors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Integrated</head><p>In the integration process, I did not freeze the weights of the pre-trained model; instead, the pre-trained model was trained jointly with a smaller-scale model (RNN, BiLSTM, or TextCNN).</p><p>Because a pre-trained model is trained on a large amount of data, it produces high-quality word and sentence embedding vectors. Therefore, I add a model such as an RNN, BiLSTM, or TextCNN after the output layer of the pre-trained model to further extract high-dimensional features. The output of this relatively small-scale neural network is then sent to a fully connected layer for classification. The subsequent experimental results show that the accuracy of this method is similar to that of the pre-trained model alone, but the integration strategy increases the Macro F1 value in the official evaluation score by 0.3% to 1%. Many downstream tasks are in fact approached in this way.</p></div>
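The integration described above can be sketched in PyTorch. The class and parameter names below are illustrative, and a toy embedding layer stands in for the pre-trained encoder so the sketch stays self-contained; a real run would plug in a Hugging Face DeBERTa or mBERT model and use its hidden states:

```python
import torch
import torch.nn as nn

class EncoderBiLSTMClassifier(nn.Module):
    """Sketch: a (pre-trained) encoder's token representations feed a small
    BiLSTM head, whose pooled output goes to a fully connected classifier.
    `encoder` may be any module returning (batch, seq_len, hidden)."""
    def __init__(self, encoder, hidden_size, num_classes,
                 lstm_hidden=128, dropout=0.4):
        super().__init__()
        self.encoder = encoder            # NOT frozen: fine-tuned jointly
        self.bilstm = nn.LSTM(hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, x):
        hidden = self.encoder(x)               # (batch, seq, hidden)
        out, _ = self.bilstm(hidden)           # (batch, seq, 2*lstm_hidden)
        pooled = out[:, 0, :]                  # first-token representation
        return self.fc(self.dropout(pooled))   # (batch, num_classes)

# Stand-in encoder so the sketch runs without downloading pre-trained weights.
toy_encoder = nn.Embedding(1000, 64)
model = EncoderBiLSTMClassifier(toy_encoder, hidden_size=64, num_classes=2)
logits = model(torch.randint(0, 1000, (8, 96)))  # batch 8, max length 96
```

Setting `num_classes=2` matches the binary Subtask A head; Subtask B would use a wider final layer.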
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Subtask A Parameters Setting</head><p>In Subtask A, the optimizer is AdamW and the loss function is cross-entropy; the epoch, max length, and batch size parameters are set to 10, 96, and 32, respectively; the dropout, learning rate, and weight decay parameters are set to 0.4, 1e-5, and 1e-2, respectively.</p></div>
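Assuming the models are implemented in PyTorch, the configuration above might look like the following; the stand-in linear classifier and random batch are only there to make the snippet self-contained:

```python
import torch
import torch.nn as nn

# Subtask A hyper-parameters from the paper.
EPOCHS, MAX_LENGTH, BATCH_SIZE = 10, 96, 32
LR, WEIGHT_DECAY, DROPOUT = 1e-5, 1e-2, 0.4

# Stand-in classifier; the real model is the fine-tuned transformer network.
model = nn.Sequential(nn.Dropout(DROPOUT), nn.Linear(768, 2))
optimizer = torch.optim.AdamW(model.parameters(),
                              lr=LR, weight_decay=WEIGHT_DECAY)
loss_fn = nn.CrossEntropyLoss()

# One illustrative optimization step on random features and labels.
features = torch.randn(BATCH_SIZE, 768)
labels = torch.randint(0, 2, (BATCH_SIZE,))
loss = loss_fn(model(features), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```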
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Subtask A</head><p>First, I used the six pre-trained models to conduct a binary text classification experiment under the parameters set above. These pre-trained models are BERT, ALBERT, mBERT, DeBERTa, XLNet, and SqueezeBERT. Table <ref type="table" target="#tab_0">2</ref> lists the performance of each pre-trained model on the validation datasets of the three language corpora.</p><p>The experimental results show that among the six pre-trained models, the DeBERTa model performs best on the English corpus, and the mBERT model achieves the highest accuracy on the binary classification task for the Hindi and Marathi corpora. In the classification experiments of Subtask A, all six models used a learning rate of 1e-5 in the training phase. Next, I trained the DeBERTa+BiLSTM model separately with the common learning rates 1e-6, 5e-6, 1e-5, 3e-5, and 5e-5, keeping the remaining parameters unchanged. Table <ref type="table" target="#tab_1">3</ref> lists the scores of the models trained with these five learning rates on the validation dataset. The results show that the DeBERTa+BiLSTM model obtains the best performance on this challenge's dataset with a 5e-5 learning rate, so in the subsequent experiments of Subtask A and Subtask B, I used the 5e-5 learning rate.</p><p>Next, I integrated 4-layer RNN, 2-layer BiLSTM, and TextCNN models after the pre-trained model with the highest score on each language's validation dataset. In these experiments, the learning rate is set to 5e-5, and the other parameters are unchanged. Finally, the output of the RNN, BiLSTM, or TextCNN model is sent to the fully connected layer for classification. The final models are obtained after DeBERTa and mBERT are integrated with the RNN, BiLSTM, or TextCNN models. 
Tables <ref type="table" target="#tab_2">4</ref> and 5 list the scores of the final models on their respective validation datasets.</p><p>From the experimental results listed in Table <ref type="table" target="#tab_2">4</ref>, it can be seen that the DeBERTa+BiLSTM model performs best on the validation dataset of the English corpus, with both accuracy and Macro F1 improved compared to the DeBERTa model alone. Therefore, I use the predictions of the DeBERTa+BiLSTM model as the final submission for English Subtask A. Similarly, Table <ref type="table" target="#tab_3">5</ref> shows that the mBERT+TextCNN model performs best on the validation datasets of the Hindi and Marathi corpora. So, I use the predictions of the mBERT+TextCNN model on the Hindi and Marathi test datasets as the final answers to Hindi Subtask A and Marathi Subtask A. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Subtask B</head><p>The experimental results of Subtask A show that the DeBERTa+BiLSTM and mBERT+TextCNN models perform best on the validation datasets of the English and Hindi corpora, respectively. Therefore, I also use these two models on the English and Hindi corpora in Subtask B. The difference from Subtask A is that the fully connected layer of Subtask B outputs a batch_size * 4 matrix, while that of Subtask A outputs a batch_size * 2 matrix. Tables 6 and 7 list the scores of the DeBERTa+BiLSTM and mBERT+TextCNN models on the validation datasets of English Subtask B and Hindi Subtask B, respectively. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>On Subtask A and Subtask B, among all teams, the solutions I put forward in the paper are ranked fourteenth and fifteenth on the English corpus, ranked sixth and fifth on the Hindi corpus, and ranked eleventh in the Marathi corpus.   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper, I describe the solution I proposed for the HASOC 2021 challenge, including the pre-processing of the data before training, the selection of the learning rate, and the construction of the final models. This challenge mainly comprises text classification tasks in English, Hindi, and Marathi. I solved the classification problem on the English corpus by integrating the DeBERTa and BiLSTM models, and the classification problem on the Hindi and Marathi corpora by integrating the mBERT and TextCNN models. The difficulties of Subtask A and Subtask B in this challenge are as follows. First, the text content is informal, the texts are generally short, and they lack context, so it is difficult to obtain very high accuracy. Secondly, during the experiments, I found that the class distribution in Subtask A and Subtask B is not uniform, which biases the model toward predicting the more common categories. Finally, the amount of data in this challenge is not sufficient for large models like BERT, which leads to overfitting and makes the model perform poorly on the testing dataset. 
In future research work, I will try to use a variety of fine-tuning strategies <ref type="bibr" target="#b20">[21]</ref>, or the idea of transfer learning <ref type="bibr" target="#b21">[22]</ref> to continue to improve my solution.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2</head><label>2</label><figDesc>Evaluation results of six pre-trained models on subtask A's Validation datasets.</figDesc><table><row><cell>Model</cell><cell cols="3">Validation Accuracy English Hindi Marathi</cell></row><row><cell>ALBERT</cell><cell>0.8064</cell><cell>0.6754</cell><cell>0.6905</cell></row><row><cell>BERT</cell><cell>0.8092</cell><cell>0.7911</cell><cell>0.7872</cell></row><row><cell>DeBERTa</cell><cell cols="2">0.8212 0.7533</cell><cell>0.7083</cell></row><row><cell>mBERT</cell><cell cols="3">0.8122 0.8034 0.8908</cell></row><row><cell>XLNet</cell><cell>0.8094</cell><cell>0.6825</cell><cell>0.6899</cell></row><row><cell cols="2">SqueezeBERT 0.8055</cell><cell>0.7588</cell><cell>0.7542</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3</head><label>3</label><figDesc>The evaluation results of the DeBERTa + BiLSTM models trained with five common learning rates on the subtask A's Validation Dataset of the English corpus.</figDesc><table><row><cell>Learning Rate</cell><cell>1e-6</cell><cell>5e-6</cell><cell>1e-5</cell><cell>3e-5</cell><cell>5e-5</cell></row><row><cell>Accuracy</cell><cell cols="5">0.8143 0.8149 0.8201 0.8194 0.8221</cell></row><row><cell>Macro F1</cell><cell cols="5">0.8093 0.8088 0.8024 0.8019 0.8182</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 4</head><label>4</label><figDesc>The final models are obtained after the DeBERTa model integrates RNN, BiLSTM, and TextCNN models respectively, and the evaluation result of the final models on the English corpus Validation Dataset.</figDesc><table><row><cell>Model</cell><cell cols="2">English Validation Dataset Accuracy Macro F1</cell></row><row><cell>DeBERTa+RNN</cell><cell>0.8163</cell><cell>0.8020</cell></row><row><cell>DeBERTa+BiLSTM</cell><cell>0.8201</cell><cell>0.8024</cell></row><row><cell>DeBERTa+TextCNN</cell><cell>0.8188</cell><cell>0.8059</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 5</head><label>5</label><figDesc>The final models are obtained after the mBERT model integrates RNN, BiLSTM, and TextCNN models respectively, and the scores of the final models on the Validation datasets of Hindi and Marathi corpus.</figDesc><table><row><cell></cell><cell cols="4">Hindi Validation Dataset Marathi Validation Dataset</cell></row><row><cell>Model</cell><cell>Accuracy</cell><cell cols="2">Macro F1 Accuracy</cell><cell>Macro F1</cell></row><row><cell>mBERT+RNN</cell><cell>0.8101</cell><cell>0.7906</cell><cell>0.8866</cell><cell>0.8712</cell></row><row><cell>mBERT+BiLSTM</cell><cell>0.8112</cell><cell>0.7951</cell><cell>0.8916</cell><cell>0.8748</cell></row><row><cell>mBERT+TextCNN</cell><cell>0.8124</cell><cell>0.7964</cell><cell>0.8980</cell><cell>0.8831</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 6</head><label>6</label><figDesc>DeBERTa+ BiLSTM model's accuracy and Macro F1 score on the Validation Dataset of the English corpus of Subtask B.</figDesc><table><row><cell>Model</cell><cell cols="2">English Validation Dataset Accuracy Macro F1</cell></row><row><cell>DeBERTa</cell><cell>0.7291</cell><cell>0.5968</cell></row><row><cell>DeBERTa+BiLSTM</cell><cell>0.7356</cell><cell>0.6051</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head></head><label></label><figDesc>Table 8 lists the best scores on the corpus of each language in Subtask A and Subtask B in the HASOC (2021) Challenge, as well as the official evaluation results of the answers I finally submitted.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 7</head><label>7</label><figDesc>mBERT+TextCNN model's accuracy and Macro F1 score on the Hindi Validation Dataset of Subtask B.</figDesc><table><row><cell>Model</cell><cell cols="2">Hindi Validation Dataset Accuracy Macro F1</cell></row><row><cell>mBERT</cell><cell>0.7266</cell><cell>0.5911</cell></row><row><cell>mBERT+TextCNN</cell><cell>0.7337</cell><cell>0.5954</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 8</head><label>8</label><figDesc>After the official evaluation, my final score on each task, my ranking, and the best result of each task.</figDesc><table><row><cell>Subtask</cell><cell cols="2">Macro F1 Best Score My Score</cell><cell>My Rank</cell></row><row><cell>English Subtask A</cell><cell>0.8305</cell><cell>0.8030</cell><cell>6 / 56</cell></row><row><cell>English Subtask B</cell><cell>0.6657</cell><cell>0.6116</cell><cell>15 / 37</cell></row><row><cell>Hindi Subtask A</cell><cell>0.7825</cell><cell>0.7725</cell><cell>6 / 34</cell></row><row><cell>Hindi Subtask B</cell><cell>0.5603</cell><cell>0.5509</cell><cell>5 / 24</cell></row><row><cell>Marathi Subtask A</cell><cell>0.9144</cell><cell>0.8611</cell><cell>11 / 25</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A survey on automatic detection of hate speech in text</title>
		<author>
			<persName><forename type="first">P</forename><surname>Fortuna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nunes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Mandlia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Patel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 11th forum for information retrieval evaluation</title>
				<meeting>the 11th forum for information retrieval evaluation</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="14" to="17" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Jaiswal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nandini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schäfer</surname></persName>
		</author>
		<idno>CoRR abs/2108.05927</idno>
		<ptr target="https://arxiv.org/abs/2108.05927" />
		<title level="m">Overview of the HASOC track at FIRE 2020: Hate speech and offensive content identification in indo-european languages</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rosenthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Farra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1903.08983</idno>
		<title level="m">Semeval-2019 task 6: Identifying and categorizing offensive language in social media (offenseval)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rosenthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Atanasova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karadzhov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mubarak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Derczynski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Pitenis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ç</forename><surname>Çöltekin</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2006.07235</idno>
		<title level="m">Semeval-2020 task 12: Multilingual offensive language identification in social media (offenseval 2020)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Bagging bert models for robust aggression identification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Risch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Krestel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying</title>
				<meeting>the Second Workshop on Trolling, Aggression and Cyberbullying</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="55" to="61" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">3idiots at hasoc 2019: Fine-tuning transformer neural networks for hate speech identification in indo-european languages</title>
		<author>
			<persName><forename type="first">S</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mishra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">FIRE (Working Notes)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="208" to="213" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Improving arabic text categorization using transformer training diversification</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Chowdhury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abdelali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Darwish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Soon-Gyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Salminen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Jansen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth Arabic Natural Language Processing Workshop</title>
				<meeting>the Fifth Arabic Natural Language Processing Workshop</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="226" to="236" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Madhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Majumder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schäfer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nandini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename></persName>
		</author>
		<ptr target="http://ceur-ws.org/" />
	</analytic>
	<monogr>
		<title level="m">Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Cross-lingual offensive language identification for low resource languages: The case of marathi</title>
		<author>
			<persName><forename type="first">S</forename><surname>Gaikwad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">M</forename><surname>Homan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of RANLP</title>
				<meeting>RANLP</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech</title>
		<author>
			<persName><forename type="first">S</forename><surname>Modha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Madhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satapara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ranasinghe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zampieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2021-12">December 2021</date>
			<biblScope unit="page" from="13" to="17" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Lan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Goodman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gimpel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Soricut</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.11942</idno>
		<title level="m">ALBERT: A Lite BERT for self-supervised learning of language representations</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Chi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singhal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X.-L</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2007.07834</idno>
		<title level="m">InfoXLM: An information-theoretic framework for cross-lingual language model pre-training</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2006.03654</idno>
		<title level="m">DeBERTa: Decoding-enhanced BERT with disentangled attention</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">XLNet: Generalized autoregressive pretraining for language understanding</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Carbonell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">SqueezeBERT: What can computer vision teach NLP about efficient neural networks?</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">N</forename><surname>Iandola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">E</forename><surname>Shaw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Krishna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">W</forename><surname>Keutzer</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2006.11316</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Merriënboer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gulcehre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bougares</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schwenk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1406.1078</idno>
		<title level="m">Learning phrase representations using RNN encoder-decoder for statistical machine translation</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">LSTM: A search space odyssey</title>
		<author>
			<persName><forename type="first">K</forename><surname>Greff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koutník</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">R</forename><surname>Steunebrink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks and Learning Systems</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="2222" to="2232" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Lei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Barzilay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Jaakkola</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1508.04112</idno>
		<title level="m">Molding CNNs for text: non-linear, non-consecutive convolutions</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">An autoregulated fine-tuning strategy for titer improvement of secondary metabolites using native promoters in Streptomyces</title>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Xiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACS Synthetic Biology</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="522" to="530" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Reducing overfitting in diabetic retinopathy detection using transfer learning</title>
		<author>
			<persName><forename type="first">N</forename><surname>Barhate</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bhave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bhise</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">G</forename><surname>Sutar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">C</forename><surname>Karia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 IEEE 5th International Conference on Computing, Communication and Automation (ICCCA)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="298" to="301" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
