<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Camilla</forename><surname>Casula</surname></persName>
							<email>ccasula@fbk.eu</email>
							<affiliation key="aff0">
								<orgName type="institution">Fondazione Bruno Kessler Trento</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sara</forename><surname>Tonelli</surname></persName>
							<email>satonelli@fbk.eu</email>
							<affiliation key="aff1">
								<orgName type="institution">Fondazione Bruno Kessler Trento</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">34E7D277EE54024013D06AEF4C01A4D6</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T15:40+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>While using machine-translated data for supervised training can alleviate data sparseness problems when dealing with less-resourced languages, it is important that the source data are not only correctly translated, but also follow the same annotation scheme and possibly class balance as the smaller dataset in the target language. We therefore present an evaluation of hate speech detection in Italian using machine-translated data from English and comparing three settings, in order to understand the impact of training size, class distribution and annotation scheme.<ref type="foot" target="#foot_0">1</ref></p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The task of detecting hate speech on social media has been attracting increasing attention due to the negative effects this phenomenon can have on online communities and society as a whole. The development of systems which can effectively detect hate speech has therefore become increasingly important for academics and tech companies alike.</p><p>One of the difficulties of producing accurate hate speech detection systems is the need for large, high-quality datasets, the creation of which is time- and resource-consuming. English can count on the highest number of hate speech detection datasets, as well as the ones with the largest sizes, with up to 150k posts for a single dataset <ref type="bibr" target="#b9">(Gomez et al., 2020)</ref>. Other languages such as Italian, on the other hand, can count on fewer datasets, which tend to be smaller <ref type="bibr" target="#b19">(Vidgen and Derczynski, 2020)</ref>. Given that machine learning methods are typically used for this task, the use of small datasets can lead to overfitting problems due to the lack of linguistic variation <ref type="bibr" target="#b19">(Vidgen and Derczynski, 2020)</ref>.</p><p>One possible solution to alleviate data sparseness is the use of machine-translated data from English to less-resourced languages for training classifiers, exploiting the large amount of data available for English. This has already been used in the context of hate speech detection <ref type="bibr" target="#b17">(Sohn and Lee, 2019;</ref><ref type="bibr" target="#b3">Casula et al., 2020)</ref> but results have not been consistent across languages.</p><p>An additional issue is the fact that there is no shared fixed definition within the NLP community of what type of language constitutes hate speech. 
Indeed, there are typically large differences among hate speech and abusive language datasets in terms of annotation frameworks and their applications in practice <ref type="bibr" target="#b2">(Caselli et al., 2020)</ref>. In addition to this, there can be large variations between datasets in terms of size and class balance. Possible issues affecting the behaviour of classifiers trained on machine-translated data, such as different class distributions in the source and target languages, or different annotation schemes, have not been analysed.</p><p>In order to fill this gap, we explore the impact of these differences between datasets when performing hate speech detection in Italian using machine-translated data from English. Our goal is to address the following three questions:</p><p>• What performance can we expect by using only machine-translated data, given that translation quality for social media language may be problematic?</p><p>• Is it better to use a larger translated set for training, even by merging slightly different classes, or a smaller, more precise one?</p><p>• What is the impact of class imbalance, and to what extent can undersampling be effective?</p><p>The above questions are addressed by comparing three experimental settings that are described in Section 4 and evaluated in Section 5.</p><p>In recent years, the number of research works focused on the detection of hate speech on social media has increased remarkably, mostly due to the growing awareness regarding the societal impact these platforms can have.</p><p>Computational methods for detecting the presence of hate speech on the web have become necessary due to the extremely large amounts of user-generated content being posted each day. These methods typically rely on supervised learning, in the form of both traditional machine learning (e.g. support vector classifiers) and deep learning approaches <ref type="bibr" target="#b16">(Schmidt and Wiegand, 2017)</ref>. 
Given the increased attention towards this topic, more and more shared tasks regarding hate speech and abusive language detection have emerged, such as the HaSpeeDe task at Evalita 2018 <ref type="bibr" target="#b1">(Bosco et al., 2018)</ref>, OffensEval <ref type="bibr" target="#b22">(Zampieri et al., 2019)</ref> and HatEval <ref type="bibr" target="#b0">(Basile et al., 2019)</ref> at SemEval 2019, and the multilingual OffensEval at SemEval 2020 <ref type="bibr" target="#b23">(Zampieri et al., 2020)</ref>.</p><p>Systems based on Transformer architectures such as BERT <ref type="bibr" target="#b5">(Devlin et al., 2019)</ref> have proven effective for hate speech detection and classification in both English <ref type="bibr" target="#b22">(Zampieri et al., 2019)</ref> and Italian <ref type="bibr" target="#b13">(Polignano et al., 2019a)</ref>. These systems are generally pre-trained on large unlabeled corpora through two self-supervised tasks (next sentence prediction and masked language modeling) to create language models which can then be fine-tuned to a variety of downstream tasks using labeled data.</p><p>AlBERTo <ref type="bibr" target="#b14">(Polignano et al., 2019b)</ref> is a BERT-based system which was pre-trained on Italian Twitter data, and it currently defines the state of the art for hate speech detection in Italian <ref type="bibr" target="#b13">(Polignano et al., 2019a)</ref>.</p><p>Recently, more attention has been directed towards the quality of hate and abuse detection systems. <ref type="bibr" target="#b20">Vidgen et al. (2019)</ref> investigate the flaws presented by most abusive language detection datasets in circulation: they can contain systematic biases towards certain types and targets of abuse, they are subject to degradation over time, they typically present very low inter-annotator agreement, and they can vary greatly with respect to quality, size, and class balance. 
<ref type="bibr" target="#b19">Vidgen and Derczynski (2020)</ref> further analyse the role of datasets in the detection of abuse, addressing issues such as the use of different task descriptions and annotation schemes across corpora, as well as similar annotation schemes being applied in different ways.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Data</head><p>Since tweets containing hate speech or abusive language constitute a very small subset (between 0.1% and 3% depending on the label used) of all tweets being posted <ref type="bibr" target="#b7">(Founta et al., 2018)</ref>, random samples are generally not used for annotation, because the final datasets would contain an extremely low number of positive class examples, which would make classification difficult. The typical solution to this is to preselect posts that are likely to contain hateful language by searching for specific hate-related keywords. While this method is effective for gathering more instances of hate speech, it can make datasets biased, which is a main issue in hate speech datasets <ref type="bibr" target="#b21">(Wiegand et al., 2019)</ref>.</p><p>The dataset we chose for training our system is described in <ref type="bibr" target="#b7">Founta et al. (2018)</ref>. This dataset was not created starting from a set of predefined offensive terms or hashtags in order to reduce bias, which was an important factor in our choice. The method used by <ref type="bibr" target="#b7">Founta et al. (2018)</ref> to increase the percentage of hateful/abusive tweets is boosted random sampling, in which a portion of the dataset is "boosted" with tweets that are more likely to belong in the minority classes. The boosted set of tweets is created using text analysis and machine learning <ref type="bibr" target="#b7">(Founta et al., 2018)</ref>.</p><p>The dataset was annotated through crowdsourcing using the labels hateful, abusive, spam, and normal. The definition of hate speech given by <ref type="bibr" target="#b7">Founta et al. 
(2018)</ref> to the annotators, based on existing literature on the topic, is:</p><p>Hate Speech: Language used to express hatred towards a targeted individual or group, or is intended to be derogatory, to humiliate, or to insult the members of the group, on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, or gender.</p><p>The abusive label, on the other hand, is the result of three separate labels (abusive, offensive, and aggressive) being combined. In preliminary annotation rounds, <ref type="bibr" target="#b7">Founta et al. (2018)</ref> found that these three labels were significantly correlated, so they grouped them together. The definition of abusive language given to the annotators is:</p><p>Abusive Language: Any strongly impolite, rude or hurtful language using profanity, that can show a debasement of someone or something, or show intense emotion.</p><p>While the <ref type="bibr" target="#b7">Founta et al. (2018)</ref> dataset originally comprised 80k tweets, Twitter datasets are often subject to degradation, as tweets are removed over time and can no longer be retrieved through their tweet IDs <ref type="bibr" target="#b20">(Vidgen et al., 2019)</ref>. After retrieving all available tweets and removing those annotated as spam, the total number of tweets we use for training is 12,379, of which 727 are annotated as hateful and 1,792 as abusive. Before translating the data into Italian, we preprocess it using the Ekphrasis tool<ref type="foot" target="#foot_2">2</ref> to tokenise the text and normalise user mentions and URLs (replaced by &lt;user&gt; and &lt;url&gt; respectively), as well as numbers, which are substituted with a number tag. 
We then use the Google Translate API to translate the data into Italian, in order to use it as training data for our classifier.</p><p>For testing, we use the test portion of the Twitter dataset used in the Hate Speech Detection (HaSpeeDe) task at Evalita 2018 <ref type="bibr" target="#b1">(Bosco et al., 2018)</ref>, consisting of 1,000 Italian tweets manually annotated for hate speech against immigrants. This dataset is a simplified version of the dataset described in <ref type="bibr" target="#b15">Sanguinetti et al. (2018)</ref>, in which more fine-grained labels are used.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental Setup</head><p>We experiment with the fine-tuning of AlBERTo <ref type="bibr" target="#b14">(Polignano et al., 2019b)</ref>, a BERT-based language model pre-trained on Italian Twitter data, using data that was automatically translated from English. This model has achieved state-of-the-art results when fine-tuned on the training data from the HaSpeeDe task at Evalita 2018 <ref type="bibr" target="#b13">(Polignano et al., 2019a)</ref>.</p><p>Our goal is to explore the impact of different annotation schemes and class balance when using machine-translated data for hate speech detection. Indeed, merging fine-grained classes into coarser ones has been a common and accepted practice when creating larger training sets from a smaller one (e.g. <ref type="bibr">Founta et al. (2019)</ref>). This step has also been performed to compare classification in different languages <ref type="bibr" target="#b4">(Corazza et al., 2020)</ref>.</p><p>In order to investigate this, we compare three different experimental settings. In the first one, we fine-tune AlBERTo on the translated tweets in <ref type="bibr" target="#b7">Founta et al. (2018)</ref> after merging the hateful and abusive classes together, mapping them to a single hateful class as required by the binary classification task at Evalita 2018. In a second setting, AlBERTo is fine-tuned on the hateful class alone, discarding all tweets annotated as abusive in <ref type="bibr" target="#b7">Founta et al. (2018)</ref>. We hypothesize this setting may perform better when tested on the HaSpeeDe data, given the higher similarity in annotation framework.</p><p>Simply removing tweets annotated as abusive, however, can throw off the balance between classes. 
More specifically, when training the system on both abusive and hateful tweets the hateful+abusive class constitutes about 20% of our data, while when we only use tweets annotated as hateful this percentage drops to 7%, potentially affecting classification results. In particular, the data we use for testing has a different class balance, with 30% of tweets marked as hateful. In order to assess the impact of class imbalance on our results, we further evaluate each setting using undersampling <ref type="bibr" target="#b10">(Kubat, 2000;</ref><ref type="bibr" target="#b18">Sun et al., 2009)</ref>, a technique typically used for imbalanced classification, in which we reduce the number of tweets belonging to the majority class, so that the overall percentage of tweets containing hate increases.</p><p>Given that undersampling our data reduces the total number of tweets available for training, the resulting datasets for each annotation scheme differ considerably in size. We therefore consider a third setting, in which we use further random undersampling <ref type="bibr" target="#b10">(Kubat, 2000;</ref><ref type="bibr" target="#b18">Sun et al., 2009)</ref> to match the larger dataset (hateful+abusive) with the smaller one (hateful only), so that the two annotations can be effectively compared in a setting with equal class balance and sample size.</p><p>In summary, the three data settings we train our system on are:</p><p>1. Hateful and abusive tweets, using undersampling to progressively lower class imbalance;</p><p>2. Hateful only tweets, again using undersampling to progressively lower class imbalance;</p><p>3. 
Hateful and abusive tweets, both using undersampling to progressively lower class imbalance as in the previous settings, and using further random undersampling to match the low sample sizes of setting 2.</p><p>Our AlBERTo fine-tuning architecture consists of a pooling layer for extracting the AlBERTo hidden representation for each sequence, followed by a dropout layer (dropout rate 0.2), two dense layers of size 768 and 128 and, finally, a softmax layer. We use L2 regularization (λ=0.01), Adam optimizer (2e-5 learning rate), and categorical cross-entropy loss. We train the system for 5 epochs with batch size 32.</p></div>
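<div xmlns="http://www.tei-c.org/ns/1.0"><p>The undersampling step described in this section can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed random seed, and the round-half-up convention for the target size are assumptions, the last one chosen because it reproduces the training set sizes listed in Tables 1 and 2 from the class counts given in Section 3.</p>
<p><code lang="python">
```python
import random


def target_size(n_minority, minority_pct):
    """Total dataset size at which n_minority items make up minority_pct percent.

    Uses integer arithmetic with round-half-up (an assumed convention).
    """
    return (n_minority * 100 + minority_pct // 2) // minority_pct


def undersample(majority, minority, minority_pct, seed=0):
    """Keep every minority item and a random subset of the majority items."""
    n_majority = target_size(len(minority), minority_pct) - len(minority)
    n_majority = min(n_majority, len(majority))  # never request more than exist
    rng = random.Random(seed)
    return rng.sample(majority, n_majority) + list(minority)


# Class counts from Section 3: 12,379 tweets, 727 hateful, 1,792 abusive.
# The tuples are placeholders standing in for translated tweets.
normal = [("normal", i) for i in range(12379 - 727 - 1792)]
hateful = [("hateful", i) for i in range(727)]
abusive = [("abusive", i) for i in range(1792)]

setting1_30 = undersample(normal, hateful + abusive, 30)  # Setting 1 at 30%
setting2_30 = undersample(normal, hateful, 30)            # Setting 2 at 30%
print(len(setting1_30), len(setting2_30))  # 8397 2423
```
</code></p>
<p>Under these assumptions, only the majority (normal) class is ever reduced, so every hateful or abusive tweet is retained at each class balance. Setting 3 would additionally draw a random subset of the Setting 1 data of the same size as the corresponding Setting 2 data.</p></div>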
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Results and Discussion</head><p>We measure the classification results using both macro-F1 score and minority class F1 score. We repeat each run five times in order to compensate for random initialization, and we report the average scores of these runs.</p></div>
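<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a worked illustration of the two metrics, the sketch below computes per-class and macro-averaged F1 in plain Python and applies them to a classifier that always predicts the majority class. The 300/700 split is an assumption based on the roughly 30% hateful share of the 1,000 test tweets mentioned in Section 4, and the function names are illustrative rather than taken from the authors' code.</p>
<p><code lang="python">
```python
def f1_for_class(gold, pred, cls):
    """F1 score of a single class, defined as 0.0 when there are no true positives."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == cls and p == cls)
    fp = sum(1 for g, p in pairs if g != cls and p == cls)
    fn = sum(1 for g, p in pairs if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)


# Assumed test set: 300 hateful (1) and 700 non-hateful (0) tweets.
gold = [1] * 300 + [0] * 700
# A classifier that labels every tweet as non-hateful.
pred = [0] * 1000

hate_f1 = f1_for_class(gold, pred, 1)
macro = macro_f1(gold, pred)
print(hate_f1, round(macro, 2))  # 0.0 0.41
```
</code></p>
<p>With this assumed split, the majority-class predictor obtains a hate class F1 of 0 and a macro-F1 of about 0.41, in line with the 0.40 reported for runs in which every tweet is classified as non-hateful. Averaging over five runs, as done here, would simply mean taking the mean of these scores across runs.</p></div>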
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Setting 1: Hateful + Abusive Tweets</head><p>The classification results obtained when fine-tuning AlBERTo on both abusive and hateful tweets combined can be observed in Table <ref type="table" target="#tab_0">1</ref>. The class balance of the dataset prior to undersampling is 20% hateful + abusive tweets and 80% non-hateful, which amounts to 12,379 tweets in total. With this class balance, the system performs the worst, classifying every tweet as belonging to the majority non-hateful class. On the other hand, with a higher percentage of minority class instances, the classification results improve, in spite of the considerably smaller amount of training data available. These results suggest that consistency in class balance can play a bigger role than training data size in classification results in this context.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Setting 2: Hateful Only Tweets</head><p>The performance of the system when fine-tuned on tweets labeled as hateful only is reported in Table <ref type="table" target="#tab_1">2</ref>. As previously mentioned, only 7% of tweets in the dataset we use are labeled as hateful. The classes are therefore extremely imbalanced before undersampling. Predictably, with the classes being this imbalanced, the system identifies all test instances as belonging to the majority class. This again happens with the minority class comprising 20% of the training data. Similarly to Setting 1, the best classification performance in this case is achieved with 30% of minority class tweets. Interestingly, the best performance is comparable to the one obtained in Setting 1, even though in this case the number of training samples available is much lower, suggesting that more task-specific training instances can impact performance. We can note a difference with the minority class at 40% of total data, in which the performance drops in terms of macro-F1 score, likely due to the very small number of samples available for training and the consequent lack of linguistic variation. The hate class F1 score, however, remains stable.</p><p>State-of-the-art results obtained by fine-tuning AlBERTo on the same Evalita dataset as reported in <ref type="bibr" target="#b13">Polignano et al. (2019a)</ref> reach 0.80 macro-F1 and 0.73 F1 on the hate class, which we can consider an upper-bound for our task, obtained in a fully-supervised monolingual setting. On the other hand, the most frequent label baseline is 0.40 macro-F1, which is clearly outperformed using only machine-translated data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Setting 3: Hateful + Abusive Tweets (Random Undersampling)</head><p>Since there are large differences in size between the hateful+abusive annotation and the hateful-only annotation, we randomly undersample the hateful+abusive training data so that it matches the size of the hateful-only training data, in order to allow us to effectively compare the impact of each annotation framework on our results. The classification performance is reported in Table <ref type="table" target="#tab_3">3</ref>.</p><p>If we compare the results of Setting 3 with those of Setting 2, it is clear that using more task-specific data, in this case hateful-only tweets, can lead to a larger improvement in performance when the amount of training data is the same. This suggests that consistency in annotation between training and test data can have a positive impact on classification, although it is not essential for hate speech detection with machine-translated data. In fact, other aspects such as class balance can also play an important role.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4">Qualitative Analysis</head><p>Another aspect affecting classification, which we have not considered so far, is the quality of machine translation, a particularly challenging task on social media data <ref type="bibr" target="#b11">(Michel and Neubig, 2018)</ref>. In order to assess the impact of translation quality on our results, two annotators with linguistic background manually analysed 500 samples from the training data, consisting of 300 tweets annotated as normal, 100 as hateful, and 100 as abusive. Each annotator manually checked 250 random tweets from this sample. Translation quality was evaluated using the semantic adequacy annotation scheme proposed in <ref type="bibr">Dorr et al. (2011, p. 807)</ref>. Translations are judged on a scale between -3 and 3, with scores below 0 for inadequate translations and above 0 for adequate ones. The averaged annotations for each class are reported in Table <ref type="table" target="#tab_4">4</ref>. Overall, translations tend towards adequacy, but the average scores are below 1 for all classes. Interestingly, tweets annotated as abusive show poorer translation quality than other classes. This could help explain the small differences in classification performance between our experiments.</p><p>A major role is played in this context by profanities, which are often used to offend a target but can also appear in non-derogatory messages exchanged among members of the same community <ref type="bibr" target="#b12">(Pamungkas et al., 2020)</ref>. In the case of abusive tweets, we observe that the offenses are less direct and therefore slurs tend to be translated poorly. See for example the following sentence, which is labeled as abusive in the <ref type="bibr" target="#b7">Founta et al. (2018)</ref> dataset:</p><p>(1) use that ugly ass design [...] utilizzare quel disegno asino brutto [...] 
use that design donkey ugly [...]</p><p>Here, "ass" is translated with "asino" ("donkey"), effectively removing the profanity in the translated tweet and completely changing the meaning of the message.</p><p>On the other hand, when profanities are used in a more direct way, or when they are expressed through unambiguous words such as "idiot" and "stupid", they tend to be translated correctly, contributing to a correct classification. Example 2 shows a hateful tweet which was translated almost correctly, retaining its offensiveness in the target language.</p><p>(2) what happens when you put idiots in charge cosa succede quando si mette idioti in carica</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusions</head><p>In this paper we analysed the impact of machine-translated data on Italian hate speech detection in a zero-shot setting. Our experiments show that when using machine-translated data for training it is possible to learn a classification model that clearly outperforms the most-frequent-label baseline, even if translation quality is affected by the jargon used in social media data. We found that using more task-specific data can have a positive impact on classification performance even with lower sample sizes compared to larger, less targeted datasets.</p><p>Consistency in class distribution of training and test data can have a bigger impact than the size of the training set or the annotation scheme. Indeed, when only the original training set translated into Italian is used, without undersampling, classification performance is poor.</p><p>In the future, we plan to extend this kind of evaluation to new language pairs and new datasets, to check whether the findings obtained on the English-Italian pair are also confirmed with other languages.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Scores obtained when fine-tuning AlBERTo on both hateful and abusive tweets.</figDesc><table><row><cell></cell><cell cols="2">Setting 1: Hateful + abusive</cell><cell></cell></row><row><cell>% hate</cell><cell>Size (tweets)</cell><cell>Macro-F1</cell><cell>Hate class F1</cell></row><row><cell>20%</cell><cell>12,379</cell><cell>0.40</cell><cell>0</cell></row><row><cell>30%</cell><cell>8,397</cell><cell>0.64</cell><cell>0.52</cell></row><row><cell>40%</cell><cell>6,298</cell><cell>0.63</cell><cell>0.57</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Scores obtained when fine-tuning AlBERTo on tweets labeled as hateful only.</figDesc><table><row><cell></cell><cell cols="2">Setting 2: Hateful only</cell><cell></cell></row><row><cell>% hate</cell><cell>Size (tweets)</cell><cell>Macro-F1</cell><cell>Hate class F1</cell></row><row><cell>7%</cell><cell>10,587</cell><cell>0.40</cell><cell>0</cell></row><row><cell>20%</cell><cell>3,635</cell><cell>0.40</cell><cell>0</cell></row><row><cell>30%</cell><cell>2,423</cell><cell>0.65</cell><cell>0.54</cell></row><row><cell>40%</cell><cell>1,818</cell><cell>0.52</cell><cell>0.56</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Scores obtained when fine-tuning AlBERTo on tweets labeled as hateful and abusive, after random undersampling.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4 :</head><label>4</label><figDesc>Average translation quality scores.</figDesc><table><row><cell></cell><cell>Normal</cell><cell>Hateful</cell><cell>Abusive</cell><cell>Overall</cell></row><row><cell>Average</cell><cell>0.438</cell><cell>0.527</cell><cell>-0.043</cell><cell>0.368</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2">https://github.com/cbaziotis/ekphrasis</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter</title>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Debora</forename><surname>Nozza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francisco</forename><forename type="middle">Manuel</forename><surname>Rangel Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manuela</forename><surname>Sanguinetti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Workshop on Semantic Evaluation</title>
				<meeting>the 13th International Workshop on Semantic Evaluation<address><addrLine>Minneapolis, Minnesota, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019-06">2019. June</date>
			<biblScope unit="page" from="54" to="63" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the EVALITA 2018 hate speech detection task</title>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Felice</forename><surname>Dell'Orletta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabio</forename><surname>Poletto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manuela</forename><surname>Sanguinetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maurizio</forename><surname>Tesconi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EVALITA 2018-Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
				<meeting><address><addrLine>Turin, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">2263</biblScope>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">I feel offended, don&apos;t be abusive! implicit/explicit messages in offensive and abusive language</title>
		<author>
			<persName><forename type="first">Tommaso</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jelena</forename><surname>Mitrovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Inga</forename><surname>Kartoziya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Granitzer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of The 12th Language Resources and Evaluation Conference, LREC 2020</title>
				<editor>
			<persName><forename type="first">Nicoletta</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Frédéric</forename><surname>Béchet</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Philippe</forename><surname>Blache</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Khalid</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Christopher</forename><surname>Cieri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Thierry</forename><surname>Declerck</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Sara</forename><surname>Goggi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Hitoshi</forename><surname>Isahara</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Bente</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Joseph</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Hélène</forename><surname>Mazo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Asunción</forename><surname>Moreno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Jan</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Stelios</forename><surname>Piperidis</surname></persName>
		</editor>
		<meeting>The 12th Language Resources and Evaluation Conference, LREC 2020<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020-05-11">2020. May 11-16, 2020</date>
			<biblScope unit="page" from="6193" to="6202" />
		</imprint>
	</monogr>
	<note>European Language Resources Association</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">FBK-DH at SemEval-2020 Task 12: Using multi-channel BERT for multilingual offensive language detection</title>
		<author>
			<persName><forename type="first">Camilla</forename><surname>Casula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alessio</forename><surname>Palmero Aprosio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stefano</forename><surname>Menini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><surname>Tonelli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of OffensEval</title>
		<meeting>OffensEval</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A multilingual evaluation for online hate speech detection</title>
		<author>
			<persName><forename type="first">Michele</forename><surname>Corazza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Stefano</forename><surname>Menini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elena</forename><surname>Cabrio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><surname>Tonelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Serena</forename><surname>Villata</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Internet Techn</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">22</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019-06">2019. June</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">Bonnie</forename><forename type="middle">J</forename><surname>Dorr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joseph</forename><surname>Olive</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><surname>McCary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Caitlin</forename><surname>Christianson</surname></persName>
		</author>
		<title level="m">Machine Translation Evaluation and Optimization</title>
				<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="745" to="843" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Large scale crowdsourcing and characterization of twitter abusive behavior</title>
		<author>
			<persName><forename type="first">Antigoni</forename><forename type="middle">Maria</forename><surname>Founta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Constantinos</forename><surname>Djouvas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Despoina</forename><surname>Chatzakou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilias</forename><surname>Leontiadis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeremy</forename><surname>Blackburn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gianluca</forename><surname>Stringhini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Athena</forename><surname>Vakali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Sirivianos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicolas</forename><surname>Kourtellis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">12th International AAAI Conference on Web and Social Media</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A unified deep learning architecture for abuse detection</title>
		<author>
			<persName><forename type="first">Antigoni</forename><forename type="middle">Maria</forename><surname>Founta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Despoina</forename><surname>Chatzakou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicolas</forename><surname>Kourtellis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeremy</forename><surname>Blackburn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Athena</forename><surname>Vakali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilias</forename><surname>Leontiadis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th ACM Conference on Web Science, WebSci &apos;19</title>
				<meeting>the 10th ACM Conference on Web Science, WebSci &apos;19<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="105" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Exploring hate speech detection in multimodal publications</title>
		<author>
			<persName><forename type="first">Raul</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jaume</forename><surname>Gibert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lluis</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dimosthenis</forename><surname>Karatzas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Winter Conference on Applications of Computer Vision (WACV)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page">3</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Addressing the curse of imbalanced training sets: One-sided selection</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kubat</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fourteenth International Conference on Machine Learning</title>
				<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="volume">06</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">MTNT: A testbed for machine translation of noisy text</title>
		<author>
			<persName><forename type="first">Paul</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Graham</forename><surname>Neubig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2018 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Brussels, Belgium</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2018-10">2018. October-November</date>
			<biblScope unit="page" from="543" to="553" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Do you really want to hurt me? Predicting abusive swearing in social media</title>
		<author>
			<persName><forename type="first">Endang</forename><forename type="middle">Wahyu</forename><surname>Pamungkas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of The 12th Language Resources and Evaluation Conference, LREC 2020</title>
		<editor>
			<persName><forename type="first">Nicoletta</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Frédéric</forename><surname>Béchet</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Philippe</forename><surname>Blache</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Khalid</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Christopher</forename><surname>Cieri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Thierry</forename><surname>Declerck</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Sara</forename><surname>Goggi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Hitoshi</forename><surname>Isahara</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Bente</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Joseph</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Hélène</forename><surname>Mazo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Asunción</forename><surname>Moreno</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Jan</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Stelios</forename><surname>Piperidis</surname></persName>
		</editor>
		<meeting>The 12th Language Resources and Evaluation Conference, LREC 2020<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<publisher>European Language Resources Association</publisher>
			<date type="published" when="2020-05-11">2020. May 11-16, 2020</date>
			<biblScope unit="page" from="6237" to="6246" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Hate speech detection through AlBERTo Italian language understanding model</title>
		<author>
			<persName><forename type="first">Marco</forename><surname>Polignano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pierpaolo</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>De Gemmis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Giovanni</forename><surname>Semeraro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NL4AI@AI*IA</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets</title>
		<author>
			<persName><forename type="first">Marco</forename><surname>Polignano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pierpaolo</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>De Gemmis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Giovanni</forename><surname>Semeraro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</title>
				<meeting>the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2481</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">An Italian twitter corpus of hate speech against immigrants</title>
		<author>
			<persName><forename type="first">Manuela</forename><surname>Sanguinetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fabio</forename><surname>Poletto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Stranisci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</title>
				<meeting>the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)<address><addrLine>Miyazaki, Japan</addrLine></address></meeting>
		<imprint>
			<publisher>ELRA</publisher>
			<date type="published" when="2018-05">2018. May</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A survey on hate speech detection using natural language processing</title>
		<author>
			<persName><forename type="first">Anna</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Wiegand</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media</title>
				<meeting>the Fifth International Workshop on Natural Language Processing for Social Media<address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-04">2017. April</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">MC-BERT4HATE: Hate speech detection using multi-channel BERT for different languages and translations</title>
		<author>
			<persName><forename type="first">Hajung</forename><surname>Sohn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hyunju</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Data Mining Workshops (ICDMW)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="551" to="559" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Classification of imbalanced data: a review</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kamel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Pattern Recognit. Artif. Intell</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page" from="687" to="719" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">Bertie</forename><surname>Vidgen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leon</forename><surname>Derczynski</surname></persName>
		</author>
		<idno>ArXiv, abs/2004.01670</idno>
		<title level="m">Directions in abusive language training data: Garbage in, garbage out</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Challenges and frontiers in abusive content detection</title>
		<author>
			<persName><forename type="first">Bertie</forename><surname>Vidgen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alex</forename><surname>Harris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dong</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rebekah</forename><surname>Tromble</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Scott</forename><surname>Hale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Helen</forename><surname>Margetts</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third Workshop on Abusive Language Online</title>
				<meeting>the Third Workshop on Abusive Language Online<address><addrLine>Florence, Italy</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019-08">2019. August</date>
			<biblScope unit="page" from="80" to="93" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Detection of Abusive Language: the Problem of Biased Datasets</title>
		<author>
			<persName><forename type="first">Michael</forename><surname>Wiegand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Josef</forename><surname>Ruppenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Thomas</forename><surname>Kleinbauer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019-06">2019. June</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="602" to="608" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">SemEval-2019 Task 6: Identifying and categorizing offensive language in social media (OffensEval)</title>
		<author>
			<persName><forename type="first">Marcos</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shervin</forename><surname>Malmasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Preslav</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><surname>Rosenthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noura</forename><surname>Farra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ritesh</forename><surname>Kumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Workshop on Semantic Evaluation</title>
				<meeting>the 13th International Workshop on Semantic Evaluation<address><addrLine>Minneapolis, Minnesota, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="75" to="86" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)</title>
		<author>
			<persName><forename type="first">Marcos</forename><surname>Zampieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Preslav</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sara</forename><surname>Rosenthal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pepa</forename><surname>Atanasova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Georgi</forename><surname>Karadzhov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hamdy</forename><surname>Mubarak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Leon</forename><surname>Derczynski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zeses</forename><surname>Pitenis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Çağrı</forename><surname>Çöltekin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th International Workshop on Semantic Evaluation</title>
		<meeting>the 14th International Workshop on Semantic Evaluation</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
