<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">TOBB ETU at CheckThat! 2022: Detecting Attention-Worthy and Harmful Tweets and Check-Worthy Claims</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Ahmet</forename><forename type="middle">Bahadir</forename><surname>Eyuboglu</surname></persName>
							<email>ahmetbahadireyuboglu@gmail.com</email>
						</author>
						<author>
							<persName><forename type="first">Mustafa</forename><forename type="middle">Bora</forename><surname>Arslan</surname></persName>
							<email>mustafaboraarslan@outlook.com</email>
						</author>
						<author>
							<persName><forename type="first">Ekrem</forename><surname>Sonmezer</surname></persName>
							<email>sonmezerekrem@outlook.com</email>
						</author>
						<author>
							<persName><forename type="first">Mucahid</forename><surname>Kutlu</surname></persName>
							<email>m.kutlu@etu.edu.tr</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">TOBB University of Economics and Technology</orgName>
								<address>
									<settlement>Ankara</settlement>
									<country key="TR">Turkey</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Evaluation Forum</orgName>
								<address>
									<addrLine>September 5-8</addrLine>
									<postCode>2022</postCode>
									<settlement>Bologna</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">TOBB ETU at CheckThat! 2022: Detecting Attention-Worthy and Harmful Tweets and Check-Worthy Claims</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">4548A5C0EFF1E971CBE26B70C423DCF4</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:30+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Fact-Checking</term>
					<term>Check-worthiness</term>
					<term>Attention-worthy tweets</term>
					<term>Harmful tweets</term>
					<term>Factual Claims</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we present our participation in CLEF 2022 CheckThat! Lab's Task 1 on detecting check-worthy and verifiable claims and attention-worthy and harmful tweets. We participated in all subtasks of Task 1 for the Arabic, Bulgarian, Dutch, English, and Turkish datasets. We investigate the impact of fine-tuning various transformer models and how to increase training data size using machine translation. We also use feed-forward networks with the Manifold Mixup regularization for the respective tasks. We are ranked first in detecting factual claims in Arabic and harmful tweets in Dutch. In addition, we are ranked second in detecting check-worthy claims in Arabic and Bulgarian.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Social media platforms have become one of the main information resources for people by enabling their users to easily share messages and follow others. While these platforms help people share their thoughts and make their voices heard, they can also be misused to spread misinformation and hateful messages that negatively impact individuals and societies. We have especially observed this dark side of social media platforms during the COVID-19 pandemic. For instance, misinformation and conspiracy theories about vaccines increased hesitation towards being vaccinated <ref type="bibr" target="#b0">[1]</ref>. Furthermore, the messages spread on social media platforms might impact public opinion on a particular issue and mobilize people, forcing government entities to take action. For instance, government entities of several countries had to regularly share information about vaccines to reduce vaccine hesitancy (e.g., <ref type="bibr" target="#b1">[2]</ref>).</p><p>In this paper, we explain our participation in Task 1 <ref type="bibr" target="#b2">[3]</ref> of the CLEF CheckThat! 2022 Lab <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>. Task 1 covers four subtasks: 1) check-worthy claim detection (Subtask 1A), 2) verifiable factual claim detection (Subtask 1B), 3) harmful tweet detection (Subtask 1C), and 4) attention-worthy tweet detection (Subtask 1D). Subtask 1A covers six languages (Arabic, Bulgarian, Dutch, English, Spanish, and Turkish), while the other subtasks cover all of these languages except Spanish. 
We participated in all subtasks for the Arabic, Bulgarian, Dutch, English, and Turkish languages <ref type="foot" target="#foot_0">1</ref> , yielding 20 submissions in total.</p><p>In the development phase of the shared task, we explored three research directions: i) fine-tuning various pre-trained transformer models, ii) increasing the training data for fine-tuning transformer models, and iii) applying the Manifold Mixup regularization technique <ref type="bibr" target="#b5">[6]</ref> to the subtasks we participated in. In particular, we investigated 9, 3, 5, 13, and 3 different pre-trained transformer models for subtask 1A in Arabic, Bulgarian, Dutch, English, and Turkish, respectively. In addition, we explored increasing training data by back-translation and by machine-translating datasets in other languages for subtask 1C. Next, we compared the Manifold Mixup approach, fine-tuning transformer models, and data augmentation by back-translation in all four subtasks to select models for our official submissions.</p><p>In our experiments with the development dataset, we find that the choice of transformer model causes dramatic changes in performance, suggesting that researchers should select models carefully. In addition, our findings about the impact of artificially increasing the data are mixed. In particular, we observe that increasing training data usually has a negative impact on the Bulgarian and Turkish datasets in subtask 1C, while using additional data for the English and Dutch datasets improves the performance.</p><p>In the official ranking, we achieved mixed results. Considering tasks with at least three participants, we are ranked first in 1B-Arabic and second in 1A-Arabic and 1A-Bulgarian. We share our implementation of the Manifold Mixup method<ref type="foot" target="#foot_1">2</ref> for reproducibility of our results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Approaches</head><p>We explore three different approaches for all subtasks including fine-tuning various transformer models, increasing dataset size via machine translation, and the Manifold Mixup regularization. In this section we explain each of them in detail.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Fine-Tuning Various Transformer Models</head><p>Prior work shows the remarkable success of transformer models in various text classification tasks <ref type="bibr" target="#b6">[7]</ref>. Furthermore, the best-performing systems in previous check-worthy claim detection tasks of the CheckThat! Lab <ref type="bibr" target="#b7">[8]</ref> usually exploited various transformer models <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10]</ref>. However, Kartal and Kutlu <ref type="bibr" target="#b10">[11]</ref> show that performance varies dramatically across different transformer models. Therefore, in this approach, we explore several language-specific transformer models pre-trained with different datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Increasing Training Data via Machine Translation</head><p>Training data has an enormous impact on the performance of the resultant models. Prior work on check-worthy claim detection investigated several ways to increase the training data size, such as back-translation <ref type="bibr" target="#b8">[9]</ref>, weak supervision <ref type="bibr" target="#b11">[12]</ref>, and utilizing datasets in other languages with multilingual models <ref type="bibr" target="#b10">[11]</ref>. In this approach, we explore increasing training data size by two different methods: 1) utilizing datasets in other languages by machine-translating them into the respective language, and 2) paraphrasing the training data via back-translation and using the paraphrases as additional labeled data.</p><p>In the first method, we exploit the datasets in several languages provided by the CheckThat! Lab organizers this year. In particular, in order to develop a model for a specific language, 𝐿 𝑂 , we first select a training dataset provided for another language and machine-translate its tweets into 𝐿 𝑂 using Google Translate. Subsequently, we fine-tune a language-specific transformer model using the original data and the machine-translated data together. In subtask 1C, we machine-translate only tweets labeled as harmful to reduce the imbalance in the label distribution while increasing the training data size.</p><p>In our back-translation method, we first translate the original text into another language using Google Translate. Subsequently, we translate the resultant text back into the original language. This method is likely to create texts that differ slightly from the original ones but have the same or a similar meaning. Assuming that the change in the texts will not affect their labels, we combine the original data with the back-translated data and fine-tune a language-specific transformer model.</p></div>
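<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two augmentation methods described above can be sketched as follows. This is a minimal illustration, not our actual pipeline: the translate callable, its signature, the toy_translate stand-in, and the language codes are all hypothetical placeholders for a machine translation service such as Google Translate, and the toy tweets are invented for the example.</p>

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: translate(text, source_lang, target_lang) returns a string.
Translator = Callable[[str, str, str], str]
Example = Tuple[str, int]  # (tweet text, label)


def back_translate(examples: List[Example], translate: Translator,
                   lang: str, pivot: str) -> List[Example]:
    """Paraphrase each text via lang -> pivot -> lang and keep its label."""
    augmented = []
    for text, label in examples:
        pivot_text = translate(text, lang, pivot)
        augmented.append((translate(pivot_text, pivot, lang), label))
    return augmented


def translate_dataset(examples: List[Example], translate: Translator,
                      source_lang: str, target_lang: str,
                      only_label: Optional[int] = None) -> List[Example]:
    """Translate another language's dataset; optionally keep a single label."""
    kept = [(t, y) for t, y in examples if only_label is None or y == only_label]
    return [(translate(t, source_lang, target_lang), y) for t, y in kept]


def toy_translate(text: str, source_lang: str, target_lang: str) -> str:
    # Stand-in for a real MT service (assumption): it only tags the text.
    return f"[{source_lang} to {target_lang}] {text}"


turkish = [("tweet a", 1), ("tweet b", 0)]
english = [("tweet c", 1), ("tweet d", 0)]
train = (turkish
         + back_translate(turkish, toy_translate, "tr", "en")
         + translate_dataset(english, toy_translate, "en", "tr", only_label=1))
```

<p>Passing only_label=1 mirrors the choice in subtask 1C of translating only the harmful tweets, so the added data also reduces the label imbalance.</p></div>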
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Language Specific BERT with Manifold Mixup</head><p>Many of the annotations in the shared task are subjective. For instance, whether a tweet requires the attention of government entities might depend on how much the annotators want governments to intervene in their lives. Similarly, prior work on check-worthiness points out the subjective nature of the task (e.g., <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b12">13]</ref>). In order to address this problem, we apply the Manifold Mixup regularization proposed by Verma et al. <ref type="bibr" target="#b5">[6]</ref>. In particular, the Manifold Mixup trains neural networks on linear combinations of hidden representations of training examples, yielding flattened class-representations and smoother decision boundaries. Verma et al. <ref type="bibr" target="#b5">[6]</ref> demonstrate that their approach yields more robust solutions in image classification. In our work, we use BERT embeddings to represent tweets and then train a four-layer feed-forward network with the Manifold Mixup method.</p><p>In subtask 1D, we apply a different approach than in the other subtasks due to its severely imbalanced label distribution. In particular, there are nine labels in subtask 1D, but eight of them describe why a particular tweet is attention-worthy. In addition, the majority of the tweets have the "not attention-worthy" label. Therefore, we first binarize the labels by merging the variants of attention-worthy labels into a single one, yielding only two labels: 1) attention-worthy and 2) not-attention-worthy. Subsequently, we under-sample the negative class with a 1/5 ratio and train our Manifold Mixup model. Next, we build another model using the eight labels for attention-worthy tweets. If a tweet is classified as attention-worthy, we use the second model to predict why it is attention-worthy. Otherwise, we do not use the second model and label it as "not attention-worthy". 
Note that we do not apply this two-step approach to the other subtasks because they are already binary classification tasks.</p></div>
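<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two-step procedure for subtask 1D can be sketched as below. This is an illustrative sketch under stated assumptions: we read the 1/5 ratio as keeping one fifth of the negative examples, and the two stage functions are hypothetical keyword rules standing in for the trained models. The function names are ours for illustration only.</p>

```python
import random

NEGATIVE = "not attention-worthy"


def binarize(label):
    """1 for any of the eight attention-worthy variants, 0 for the negative label."""
    return 0 if label == NEGATIVE else 1


def undersample_negatives(examples, keep_ratio=0.2, seed=0):
    """Keep every positive example and a random fraction of the negatives.

    Assumption: keep_ratio=0.2 is one possible reading of the 1/5 ratio.
    """
    rng = random.Random(seed)
    positives = [ex for ex in examples if binarize(ex[1]) == 1]
    negatives = [ex for ex in examples if binarize(ex[1]) == 0]
    return positives + rng.sample(negatives, int(len(negatives) * keep_ratio))


def cascade_predict(text, is_attention_worthy, explain_why):
    """Stage 1 decides whether the tweet is attention-worthy; stage 2 says why."""
    if is_attention_worthy(text):
        return explain_why(text)
    return NEGATIVE


# Toy stand-ins for the two trained models (hypothetical keyword rules).
stage1 = lambda text: "vaccine" in text
stage2 = lambda text: "asks question" if text.endswith("?") else "other"

data = [(f"tweet {i}", NEGATIVE) for i in range(10)]
data.append(("is the vaccine safe?", "asks question"))
balanced = undersample_negatives(data)
```

<p>The second model is only consulted for tweets that pass the first stage, so its training data never needs to include the dominant negative class.</p></div>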
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head><p>We first present statistics about the datasets and explain implementation details and our experimental setup in Section 3.1. Next, we explain how we selected our submissions in Section 3.2. Finally, we present the results of our submissions in Section 3.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Implementation</head><p>In order to fine-tune and configure transformer models, we use the PyTorch v.1.9.0<ref type="foot" target="#foot_2">3</ref> and TensorFlow<ref type="foot" target="#foot_3">4</ref> libraries. We import the transformer models used in our experiments from Hugging Face <ref type="foot" target="#foot_4">5</ref> . In addition, we use Google's SentencePiece library for machine translation <ref type="foot" target="#foot_5">6</ref> . We set the batch size to 32 in all our experiments with fine-tuned transformer models. In the experiments on increasing dataset size using machine translation, we train the models for 5 epochs.</p><p>We implemented the Manifold Mixup <ref type="bibr" target="#b5">[6]</ref> method from scratch using PyTorch v.1.9.0, and set the number of epochs and the batch size to 5 and 2, respectively. We use the following transformer models for each language: AraBERT.v02 <ref type="bibr" target="#b13">[14]</ref> for Arabic, RoBERTa-base-bulgarian <ref type="foot" target="#foot_6">7</ref> for Bulgarian, RobBERT <ref type="bibr" target="#b14">[15]</ref> for Dutch, the uncased version of BERT-base <ref type="foot" target="#foot_7">8</ref> for English, and DistilBERTurk<ref type="foot" target="#foot_8">9</ref> for Turkish.</p></div>
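<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the interpolation step of the Manifold Mixup concrete, the following dependency-free sketch mixes two examples at a randomly chosen layer of a small feed-forward network. It is not our PyTorch implementation: the Beta(alpha, alpha) sampling of the mixing coefficient follows our understanding of Verma et al., while the value of alpha, the set of layers eligible for mixing (here every layer input, including the raw input), and the toy weights are assumptions made only for illustration.</p>

```python
import random


def dense(x, weights, bias, relu):
    """One fully connected layer on a single example given as a list of floats."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def run_layers(h, layers, start, stop):
    """Apply layers[start:stop]; the last layer of the network has no ReLU."""
    last = len(layers) - 1
    for i in range(start, stop):
        weights, bias = layers[i]
        h = dense(h, weights, bias, relu=(i != last))
    return h


def mix(a, b, lam):
    """Linear interpolation lam * a + (1 - lam) * b, element by element."""
    return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]


def manifold_mixup_forward(x1, y1, x2, y2, layers, alpha=2.0, rng=random):
    """Mix two examples at a random depth k and return logits plus the soft label."""
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    k = rng.randrange(len(layers))       # k == 0 mixes the raw inputs
    h1 = run_layers(x1, layers, 0, k)
    h2 = run_layers(x2, layers, 0, k)
    mixed_hidden = mix(h1, h2, lam)
    logits = run_layers(mixed_hidden, layers, k, len(layers))
    mixed_label = mix(y1, y2, lam)       # labels are mixed with the same lam
    return logits, mixed_label, lam, k


# Toy two-layer network with hypothetical weights: 2 inputs, 2 hidden units, 1 output.
toy_layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]), ([[1.0, -1.0]], [0.5])]
logits, soft_label, lam, k = manifold_mixup_forward(
    [1.0, 2.0], [1.0, 0.0], [1.0, 2.0], [0.0, 1.0], toy_layers, rng=random.Random(0))
```

<p>During training, the loss would be computed between the logits and the mixed soft label, so the network is encouraged to behave linearly between the hidden representations of training examples.</p></div>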
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Evaluation Metrics</head><p>We use the official metric for each subtask to evaluate and compare our methods. In particular, we use 𝐹 1 score of positive class in subtasks 1A and 1C, accuracy in subtask 1B, and weighted 𝐹 1 in subtask 1D.</p></div>
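<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the three metrics can be computed as in the sketch below, where the 𝐹 1 score of a class is 2PR / (P + R) for precision P and recall R, and the weighted 𝐹 1 averages the per-class scores weighted by class support. The function names and the toy labels are illustrative and are not taken from the official scorer.</p>

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def precision_recall_f1(gold, pred, label):
    """Precision, recall and F1 of one class, treating it as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def weighted_f1(gold, pred):
    """Per-class F1 averaged with weights proportional to gold class support."""
    support = Counter(gold)
    total = sum(n * precision_recall_f1(gold, pred, label)[2]
                for label, n in support.items())
    return total / len(gold)


# Toy example with two classes (1 is the positive class).
gold = [1, 1, 0, 0, 0]
pred = [1, 0, 1, 0, 0]
positive_f1 = precision_recall_f1(gold, pred, 1)[2]
```

<p>Because accuracy counts both classes equally per example while the positive-class 𝐹 1 ignores true negatives, the two can rank systems differently on imbalanced data.</p></div>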
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.3.">Datasets</head><p>The shared task organizers provide train, development, test development, and test datasets for each language and subtask. The number of tweets for each label in the train, development, test development, and test datasets in subtasks 1A, 1B, 1C, and 1D is presented in Tables <ref type="table" target="#tab_0">1</ref>, 2, 3, and 4, respectively.</p><p>In our experiments during the development phase, we use the train and development datasets for training and validation of the Manifold Mixup model, respectively. In our experiments on fine-tuning various transformer models and increasing dataset size via machine translation, we combine the train and development sets for each case and fine-tune models accordingly. In all experiments during the development phase, we use the test development dataset for testing. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Experimental Results in the Development Phase</head><p>We participate in all subtasks of Task 1 for five languages, yielding 20 different submissions. In addition, we explore three different approaches to determine our final submissions. Therefore, in order to reduce the complexity of the experiments and meet the deadlines of the shared task, we first evaluate various transformer models in subtask 1A and increasing the training data size in subtask 1C, each on the respective test development dataset. Next, based on these experiments, we compare the three approaches in all subtasks to determine our submissions for the official evaluation on the test data. We note that this is not an ideal way to select systems for submission, but we take this step to meet the deadlines.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Impact of Transformer Model on Detecting Check-Worthy Claims</head><p>In order to observe the impact of transformer models, we identify several transformer models available on the Hugging Face platform based on their monthly download scores and evaluate their performance in subtask 1A. The number of transformer models we compare is 9, 3, 5, 13, and 3 for Arabic, Bulgarian, Dutch, English, and Turkish, respectively. We present the results in Table <ref type="table" target="#tab_4">5</ref>. Our observations based on our extensive experiments are as follows. Firstly, the results for English show that the choice of evaluation metric matters when reporting the performance of systems. For instance, distilroberta-base-climate-f has the worst recall and 𝐹 1 scores, but achieves the best accuracy. Secondly, our results suggest that the text used in pre-training has a major impact on the models' performance. For instance, COVID-Twitter-BERT v1 achieves the best 𝐹 1 score among all English models. This is likely because it is pre-trained with tweets about COVID-19 while the tweets used in the shared task are also about COVID-19. Similarly, PubMedBERT, which is pre-trained with research articles on PubMed, yields the second best results for English. However, we also observe some unexpected results in our experiments. For instance, AraBERT.v1, which is pre-trained on a smaller dataset compared to other variants of AraBERT (i.e., AraBERTv0.2-Twitter, AraBERTv0.2, and AraBERTv2), outperforms all other Arabic-specific models. In addition, while DarijaBERT is pre-trained with only texts in Moroccan Arabic, it outperforms all other Arabic-specific models except AraBERT.v1. Furthermore, the best-performing model on the Turkish dataset is the one with the smallest vocabulary size. Therefore, our results show that it is not easy to choose a pre-trained model by just comparing the models' configurations and the texts used in pre-training. 
We think that one of the reasons for these unexpected results is the subjective nature of the task <ref type="bibr" target="#b10">[11]</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Impact of Training Data in Detecting Harmful Tweets</head><p>We use roberta-small-bulgarian 23 for Bulgarian, BERTje <ref type="bibr" target="#b19">[20]</ref> for Dutch, BERT-base-cased for English, and bert-base-turkish-sentiment-cased 24 for Turkish as language-specific transformer models. Table <ref type="table" target="#tab_5">6</ref> shows the performance of each model when a different dataset is machine-translated into the corresponding language and the respective language-specific model is fine-tuned with the original data and the machine-translated data. In this experiment, we are not able to report results for Arabic because we ran into technical challenges (e.g., insufficient memory) that prevented us from obtaining results. We observe that increasing training data does not always improve the performance. In particular, using only the original dataset yields the highest results for Turkish and Bulgarian, while the performance of models usually increases on the English and Dutch datasets when more labeled samples are used. The subjective nature of this task might be one of the reasons for the lower performance when using additional data from other languages. In particular, as each country is dealing with different social issues, it is likely that people living in different countries disagree on what makes a message harmful for a society. For instance, Turkish annotators might be more sensitive to tweets about refugees compared to annotators for other languages because Turkey hosts nearly 3.8 million refugees, i.e., the largest refugee population worldwide 25 , and thereby, misinformation about refugees might have unpleasant consequences.</p><p>Another method to increase the training data size is back-translation, which is not affected by social differences across countries. Therefore, in our next experiment, we increase training data using various languages for back-translation. 
Again, we are not able to report results for Arabic due to the technical challenges we encountered. In this experiment, we also use Spanish for back-translation of the Bulgarian dataset, but not for the other datasets, in order to meet the deadlines of the lab. The results are shown in Table <ref type="table">7</ref>. We again observe that we achieve the best result for Turkish when we use only the original dataset for training. However, back-translation improves the performance on the Dutch and English datasets. For Bulgarian, back-translation has a minimal impact. We do not observe a particular language that yields consistently higher results than others when used as the language for back-translation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 7</head><p>The impact of increasing train data using various languages for back-translation (BT). The best result for each language is written in bold. Column headers: Lang. used for BT, Bulgarian.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Selecting Models for Submission</head><p>In order to select the models to submit for official ranking, we compare three different approaches for each subtask and language:</p><p>• Fine-tuning the best-performing pre-trained transformer model with the original dataset (FT-BP-TM). We use the best-performing pre-trained transformer model in our experiments in Section 3.2.1 for all subtasks except 1D. In particular, we fine-tune AraBERT.v1, RoBERTa-base-bulgarian, BERTje, COVID-Twitter-BERT v1, and BERTurk for Arabic, Bulgarian, Dutch, English, and Turkish, respectively, using the corresponding datasets.</p><p>• Fine-tuning a transformer model with back-translation (FT-TM-BT). We use the best-performing model in our experiments in Section 3.2.2. In particular, we use Spanish, Turkish, Bulgarian, and English for back-translation to increase the size of the Bulgarian, Dutch, English, and Turkish datasets, respectively. Note that back-translation does not improve the performance on the Turkish dataset. However, the FT-BP-TM approach already uses only the original dataset for fine-tuning. Therefore, in this approach, we still increase the size of the Turkish dataset using back-translation. In particular, we use English as the back-translation language because it yields the best results among the back-translation languages (see Table <ref type="table">7</ref>).</p><p>• Manifold Mixup. We use the Manifold Mixup model explained in Section 2.3.</p><p>Tables <ref type="table" target="#tab_7">8</ref>, 9, 10, and 11 present results comparing the three approaches for subtasks 1A, 1B, 1C, and 1D, respectively. Results for some cases are missing due to the technical challenges we encountered and the limited time frame for submissions. In our submissions, we chose the best-performing method for each case and submitted our results accordingly. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Results of Our Submissions</head><p>Table <ref type="table" target="#tab_11">12</ref> shows our results and ranking for each case we participated in. We are ranked first in 1B Arabic and 1C Dutch. Focusing on subtasks with at least four participants, we are ranked second in Arabic 1A and Bulgarian 1A. We also observe that our rankings are generally higher in 1A than in the other subtasks. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>In this paper, we present our participation in CLEF 2022 CheckThat! Lab's Task 1. We participated in all four subtasks of Task 1 for Arabic, Bulgarian, Dutch, English, and Turkish, yielding 20 submissions in total. We explore which transformer model yields the highest performance, the impact of increasing training data size by machine-translating datasets in other languages and by back-translation, and the Manifold Mixup method proposed by Verma et al. <ref type="bibr" target="#b5">[6]</ref>. We are ranked first in subtask 1B for Arabic and in subtask 1C for Dutch. In addition, we are ranked second in subtask 1A for Arabic and Bulgarian.</p><p>Our observations based on our comprehensive experiments are as follows. Firstly, the performance of transformer models varies dramatically based on the text used for pre-training. Secondly, increasing training data does not always improve the performance. Therefore, it is important to consider the biases existing in each dataset. Thirdly, we do not observe that a particular language used for back-translation yields consistently higher performance than others.</p><p>In the future, we plan to focus on the subjective nature of the tasks in this lab. In particular, we will first qualitatively analyze the datasets to better understand the annotations. Subsequently, we plan to develop a model that deals with subjective annotations.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Data &amp; Label Distribution for Each Language in Subtask 1A.</figDesc><table><row><cell>Language</cell><cell>Label</cell><cell>Train</cell><cell>Dev.</cell><cell>Dev. Test</cell><cell>Test</cell></row>
<row><cell>English</cell><cell>not check-worthy</cell><cell>1675</cell><cell>151</cell><cell>445</cell><cell>110</cell></row>
<row><cell></cell><cell>check-worthy</cell><cell>447</cell><cell>44</cell><cell>129</cell><cell>39</cell></row>
<row><cell>Bulgarian</cell><cell>not check-worthy</cell><cell>1493</cell><cell>141</cell><cell>413</cell><cell>73</cell></row>
<row><cell></cell><cell>check-worthy</cell><cell>378</cell><cell>36</cell><cell>106</cell><cell>57</cell></row>
<row><cell>Dutch</cell><cell>not check-worthy</cell><cell>546</cell><cell>44</cell><cell>150</cell><cell>350</cell></row>
<row><cell></cell><cell>check-worthy</cell><cell>377</cell><cell>28</cell><cell>102</cell><cell>316</cell></row>
<row><cell>Turkish</cell><cell>not check-worthy</cell><cell>1995</cell><cell>177</cell><cell>427</cell><cell>289</cell></row>
<row><cell></cell><cell>check-worthy</cell><cell>422</cell><cell>45</cell><cell>84</cell><cell>14</cell></row>
<row><cell>Arabic</cell><cell>not check-worthy</cell><cell>1551</cell><cell>135</cell><cell>425</cell><cell>435</cell></row>
<row><cell></cell><cell>check-worthy</cell><cell>962</cell><cell>100</cell><cell>266</cell><cell>247</cell></row></table></figure>
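<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Data &amp; Label Distribution for Each Language in Subtask 1B.</figDesc><table><row><cell>Language</cell><cell>Label</cell><cell>Train</cell><cell>Dev.</cell><cell>Dev. Test</cell><cell>Test</cell></row>
<row><cell>English</cell><cell>not claim</cell><cell>3031</cell><cell>276</cell><cell>828</cell><cell></cell></row>
<row><cell></cell><cell>claim</cell><cell>292</cell><cell>31</cell><cell>82</cell><cell></cell></row>
<row><cell>Bulgarian</cell><cell>not claim</cell><cell>839</cell><cell>74</cell><cell>217</cell><cell></cell></row>
<row><cell></cell><cell>claim</cell><cell>1871</cell><cell>177</cell><cell>519</cell><cell></cell></row>
<row><cell>Dutch</cell><cell>not claim</cell><cell>1021</cell><cell>109</cell><cell>282</cell><cell></cell></row>
<row><cell></cell><cell>claim</cell><cell>929</cell><cell>72</cell><cell>252</cell><cell></cell></row>
<row><cell>Turkish</cell><cell>not claim</cell><cell>828</cell><cell>72</cell><cell>222</cell><cell></cell></row>
<row><cell></cell><cell>claim</cell><cell>1589</cell><cell>150</cell><cell>438</cell><cell></cell></row>
<row><cell>Arabic</cell><cell>not claim</cell><cell>1118</cell><cell>104</cell><cell>305</cell><cell></cell></row>
<row><cell></cell><cell>claim</cell><cell>2513</cell><cell>235</cell><cell>691</cell><cell></cell></row></table></figure>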
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Data &amp; Label Distribution for Each Language in Subtask 1C.</figDesc><table><row><cell>Language</cell><cell>Label</cell><cell>Train</cell><cell>Dev.</cell><cell>Dev. Test</cell><cell>Test</cell></row>
<row><cell>English</cell><cell>not harmful</cell><cell>3031</cell><cell>276</cell><cell>828</cell><cell>211</cell></row>
<row><cell></cell><cell>harmful</cell><cell>292</cell><cell>31</cell><cell>82</cell><cell>40</cell></row>
<row><cell>Bulgarian</cell><cell>not harmful</cell><cell>2341</cell><cell>209</cell><cell>636</cell><cell>314</cell></row>
<row><cell></cell><cell>harmful</cell><cell>248</cell><cell>18</cell><cell>67</cell><cell>11</cell></row>
<row><cell>Dutch</cell><cell>not harmful</cell><cell>1775</cell><cell>165</cell><cell>476</cell><cell>1145</cell></row>
<row><cell></cell><cell>harmful</cell><cell>171</cell><cell>14</cell><cell>55</cell><cell>215</cell></row>
<row><cell>Turkish</cell><cell>not harmful</cell><cell>1790</cell><cell>157</cell><cell>476</cell><cell>466</cell></row>
<row><cell></cell><cell>harmful</cell><cell>627</cell><cell>65</cell><cell>174</cell><cell>46</cell></row>
<row><cell>Arabic</cell><cell>not harmful</cell><cell>2946</cell><cell>276</cell><cell>805</cell><cell rows="2">1011</cell></row>
<row><cell></cell><cell>harmful</cell><cell>678</cell><cell>60</cell><cell>189</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Data &amp; Label Distribution in Training (Tr), Development (D), Test Development (TD), and Test (T) Sets for Each Language in Subtask 1D.</figDesc><table><row><cell></cell><cell></cell><cell cols="2">English</cell><cell></cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>Turkish</cell><cell>Arabic</cell></row><row><cell>Label</cell><cell cols="4">Tr D TD T</cell><cell cols="2">Tr D TD T Tr D TD T Tr D TD T Tr D TD T</cell></row><row><cell cols="7">not interesting 2851 267 774 202 2341209636308 15451424051078 1698151466429</cell></row><row><cell>harmful</cell><cell cols="6">173 21 55 26 248 18 67 3 94 11 31 86 24 8 10 2 511 50 164 98</cell></row><row><cell cols="5">blame authorities 138 7 36 7</cell><cell cols="2">35 7 9 3 128 10 39 54 82 8 21 5 71 5 17 61</cell></row><row><cell>calls for action</cell><cell>48</cell><cell cols="3">3 12 4</cell><cell cols="2">4 1 3 1 27 5 11 22 15 1 5 4 36 6 19 53</cell></row><row><cell>discusses cure</cell><cell>42</cell><cell cols="3">3 15 5</cell><cell cols="2">56 12 11 8 5 1 2 13 38 5 14 6</cell></row><row><cell cols="2">discusses action 27</cell><cell>1</cell><cell>7</cell><cell>4</cell><cell cols="2">17 2 6 3 23 1 8 42 21 1 6 11 501 42</cell></row><row><cell cols="2">contains advice 12</cell><cell>2</cell><cell>4</cell><cell>1</cell><cell cols="2">6 1 3 1 38 2 10 12 4 1 5 0 79 3 20 48</cell></row><row><cell>asks question</cell><cell>5</cell><cell>1</cell><cell>1</cell><cell>1</cell><cell cols="2">1 0 0 1 84 6 26 29 16 2 5 7 98 14 17 47</cell></row><row><cell>other</cell><cell>25</cell><cell>1</cell><cell>5</cell><cell>1</cell><cell cols="2">2 1 1 1 5 1 1 20 6 1 1 1 8 2 5 27</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>Results of Various Transformer Models in Detecting Check-Worthy Claims. For each language the best-performing case is shown in bold.</figDesc><table><row><cell>Language</cell><cell>Model</cell><cell>Accuracy</cell><cell>Precision</cell><cell>Recall</cell><cell>𝐹 1</cell></row>
<row><cell>Arabic</cell><cell>AraBERT.v1 [14]</cell><cell>0.413</cell><cell>0.390</cell><cell>0.932</cell><cell>0.550</cell></row>
<row><cell></cell><cell>DarijaBERT 10</cell><cell>0.499</cell><cell>0.420</cell><cell>0.789</cell><cell>0.548</cell></row>
<row><cell></cell><cell>Ara_DialectBERT 11</cell><cell>0.431</cell><cell>0.393</cell><cell>0.887</cell><cell>0.545</cell></row>
<row><cell></cell><cell>arabert_c19 [16]</cell><cell>0.548</cell><cell>0.439</cell><cell>0.627</cell><cell>0.517</cell></row>
<row><cell></cell><cell>AraBERTv0.2-Twitter [14]</cell><cell>0.600</cell><cell>0.482</cell><cell>0.526</cell><cell>0.503</cell></row>
<row><cell></cell><cell>bert-base-arabic [17]</cell><cell>0.481</cell><cell>0.397</cell><cell>0.672</cell><cell>0.5</cell></row>
<row><cell></cell><cell>CAMeLBERT [18]</cell><cell>0.451</cell><cell>0.372</cell><cell>0.620</cell><cell>0.465</cell></row>
<row><cell></cell><cell>bert-base-arabertv2 12</cell><cell>0.534</cell><cell>0.399</cell><cell>0.417</cell><cell>0.408</cell></row>
<row><cell></cell><cell>bert-base-arabertv02 13</cell><cell>0.599</cell><cell>0.454</cell><cell>0.206</cell><cell>0.284</cell></row>
<row><cell>Bulg.</cell><cell>RoBERTa-base-bulgarian 7</cell><cell>0.776</cell><cell>0.451</cell><cell>0.443</cell><cell>0.447</cell></row>
<row><cell></cell><cell>RoBERTa-small-bulgarian-POS 14</cell><cell>0.485</cell><cell>0.259</cell><cell>0.820</cell><cell>0.394</cell></row>
<row><cell></cell><cell>bert-base-bg-cased [19]</cell><cell>0.784</cell><cell>0.448</cell><cell>0.245</cell><cell>0.317</cell></row>
<row><cell>Dutch</cell><cell>BERTje [20]</cell><cell>0.619</cell><cell>0.516</cell><cell>0.941</cell><cell>0.666</cell></row>
<row><cell></cell><cell>RobBERT [15]</cell><cell>0.650</cell><cell>0.549</cell><cell>0.764</cell><cell>0.639</cell></row>
<row><cell></cell><cell>bert-base-nl-cased 15</cell><cell>0.559</cell><cell>0.469</cell><cell>0.676</cell><cell>0.554</cell></row>
<row><cell></cell><cell>bert-base-dutch-cased-finetuned-gem 16</cell><cell>0.638</cell><cell>0.582</cell><cell>0.382</cell><cell>0.461</cell></row>
<row><cell>English</cell><cell>COVID-Twitter-BERT v1 [21]</cell><cell>0.721</cell><cell>0.434</cell><cell>0.798</cell><cell>0.562</cell></row>
<row><cell></cell><cell>PubMedBERT [22]</cell><cell>0.745</cell><cell>0.447</cell><cell>0.558</cell><cell>0.496</cell></row>
<row><cell></cell><cell>BERT base model (uncased) [7]</cell><cell>0.634</cell><cell>0.343</cell><cell>0.689</cell><cell>0.458</cell></row>
<row><cell></cell><cell>LEGAL-BERT [23]</cell><cell>0.630</cell><cell>0.326</cell><cell>0.604</cell><cell>0.423</cell></row>
<row><cell></cell><cell>ALBERT Base v2 [24]</cell><cell>0.689</cell><cell>0.353</cell><cell>0.457</cell><cell>0.398</cell></row>
<row><cell></cell><cell>Bio_ClinicalBERT [25]</cell><cell>0.682</cell><cell>0.337</cell><cell>0.426</cell><cell>0.376</cell></row>
<row><cell></cell><cell>BERT base model (cased) [7]</cell><cell>0.224</cell><cell>0.224</cell><cell>1.0</cell><cell>0.366</cell></row>
<row><cell></cell><cell>bert-base-uncased-contracts 17</cell><cell>0.740</cell><cell>0.405</cell><cell>0.333</cell><cell>0.365</cell></row>
<row><cell></cell><cell>ALBERT Base v1 18</cell><cell>0.707</cell><cell>0.338</cell><cell>0.317</cell><cell>0.328</cell></row>
<row><cell></cell><cell>hateBERT [26]</cell><cell>0.770</cell><cell>0.476</cell><cell>0.232</cell><cell>0.312</cell></row>
<row><cell></cell><cell>COVID-Twitter-BERT v2 MNLI 19</cell><cell>0.667</cell><cell>0.265</cell><cell>0.271</cell><cell>0.268</cell></row>
<row><cell></cell><cell>RoBERTa base [27]</cell><cell>0.731</cell><cell>0.295</cell><cell>0.139</cell><cell>0.189</cell></row>
<row><cell></cell><cell>DistilRoBERTa-base-climate-f [28]</cell><cell>0.783</cell><cell>0.631</cell><cell>0.093</cell><cell>0.162</cell></row>
<row><cell>Turkish</cell><cell>BERTurk uncased 32K Vocabulary 20</cell><cell>0.760</cell><cell>0.333</cell><cell>0.385</cell><cell>0.357</cell></row>
<row><cell></cell><cell>BERTurk uncased 128K Vocabulary 21</cell><cell>0.337</cell><cell>0.188</cell><cell>0.859</cell><cell>0.309</cell></row>
<row><cell></cell><cell>BERTurk cased 128K Vocabulary 22</cell><cell>0.562</cell><cell>0.203</cell><cell>0.526</cell><cell>0.293</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 6</head><label>6</label><figDesc>Impact of increasing training data by machine-translating another dataset in a different language in detecting harmful tweets. We report 𝐹 1 score for each case. The best result for each language is written in bold.</figDesc><table><row><cell>Machine-Translated Data</cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>English</cell><cell>Turkish</cell></row><row><cell>None</cell><cell>0.26</cell><cell>0.26</cell><cell>0.11</cell><cell>0.55</cell></row><row><cell>Bulgarian</cell><cell>-</cell><cell>0.39</cell><cell>0.23</cell><cell>0.13</cell></row><row><cell>Dutch</cell><cell>0.23</cell><cell>-</cell><cell>0.23</cell><cell>0.53</cell></row><row><cell>English</cell><cell>0.21</cell><cell>0.39</cell><cell>-</cell><cell>0.48</cell></row><row><cell>Turkish</cell><cell>0.19</cell><cell>0.25</cell><cell>0.25</cell><cell>-</cell></row><row><cell>Arabic</cell><cell>0.16</cell><cell>0.27</cell><cell>0.21</cell><cell>0.47</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 8</head><label>8</label><figDesc>Development Test Results in Subtask 1A for 𝐹 1 Score for the Positive Class</figDesc><table><row><cell>Model</cell><cell>Arabic</cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>English</cell><cell>Turkish</cell></row><row><cell>Manifold Mixup</cell><cell>0.14</cell><cell>0</cell><cell>0.58</cell><cell>0.48</cell><cell>0.22</cell></row><row><cell>FT-TM-BT</cell><cell>-</cell><cell>0.42</cell><cell>0.64</cell><cell>0.48</cell><cell>0.40</cell></row><row><cell>FT-BP-TM</cell><cell>0.47</cell><cell>0.47</cell><cell>0.57</cell><cell>0.55</cell><cell>0.40</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 9</head><label>9</label><figDesc>Development Test Results in Subtask 1B for 𝐹 1 Score for the Positive Class</figDesc><table><row><cell>Model</cell><cell>Arabic</cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>English</cell><cell>Turkish</cell></row><row><cell>Manifold Mixup</cell><cell>0.76</cell><cell>0.75</cell><cell>0.49</cell><cell>0.67</cell><cell>0.63</cell></row><row><cell>FT-TM-BT</cell><cell>-</cell><cell>0.86</cell><cell>0.73</cell><cell>-</cell><cell>0.78</cell></row><row><cell>FT-BP-TM</cell><cell>-</cell><cell>0.87</cell><cell>0.72</cell><cell>0.76</cell><cell>0.78</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Table 10</head><label>10</label><figDesc>Development Test Results in Subtask 1C for 𝐹 1 Score for the Positive Class</figDesc><table><row><cell>Model</cell><cell>Arabic</cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>English</cell><cell>Turkish</cell></row><row><cell>Manifold Mixup</cell><cell>0.64</cell><cell>0</cell><cell>0.12</cell><cell>0.18</cell><cell>0.30</cell></row><row><cell>FT-TM-BT</cell><cell>0.12</cell><cell>0.27</cell><cell>0.41</cell><cell>0.30</cell><cell>0.54</cell></row><row><cell>FT-BP-TM</cell><cell>-</cell><cell>0.24</cell><cell>0.33</cell><cell>0.35</cell><cell>0.52</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_10"><head>Table 11</head><label>11</label><figDesc>Development Test Results in Subtask 1D for Average Weighted 𝐹 1 . We do not have results for FT-BP-TM case in this experiment.</figDesc><table><row><cell>Model</cell><cell>Arabic</cell><cell>Bulgarian</cell><cell>Dutch</cell><cell>English</cell><cell>Turkish</cell></row><row><cell>Manifold Mixup</cell><cell>0.65</cell><cell>0.80</cell><cell>0.65</cell><cell>0.78</cell><cell>0.79</cell></row><row><cell>FT-TM-BT</cell><cell>-</cell><cell>0.33</cell><cell>0.31</cell><cell>-</cell><cell>0.28</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_11"><head>Table 12</head><label>12</label><figDesc>Results for our official submissions. Results show 𝐹 1 , accuracy, 𝐹 1 , and weighted 𝐹 1 scores for tasks 1A, 1B, 1C, and 1D, respectively (i.e., the official evaluation metrics).</figDesc><table><row><cell>Task</cell><cell>Language</cell><cell>Submitted Model</cell><cell>Rank</cell><cell>Score</cell></row><row><cell>1A</cell><cell>Arabic</cell><cell>FT-BP-TM</cell><cell>2 (out of 5)</cell><cell>0.495</cell></row><row><cell></cell><cell>Bulgarian</cell><cell>FT-BP-TM</cell><cell>2 (out of 6)</cell><cell>0.542</cell></row><row><cell></cell><cell>Dutch</cell><cell>FT-TM-BT</cell><cell>3 (out of 6)</cell><cell>0.534</cell></row><row><cell></cell><cell>English</cell><cell>FT-BP-TM</cell><cell>4 (out of 14)</cell><cell>0.561</cell></row><row><cell></cell><cell>Turkish</cell><cell>FT-TM-BT</cell><cell>3 (out of 5)</cell><cell>0.118</cell></row><row><cell>1B</cell><cell>Arabic</cell><cell>Manifold Mixup</cell><cell>1 (out of 4)</cell><cell>0.570</cell></row><row><cell></cell><cell>Bulgarian</cell><cell>FT-BP-TM</cell><cell>2 (out of 3)</cell><cell>0.742</cell></row><row><cell></cell><cell>Dutch</cell><cell>FT-TM-BT</cell><cell>2 (out of 3)</cell><cell>0.658</cell></row><row><cell></cell><cell>English</cell><cell>FT-BP-TM</cell><cell>9 (out of 10)</cell><cell>0.641</cell></row><row><cell></cell><cell>Turkish</cell><cell>FT-TM-BT</cell><cell>4 (out of 4)</cell><cell>0.729</cell></row><row><cell>1C</cell><cell>Arabic</cell><cell>Manifold Mixup</cell><cell>2 (out of 3)</cell><cell>0.268</cell></row><row><cell></cell><cell>Bulgarian</cell><cell>FT-TM-BT</cell><cell>2 (out of 3)</cell><cell>0.054</cell></row><row><cell></cell><cell>Dutch</cell><cell>FT-TM-BT</cell><cell>1 (out of 3)</cell><cell>0.147</cell></row><row><cell></cell><cell>English</cell><cell>FT-BP-TM</cell><cell>5 (out of 12)</cell><cell>0.329</cell></row><row><cell></cell><cell>Turkish</cell><cell>FT-TM-BT</cell><cell>3 (out of 5)</cell><cell>0.262</cell></row><row><cell>1D</cell><cell>Arabic</cell><cell>Manifold Mixup</cell><cell>2 (out of 2)</cell><cell>0.184</cell></row><row><cell></cell><cell>Bulgarian</cell><cell>Manifold Mixup</cell><cell>2 (out of 3)</cell><cell>0.887</cell></row><row><cell></cell><cell>Dutch</cell><cell>Manifold Mixup</cell><cell>2 (out of 3)</cell><cell>0.694</cell></row><row><cell></cell><cell>English</cell><cell>Manifold Mixup</cell><cell>4 (out of 7)</cell><cell>0.670</cell></row><row><cell></cell><cell>Turkish</cell><cell>Manifold Mixup</cell><cell>3 (out of 3)</cell><cell>0.806</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">We could not participate in Spanish because of a technical problem we encountered during development.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/Carnagie/manifold-mixup-text-classification</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://pytorch.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.tensorflow.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://huggingface.co/docs/transformers/index</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://github.com/google/sentencepiece</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">https://huggingface.co/iarfmoose/roberta-base-bulgarian</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">https://huggingface.co/bert-base-uncased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">https://huggingface.co/dbmdz/distilbert-base-turkish-cased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_9">https://huggingface.co/Kamel/DarijaBERT</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_10">https://huggingface.co/MutazYoune/Ara_DialectBERT</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_11">https://huggingface.co/aubmindlab/bert-base-arabertv2</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_12">https://huggingface.co/aubmindlab/bert-base-arabertv02</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Susceptibility to misinformation about COVID-19 around the world</title>
		<author>
			<persName><forename type="first">J</forename><surname>Roozenbeek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dryhurst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kerr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Freeman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Recchia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Van Der Bles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Van Der Linden</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Royal Society Open Science</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page">201199</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="https://covid19asi.saglik.gov.tr/?_Dil=2" />
		<title level="m">Republic of Turkey Ministry of Health COVID-19 Vaccination Information Platform</title>
				<imprint>
			<date type="published" when="2022-06-22">2022. 2022-06-22</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of the CLEF-2022 CheckThat! lab task 1 on identifying relevant claims in tweets</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Míguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaghouani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mubarak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Beltrán</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, CLEF &apos;2022</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</editor>
		<meeting><address><addrLine>Bologna, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The CLEF-2022 CheckThat! lab on fighting the COVID-19 infodemic and fake news detection</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Míguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaghouani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mubarak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babulkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Beltrán</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Information Retrieval</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Hagen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Verberne</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Seifert</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Balog</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Nørvåg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Setty</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<pubPlace>Cham</pubPlace>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="416" to="428" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Overview of the CLEF-2022 CheckThat! lab on fighting the COVID-19 infodemic and fake news detection</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Míguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaghouani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mubarak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babulkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Beltrán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Siegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Köhler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Conference of the CLEF Association: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization, CLEF &apos;2022</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Degli Esposti</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Sebastiani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Pasi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Hanbury</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<meeting>the 13th International Conference of the CLEF Association: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization, CLEF &apos;2022<address><addrLine>Bologna, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Manifold mixup: Better representations by interpolating hidden states</title>
		<author>
			<persName><forename type="first">V</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lamb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Beckham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Najafi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mitliagkas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lopez-Paz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
		<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="6438" to="6447" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hasanain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hamdan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">S</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haouari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<title level="m">Overview of the CLEF-2021 CheckThat! lab task 1 on check-worthiness estimation in tweets and political debates</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>CLEF (Working Notes)</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rodrigues</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tran</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2107.05684</idno>
		<title level="m">Accenture at CheckThat! 2021: Interesting claim identification and ranking with contextually sensitive lexical training data augmentation</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">TOBB ETU at CheckThat! 2021: Data engineering for detecting check-worthy claims</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zengin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Re-think before you share: A comprehensive study on prioritizing check-worthy claims</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Computational Social Systems</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Neural weakly supervised fact check-worthiness detection with contrastive sampling-based ranking loss</title>
		<author>
			<persName><forename type="first">C</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">G</forename><surname>Simonsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Lioma</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF (Working Notes)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">TrClaim-19: The first collection for Turkish check-worthy claim detection with annotator rationales</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Kartal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 24th Conference on Computational Natural Language Learning</title>
				<meeting>the 24th Conference on Computational Natural Language Learning</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="386" to="395" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">AraBERT: Transformer-based model for Arabic language understanding</title>
		<author>
			<persName><forename type="first">W</forename><surname>Antoun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Baly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajj</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC 2020 Workshop Language Resources and Evaluation Conference 11-16</title>
				<imprint>
			<date type="published" when="2020-05">May 2020</date>
			<biblScope unit="page">9</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">RobBERT: a Dutch RoBERTa-based Language Model</title>
		<author>
			<persName><forename type="first">P</forename><surname>Delobelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Winters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Berendt</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.findings-emnlp.292</idno>
		<ptr target="https://www.aclweb.org/anthology/2020.findings-emnlp.292" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: EMNLP 2020</title>
				<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3255" to="3265" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S H</forename><surname>Ameur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Aliane</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2105.03143</idno>
		<title level="m">AraCOVID19-MFH: Arabic COVID-19 multi-label fake news and hate speech detection dataset</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">KUISAIL at SemEval-2020 task 12: BERT-CNN for offensive speech identification in social media</title>
		<author>
			<persName><forename type="first">A</forename><surname>Safaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Abdullatif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yuret</surname></persName>
		</author>
		<ptr target="https://www.aclweb.org/anthology/2020.semeval-1.271" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourteenth Workshop on Semantic Evaluation, International Committee for Computational Linguistics</title>
				<meeting>the Fourteenth Workshop on Semantic Evaluation, International Committee for Computational Linguistics<address><addrLine>Barcelona (online)</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2054" to="2059" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">The interplay of variant, size, and task type in Arabic pre-trained language models</title>
		<author>
			<persName><forename type="first">G</forename><surname>Inoue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Alhafni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Baimukan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Bouamor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Habash</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth Arabic Natural Language Processing Workshop, Association for Computational Linguistics</title>
				<meeting>the Sixth Arabic Natural Language Processing Workshop, Association for Computational Linguistics<address><addrLine>Kyiv, Ukraine (Online)</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Load what you need: Smaller versions of multilingual BERT</title>
		<author>
			<persName><forename type="first">A</forename><surname>Abdaoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pradel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sigel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SustaiNLP / EMNLP</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>de Vries</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Cranenburgh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bisazza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>van Noord</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nissim</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1912.09582</idno>
		<ptr target="http://arxiv.org/abs/1912.09582" />
		<title level="m">BERTje: A Dutch BERT Model</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Salathé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">E</forename><surname>Kummervold</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2005.07503</idno>
		<title level="m">COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Domain-specific language model pretraining for biomedical natural language processing</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Tinn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lucas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Usuyama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Naumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Poon</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2007.15779</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">LEGAL-BERT: The muppets straight out of law school</title>
		<author>
			<persName><forename type="first">I</forename><surname>Chalkidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fergadiotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Malakasiotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Aletras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Androutsopoulos</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.findings-emnlp.261</idno>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2898" to="2904" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">ALBERT: A lite BERT for self-supervised learning of language representations</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Lan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Goodman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gimpel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Soricut</surname></persName>
		</author>
		<idno>CoRR abs/1909.11942</idno>
		<ptr target="http://arxiv.org/abs/1909.11942" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><surname>Alsentzer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Boag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-H</forename><surname>Weng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Naumann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>McDermott</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1904.03323</idno>
		<title level="m">Publicly available clinical BERT embeddings</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">HateBERT: Retraining BERT for abusive language detection in English</title>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mitrović</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Granitzer</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.woah-1.3</idno>
		<ptr target="https://aclanthology.org/2021.woah-1.3" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021), Association for Computational Linguistics</title>
				<meeting>the 5th Workshop on Online Abuse and Harms (WOAH 2021), Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="17" to="25" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno>CoRR abs/1907.11692</idno>
		<ptr target="http://arxiv.org/abs/1907.11692" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Webersinke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kraus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bingler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Leippold</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2110.12010</idno>
		<title level="m">ClimateBert: A pretrained language model for climate-related text</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
