<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Aschern at CheckThat! 2021: Lambda-Calculus of Fact-Checked Claims</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Anton</forename><surname>Chernyavskiy</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">HSE University</orgName>
								<address>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dmitry</forename><surname>Ilvovsky</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">HSE University</orgName>
								<address>
									<settlement>Moscow</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Preslav</forename><surname>Nakov</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Qatar Computing Research Institute</orgName>
								<orgName type="institution">HBKU</orgName>
								<address>
									<settlement>Doha</settlement>
									<country key="QA">Qatar</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Aschern at CheckThat! 2021: Lambda-Calculus of Fact-Checked Claims</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">5F587EA01A7D214529BAA62729A55D8D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:47+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>fact-checking</term>
					<term>lexical similarity</term>
					<term>semantic similarity</term>
					<term>sentence-BERT</term>
					<term>TF.IDF</term>
					<term>LambdaMART</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We describe our system for the CLEF 2021 CheckThat! Lab Task 2 Subtask A on detecting previously fact-checked claims. We developed a pipeline using TF.IDF, sentence-BERT fine-tuned on the training data, and reranking with LambdaMART, using the predicted similarity scores and positions in the ranked list as features. We examined the quality of each model on the validation set and analyzed its contribution to the final result using the trained LambdaMART. The official evaluation ranked our system 1st by a wide margin over the other participants and the organizers' baseline.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Social media provide an easy way to share information online. However, this also causes problems, since some users may share false claims. Such claims are often sensational, which further contributes to their fast spread. One possible solution is to fact-check suspicious claims, but this is a difficult and time-consuming task when done manually. Even if the process is automated, it is impossible to fact-check every claim on the web. One could also ask: is it really necessary to fact-check everything? For example, if we aim to limit the spread of some false claim, then it is enough to fact-check only one post where it is present. Then, we can try to find other posts that repeat that claim.</p><p>The CLEF 2021 CheckThat! Lab Task 2 <ref type="bibr" target="#b0">[1]</ref> aims at solving that problem: given a tweet, it asks to match it against a database of previously fact-checked claims. The participating systems are asked to rank the list of previously fact-checked claims according to their relevance, so that more useful ones are ranked higher. The task features two datasets, with claims collected from tweets and from political debates, and it is offered in English and in Arabic. Below, we describe the system that we built for the English version of the dataset collected from tweets (Subtask 2A). At the core of our system is the sentence-BERT model <ref type="bibr" target="#b1">[2]</ref>, which was originally pre-trained on the Semantic Textual Similarity benchmark (STSb) data. We further fine-tuned it on the task data and then applied LambdaMART <ref type="bibr" target="#b2">[3]</ref> to rerank the top-20 results. As features, LambdaMART uses the relevance scores and ranks predicted by sentence-BERT and TF.IDF.</p><p>CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21-24, 2021, Bucharest, Romania. aschernyavskiy 1@edu.hse.ru (A. Chernyavskiy); dilvovsky@hse.ru (D. Ilvovsky); pnakov@hbku.edu.qa (P. Nakov)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>There are many studies that have addressed disinformation and misinformation <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10]</ref>. However, there are only a few directly related to our task. In the ClaimBuster system <ref type="bibr" target="#b10">[11]</ref>, the problem was mentioned as part of the general fact-checking pipeline, but no evaluation of its solution was provided. The ClaimKG dataset was presented in <ref type="bibr" target="#b11">[12]</ref>, where claims from different fact-checking websites can be retrieved by some keywords using a knowledge graph. The original task formulation together with a dataset aimed to address the problem of detecting previously fact-checked claims were presented in <ref type="bibr" target="#b12">[13]</ref>, where the authors used data from Snopes and PolitiFact. They proposed a solution, which combined Elasticsearch, sentence-BERT, and reranking using RankSVM. Their dataset was then used within the framework of the CLEF 2020 CheckThat! Lab Task 2 <ref type="bibr" target="#b13">[14]</ref>. Then, an expanded and cleaned-up dataset consisting of tweets was reused at the CLEF CheckThat! Lab 2021 Task 2A <ref type="bibr" target="#b0">[1]</ref>.</p><p>The winning team of the CLEF 2020 CheckThat! Lab Task 2 was Buster.AI <ref type="bibr" target="#b14">[15]</ref>, who proposed a solution based on RoBERTa, adversarial hard negative examples, and additional training on external data from FEVER, SciFact, and the Liar datasets. Team UNIPI-NLE <ref type="bibr" target="#b15">[16]</ref> performed a cascade training of sentence-BERT models with the preliminary use of Elasticsearch to prune the list of possible candidates. 
Team UB_ET <ref type="bibr" target="#b16">[17]</ref> applied DPH and LambdaMART over querydependent features. Other participants also used Elasticsearch and sentence-BERT as well as Terrier, KD search, Universal Sentence Encoder (USE), TF.IDF, and BM25 to perform retrieval, and to compute similarity scores <ref type="bibr" target="#b13">[14]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>We use the data presented within the CLEF CheckThat! Lab 2020 Task 2 Subtask A for English. The verified claims (VerClaims) database contains 13,825 claims. There are 1,000 positively labeled &lt;Claim, VerClaim&gt; pairs in the training set, and 200 input claims in each of the validation and test sets. For each VerClaim, there is some additional information coming from the article that fact-checkers wrote about the claim: title, subtitle, author, and date of verification.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation Measures</head><p>The official evaluation measure is MAP@𝑘 for 𝑘 = 5 (Mean Average Precision for the top-𝑘 VerClaims in the ranked list). Additional evaluation measures computed by the scoring script include MAP, MRR (Mean Reciprocal Rank), and P@𝑘 (precision for the top-𝑘) for 𝑘 ∈ {1, 3, 10, all}.</p></div>
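To make the measures concrete, here is a minimal sketch of MAP@𝑘 and MRR written from their standard definitions; the official scoring script may differ in details, and all function and variable names here are our own.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k):
    """AP@k for one query: mean of precision at each rank that holds a relevant item."""
    hits, score = 0, 0.0
    for i, doc_id in enumerate(ranked_ids[:k]):
        if doc_id in relevant_ids:
            hits += 1
            score += hits / (i + 1)  # precision at rank i+1
    return score / min(len(relevant_ids), k) if relevant_ids else 0.0

def map_at_k(all_ranked, all_relevant, k):
    """Mean of AP@k over all queries."""
    return sum(average_precision_at_k(r, rel, k)
               for r, rel in zip(all_ranked, all_relevant)) / len(all_ranked)

def mrr(all_ranked, all_relevant):
    """Mean Reciprocal Rank of the first relevant item per query."""
    total = 0.0
    for ranked, relevant in zip(all_ranked, all_relevant):
        for i, doc_id in enumerate(ranked):
            if doc_id in relevant:
                total += 1.0 / (i + 1)
                break
    return total / len(all_ranked)
```

In this task, each input Claim typically has a single relevant VerClaim, so AP@k reduces to the reciprocal rank of that VerClaim when it appears in the top-𝑘, and 0 otherwise.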
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Method</head><p>Our pipeline is similar to that in <ref type="bibr" target="#b12">[13]</ref>, but we changed and improved its components. It is presented schematically in Figure <ref type="figure" target="#fig_1">1</ref>. First, we independently calculate lexical and semantic similarity scores between the input Claim and each VerClaim using TF.IDF and sentence-BERT <ref type="bibr" target="#b1">[2]</ref>, respectively. We calculate each score for three possible input options: (i) &lt;Claim, VerClaim&gt;, (ii) &lt;Claim + Title, VerClaim&gt;, and (iii) &lt;Claim + Title + Subtitle, VerClaim&gt;. Here, "+" denotes concatenation using [SEP] as a separator. Thus, we obtain six independent models. After that, we use LambdaMART <ref type="bibr" target="#b2">[3]</ref> to re-rank the top-20 results selected by the sentence-BERT model trained on the input &lt;Claim + Title, VerClaim&gt;. Here, the features are the predicted relevance scores and reciprocal ranks from each of the six models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Lexical Similarity</head><p>To estimate the lexical similarity, we use TF.IDF as a base model. Since TF.IDF depends on the number of words in the document/corpus, we tried to apply some data-specific pre-processing, e.g., cleaning up the input text by removing URLs, but this did not improve the results. Thus, our final lexical similarity approach converts the input to lowercase and computes embeddings accounting for the frequency of terms on a logarithmic scale: tf′ = 1 + log(tf). Then, we calculate the similarity between the input Claim and each VerClaim as the cosine similarity between the corresponding embeddings. Finally, we use these scores in the re-ranker.</p></div>
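A from-scratch sketch of this lexical step (lowercasing, sublinear term frequency tf′ = 1 + log(tf), idf weighting, cosine similarity) is shown below; the system's actual tokenization and idf variant are assumptions here, as is the choice to fit the vocabulary over the whole corpus at once.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Sparse TF.IDF vectors (dicts) with sublinear tf' = 1 + log(tf)."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed: whitespace tokens
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (1 + math.log(c)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

In practice, one would fit the vectorizer on the VerClaims database and embed each incoming Claim with the same vocabulary; the sketch above vectorizes a single corpus for brevity.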
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Semantic Similarity</head><p>Our TF.IDF approach relies on word matching. However, there are positive examples in the dataset where such a word matching score would be very low, e.g., when comparing the Claim "More Fake News. This was photoshopped, obviously, but the wind was strong and the hair looks good? Anything to demean!" to the VerClaim "The White House posted and then deleted an unflattering photograph of President Trump that displayed marked facial coloration." Thus, we also use sentence-BERT as an additional semantic similarity model. This model is based on Siamese networks, where each component (BERT) independently computes embeddings for the Claim and for the VerClaim, and then the similarity between them is calculated using the cosine. Since our task is an instance of the general task of determining the semantic similarity of two pieces of text, we fine-tune the model from the checkpoint that was trained on the STSb (Semantic Textual Similarity benchmark).</p><p>Note that using the sentence-BERT model to obtain sentence embeddings without any task-specific fine-tuning leads to bad results for this task <ref type="bibr" target="#b12">[13]</ref>. However, training with the MSE loss function is difficult due to the large class imbalance. Here, MSE = ∑_i (y_i − cos(f(c_i), f(vc_i)))², where f is the sentence-BERT encoder and y_i is the relevance score, which equals 1 for positive &lt;Claim (c), VerClaim (vc)&gt; pairs and 0 for negative ones. Note that there are many more negative pairs than positive ones. At the same time, if triplets are composed of these pairs, then the problem of hard negative mining arises (the search for challenging negative examples). Therefore, we apply the Multiple Negatives Ranking (MNR) loss <ref type="bibr" target="#b17">[18]</ref>, which uses only positively labeled pairs during training. To this end, it uses a softmax to contrast the similarity between the input Claim and the relevant VerClaim vs. the similarities between that Claim and all other VerClaims in the batch (Figure <ref type="figure">2</ref>). This allows us to simultaneously maximize the relevance score for the input positive pair and to minimize the scores for all other possible pairs in the batch.</p><p>It has been shown that the MNR loss function selects hard negatives by itself through the temperature parameter in the softmax <ref type="bibr" target="#b18">[19]</ref>. However, the model requires large batch sizes, since a hard negative example can only be exploited if it is present in the batch. To overcome this limitation, we manually form the input training sequence at each epoch, using the current model, as follows. We choose an arbitrary &lt;Claim, VerClaim&gt; anchor pair from the training set (which contains only positive pairs). Then, we select the top-𝑘 closest Claims from the unused ones (𝑘 is a hyperparameter) and we add them, paired with their relevant VerClaims, to the resulting sequence along with the anchor pair. The process ends when there are no unused Claims left.</p><p>We additionally make the MNR loss symmetric, so as to contrast to the positive pair all possible negative pairs: (Claim, VerClaim 𝑖 ) and (Claim 𝑖 , VerClaim).</p></div>
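The (symmetric) MNR loss described above can be sketched on pre-computed embeddings as follows. This is a toy illustration, not the system's actual training code: a real implementation backpropagates through the sentence-BERT encoder, which is omitted here, and the temperature value mirrors the 0.05 reported in Section 6.2.

```python
import numpy as np

def mnr_loss(claim_emb, verclaim_emb, temperature=0.05):
    """Cross-entropy over in-batch similarities: pair i is positive, the rest are negatives."""
    c = claim_emb / np.linalg.norm(claim_emb, axis=1, keepdims=True)
    v = verclaim_emb / np.linalg.norm(verclaim_emb, axis=1, keepdims=True)
    sims = c @ v.T / temperature                       # scaled cosine similarity matrix
    logprob = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))  # row-wise log-softmax
    return -np.mean(np.diag(logprob))                  # positives sit on the diagonal

def symmetric_mnr_loss(claim_emb, verclaim_emb, temperature=0.05):
    """Also contrast each VerClaim against all Claims in the batch."""
    return 0.5 * (mnr_loss(claim_emb, verclaim_emb, temperature)
                  + mnr_loss(verclaim_emb, claim_emb, temperature))
```

When the matching pairs are the most similar items in the batch, the diagonal dominates each softmax row and the loss approaches zero; any in-batch VerClaim that is close to the wrong Claim contributes a large penalty, which is the implicit hard-negative effect the text refers to.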
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Reranking</head><p>At the reranking stage, we apply the LambdaMART model, which is based on Gradient Boosted Decision Trees. This is a learning-to-rank approach that has achieved the best results in various tasks, e.g., in the Yahoo! Learning to Rank Challenge (2011) <ref type="bibr" target="#b19">[20]</ref>. To train the LambdaMART model, we use a 12-dimensional feature vector: 2 model types × 3 input types × 2 features (estimated relevance score and position in the ranked list of VerClaims).</p><p>To implement such a stacking approach, and in order to prevent LambdaMART from "peeping" into the labels encoded in the features, we use only the part of the training data that was not available when training sentence-BERT. In this part, for each claim, we select the top-50 candidates using the single model that achieved the best results on the validation set (this turned out to be the sentence-BERT model trained on the input &lt;Claim + Title, VerClaim&gt;; see <ref type="bibr">Section 7)</ref>. Then, we supplement each of the resulting sets with the relevant VerClaim, if it was missing. We then train the model using all possible triplets that can be constructed in each set using the Claim as the anchor. At the inference stage, we only take the top-20 sentence-BERT results to minimize the final error. Note that we used LambdaMART because it can adjust the training procedure to optimize a specific evaluation measure (unlike RankSVM). To this end, the optimizer takes into account how much gain in the measure can be obtained by swapping two candidates from a triplet in the ranked list, while leaving the others untouched. In our case, we tuned the model for the main competition evaluation measure, MAP@5.</p></div>
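The assembly of the 12-dimensional feature vector for one &lt;Claim, candidate VerClaim&gt; pair could look like the sketch below; the model names, dict layout, and the zero fallback for candidates missing from a ranked list are illustrative assumptions, not the system's actual code.

```python
def candidate_features(candidate_id, model_outputs):
    """Build [score, reciprocal rank] per model, in a fixed model order.

    model_outputs: {model_name: ranked list of (verclaim_id, score)},
    one entry per model (6 models in the full system -> 12 features).
    """
    features = []
    for model_name in sorted(model_outputs):       # fixed order keeps features aligned
        ranked = model_outputs[model_name]
        for rank, (vc_id, score) in enumerate(ranked, start=1):
            if vc_id == candidate_id:
                features += [score, 1.0 / rank]    # similarity score + reciprocal rank
                break
        else:
            features += [0.0, 0.0]                 # candidate absent from this list
    return features
```

Each top-20 candidate from the core sentence-BERT model would get one such vector, and LambdaMART is trained and applied over those vectors.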
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Data Split</head><p>To train sentence-BERT, we took the first 800 claims from the training dataset, and we used the remaining 200 claims for validation. Then, out of those 200, we took 170 to train LambdaMART, and we validated its quality against the remaining 30 claims.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">Parameter Settings</head><p>We used the Sentence-transformers framework<ref type="foot" target="#foot_0">1</ref> to train the sentence-BERT models. We used the pre-trained stsb-bert-base for the input &lt;Claim, VerClaim&gt;, and stsb-bert-large for the two other variants. We used the following hyperparameter values: learning rate of 1e-5, batch size of 6, training for 20 epochs, and the default optimizer with the number of warm-up steps equal to 10% of the total number of training steps. For the MNR loss, we set the temperature to 0.05 and 𝑘 to 7 to form the input sequence. We validated the model after each epoch, and we chose the best checkpoint. We used the LambdaMART implementation from the Python learning-to-rank toolkit,<ref type="foot" target="#foot_1">2</ref> with the following hyperparameter values: 1,500 boosting stages, maximum tree depth of 3, learning rate of 0.02, maximum of 12 leaf nodes, fraction of queries used for fitting the base learners of 0.3, and fraction of features used for selecting the best split of 0.3. We kept the best checkpoint as evaluated on the validation set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Experiments and Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1.">Lexical Similarity</head><p>A comparison of approaches for estimating the lexical similarity for each of the three input types is presented in Table <ref type="table" target="#tab_0">1</ref>. Here, we applied the original BM25 Okapi algorithm <ref type="bibr" target="#b20">[21]</ref> in addition to Elasticsearch, where it is used to build the index. We found that our best TF.IDF approach, which used the Title and the Subtitle to calculate the scores, outperformed BM25 and Elasticsearch on MAP@5. We also evaluated TF.IDF with the standard tf term calculation, but the results were worse. The results also show the importance of using the title as an additional input.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2.">Semantic Similarity</head><p>The results on the official development set for sentence-BERT are presented in Table <ref type="table" target="#tab_1">2</ref>. Note that we used the base model for the input &lt;Claim, VerClaim&gt;, and the large variant in the other cases.</p><p>The base model achieved a MAP@5 of 0.855 on the input &lt;Claim + Title, VerClaim&gt;. Thus, the gain from the use of the Title is not as large as in the case of the lexical component. Although the best quality on the development set was achieved by the model trained on the input &lt;Claim + Title + Subtitle, VerClaim&gt;, we chose the one trained on &lt;Claim + Title, VerClaim&gt; as the core model, as it achieved a MAP@5 of 0.772 vs. 0.739 on our validation sample. Moreover, the part of the training data that we set aside for validation turned out to be much harder than the development set. Finally, the results for our best semantic model are better than those for our best lexical model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Results on the development set. Here, shaar is a baseline submission (Elasticsearch) by the organizers.</p><p>Rank Team MAP@5 MAP@1 RR P@3 P@5 </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.3.">Reranking</head><p>Reranking with LambdaMART improved the MAP@5 to 0.941 on the development set. The results for the other participants are shown in Table <ref type="table">3</ref>. We further estimated the importance of each of the 12 features using the trained LambdaMART model (Table <ref type="table" target="#tab_2">4</ref>). These results confirm that the most important features come from sentence-BERT (the semantic component) using the claim with the title as an input. However, the TF.IDF approaches (the lexical component) also have relatively high importance. In particular, the importance of the similarity score predicted by the TF.IDF approach on the input &lt;Claim + Title + Subtitle, VerClaim&gt; is higher than that of the base sentence-BERT model on the same input. If we completely exclude the results of the lexical component from the features, the MAP@5 on the development set drops to 0.899.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.4.">Official Results on the Test Set</head><p>The official evaluation results on the test set are presented in Table <ref type="table">5</ref>. We can see that our system outperforms the systems of the other participants, as well as the organizers' baseline, by a large margin. The table also demonstrates the stability of our solution: the test performance is consistent with what we observed on the validation set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 5</head><p>Official results on the test set. shaar is a baseline submission (Elasticsearch) of the competition organizers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Rank Team</head><p>MAP@5 MAP@1 RR P@3 P@5 </p><formula xml:id="formula_0">1</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Conclusion and Future Work</head><p>We have described our system for the CLEF 2021 CheckThat! Lab Task 2 Subtask A (English) on detecting previously fact-checked claims. We developed a pipeline using TF.IDF, fine-tuned sentence-BERT, and reranking using LambdaMART, which used similarity scores and ranks as features. We examined the performance of each model on the validation set and analyzed its contribution to the final reranker. The official evaluation ranked our system 1st by a wide margin ahead of the other participants and the organizers' baseline.</p><p>In future work, we plan to experiment with other Transformer-based sentence encoders, such as RoBERTa <ref type="bibr" target="#b21">[22]</ref> and MPNet <ref type="bibr" target="#b22">[23]</ref>. Another direction we want to explore is the use of other potentially relevant data besides STSb for model pre-training.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: For the input Claim, TF.IDF and sentence-BERT independently evaluate the relevance of each VerClaim from the database, returning a similarity score and a position in a fully ranked list. The LambdaMART model then reranks the top-20 results from the sentence-BERT model using all predicted scores and positions as features.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: For the batch of positive pairs &lt;Claim, VerClaim&gt;, the Multiple Negatives Ranking loss contrasts the similarities between the input claim 𝑐 𝑖 and the relevant verified claim 𝑣𝑐 𝑖 vs. between 𝑐 𝑖 and all other 𝑣𝑐 𝑗 in the batch, using a softmax. • denotes the dot-product.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Lexical model comparison on the development set.</figDesc><table><row><cell>Method</cell><cell>Input type</cell><cell cols="3">MAP@5 MAP@1 P@3 P@5</cell></row><row><cell></cell><cell>Claim</cell><cell>0.728</cell><cell>0.683</cell><cell>0.260 0.161</cell></row><row><cell>Elasticsearch</cell><cell>Claim+Title</cell><cell>0.834</cell><cell>0.781</cell><cell>0.295 0.182</cell></row><row><cell></cell><cell>Claim+Title+Subtitle</cell><cell>0.859</cell><cell>0.822</cell><cell>0.300 0.184</cell></row><row><cell></cell><cell>Claim</cell><cell>0.414</cell><cell>0.352</cell><cell>0.159 0.105</cell></row><row><cell>BM25 Okapi</cell><cell>Claim+Title</cell><cell>0.586</cell><cell>0.528</cell><cell>0.214 0.137</cell></row><row><cell></cell><cell>Claim+Title+Subtitle</cell><cell>0.646</cell><cell>0.608</cell><cell>0.230 0.140</cell></row><row><cell></cell><cell>Claim</cell><cell>0.662</cell><cell>0.577</cell><cell>0.250 0.155</cell></row><row><cell>TF.IDF</cell><cell>Claim+Title</cell><cell>0.832</cell><cell>0.779</cell><cell>0.298 0.183</cell></row><row><cell></cell><cell>Claim+Title+Subtitle</cell><cell>0.861</cell><cell>0.819</cell><cell>0.305 0.184</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Semantic model comparison on the development set.</figDesc><table><row><cell>Method</cell><cell>Input type</cell><cell cols="3">MAP@5 MAP@1 P@3 P@5</cell></row><row><cell></cell><cell>Claim</cell><cell>0.826</cell><cell>0.784</cell><cell>0.290 0.177</cell></row><row><cell>sentence-BERT</cell><cell>Claim+Title</cell><cell>0.872</cell><cell>0.839</cell><cell>0.302 0.185</cell></row><row><cell></cell><cell>Claim+Title+Subtitle</cell><cell>0.882</cell><cell>0.849</cell><cell>0.307 0.185</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 4</head><label>4</label><figDesc>Evaluation of the importance of 12 features produced by the pipeline components. It is estimated by the LambdaMART model. Each model provides two features: RR (Reciprocal Rank, that is the position in the ranked list) and Sim. score (the predicted similarity score).</figDesc><table><row><cell>1</cell><cell>aschern</cell><cell>0.941</cell><cell>0.932</cell><cell cols="2">0.940 0.318 0.191</cell></row><row><cell>2</cell><cell>simihaylova</cell><cell>0.936</cell><cell>0.927</cell><cell cols="2">0.935 0.315 0.190</cell></row><row><cell>3</cell><cell>gs_chm</cell><cell>0.902</cell><cell>0.857</cell><cell cols="2">0.901 0.318 0.192</cell></row><row><cell>4</cell><cell>shaar</cell><cell>0.818</cell><cell>0.776</cell><cell cols="2">0.820 0.286 0.177</cell></row><row><cell></cell><cell>Method</cell><cell>Input type</cell><cell></cell><cell>RR</cell><cell>Sim. score</cell></row><row><cell></cell><cell></cell><cell>Claim</cell><cell></cell><cell>0.070</cell><cell>0.054</cell></row><row><cell></cell><cell>TF.IDF</cell><cell>Claim+Title</cell><cell></cell><cell>0.075</cell><cell>0.084</cell></row><row><cell></cell><cell></cell><cell cols="3">Claim+Title+Subtitle 0.057</cell><cell>0.088</cell></row><row><cell></cell><cell></cell><cell>Claim</cell><cell></cell><cell>0.078</cell><cell>0.066</cell></row><row><cell></cell><cell>sentence-BERT</cell><cell>Claim+Title</cell><cell></cell><cell>0.081</cell><cell>0.188</cell></row><row><cell></cell><cell></cell><cell cols="3">Claim+Title+Subtitle 0.077</cell><cell>0.081</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://github.com/UKPLab/sentence-transformers</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://github.com/jma127/pyltr</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Anton Chernyavskiy and Dmitry Ilvovsky performed this research in the framework of the HSE University Basic Research Program, funded by the Russian Academic Excellence Project 5-100.</p><p>Preslav Nakov contributed as part of the Tanbih mega-project (tanbih.qcri.org), developed at the Qatar Computing Research Institute, HBKU, which aims to limit the impact of "fake news", propaganda, and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news</title>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elsayed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Míguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haouari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hasanain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babulkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">K</forename><surname>Shahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mandl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 43rd European Conference on Information Retrieval, ECIR &apos;21</title>
				<meeting>the 43rd European Conference on Information Retrieval, ECIR &apos;21<address><addrLine>Lucca, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="639" to="649" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Sentence-BERT: Sentence embeddings using Siamese BERTnetworks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP &apos;19</title>
				<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP &apos;19<address><addrLine>Hong Kong, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="3982" to="3992" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Adapting boosting for information retrieval measures</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Burges</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Svore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Retrieval</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="254" to="270" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Detection and resolution of rumours in social media: A survey</title>
		<author>
			<persName><forename type="first">A</forename><surname>Zubiaga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Aker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bontcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liakata</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Procter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A survey on truth discovery</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Han</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGKDD Explor. Newsl</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The spread of true and false news online</title>
		<author>
			<persName><forename type="first">S</forename><surname>Vosoughi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Aral</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science</title>
		<imprint>
			<biblScope unit="volume">359</biblScope>
			<biblScope unit="page" from="1146" to="1151" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Stance detection: A survey</title>
		<author>
			<persName><forename type="first">D</forename><surname>Küçük</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Can</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A survey on computational propaganda detection</title>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cresci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">D</forename><surname>Pietro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-PRICAI &apos;20</title>
				<meeting>the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-PRICAI &apos;20</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="4826" to="4832" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">CredEye: A credibility lens for analyzing and explaining misinformation</title>
		<author>
			<persName><forename type="first">K</forename><surname>Popat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Strötgen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Weikum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Web Conference, WWW &apos;18</title>
				<meeting>the Web Conference, WWW &apos;18</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="155" to="158" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Hardalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Arora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Augenstein</surname></persName>
		</author>
		<title level="m">A survey on stance detection for mis-and disinformation identification</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">ClaimBuster: The first-ever end-to-end fact-checking system</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hassan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Arslan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Caraballo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jimenez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gawsane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hasan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joseph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kulkarni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Nayak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sable</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tremayne</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proc. VLDB Endow</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="1945" to="1948" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">ClaimsKG: A knowledge graph of fact-checked claims</title>
		<author>
			<persName><forename type="first">A</forename><surname>Tchechmedjiev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fafalios</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Boland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gasquet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zapilko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Todorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th International Semantic Web Conference, ISWC &apos;19</title>
				<meeting>the 18th International Semantic Web Conference, ISWC &apos;19<address><addrLine>Auckland, New Zealand</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="309" to="324" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">That is a known lie: Detecting previously fact-checked claims</title>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babulkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL &apos;20</title>
				<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, ACL &apos;20</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3607" to="3618" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elsayed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hasanain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Suwaileh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haouari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babulkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hamdan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">S</forename><surname>Ali</surname></persName>
		</author>
		<title level="m">Overview of CheckThat! 2020: Automatic identification and verification of claims in social media</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>CLEF</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Bouziane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Perrin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cluzeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mardas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sadeq</surname></persName>
		</author>
		<title level="m">Team Buster.ai at CheckThat! 2020: Insights and recommendations to improve fact-checking</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>CLEF</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">UNIPI-NLE at CheckThat! 2020: Approaching fact checking from a sentence similarity perspective through the lens of transformers</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bondielli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Marcelloni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CLEF</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><surname>Thuma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Motlogelwa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Leburu-Dingalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mudongo</surname></persName>
		</author>
		<title level="m">UB_ET at CheckThat! 2020: Exploring ad hoc retrieval approaches in verified claims retrieval</title>
				<imprint>
			<publisher>CLEF</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Efficient natural language response suggestion for smart reply</title>
		<author>
			<persName><forename type="first">M</forename><surname>Henderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Al-Rfou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Strope</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-H</forename><surname>Sung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lukács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Miklos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kurzweil</surname></persName>
		</author>
		<idno>arXiv 1705.00652</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Khosla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Teterwak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sarna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Isola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Maschinot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Krishnan</surname></persName>
		</author>
		<idno>arXiv 2004.11362</idno>
		<title level="m">Supervised contrastive learning</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Yahoo! learning to rank challenge overview</title>
		<author>
			<persName><forename type="first">O</forename><surname>Chapelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research - Proceedings Track</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="1" to="24" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">The probabilistic relevance framework: BM25 and beyond</title>
		<author>
			<persName><forename type="first">S</forename><surname>Robertson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zaragoza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Found. Trends Inf. Retr</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="333" to="389" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno>arXiv 1907.11692</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">MPNet: Masked and permuted pre-training for language understanding</title>
		<author>
			<persName><forename type="first">K</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Tan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Liu</surname></persName>
		</author>
		<idno>arXiv 2004.09297</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
