<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Tonirodriguez at CheckThat!2024: Is it Possible to Use Zero-Shot Cross-Lingual Methods for Subjectivity Detection in Low-Resources Languages? Notebook for the CheckThat! Lab Task2 at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Antonio</forename><surname>Rodríguez</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">La Salle Engineering</orgName>
								<orgName type="institution">Universitat Ramon Llull</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Elisabet</forename><surname>Golobardes</surname></persName>
							<email>elisabet.golobardes@salle.url.edu</email>
							<affiliation key="aff1">
								<orgName type="department">La Salle Engineering</orgName>
								<orgName type="institution">Universitat Ramon Llull</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jaume</forename><surname>Suau</surname></persName>
							<email>jsuau@blanquerna.url.edu</email>
							<affiliation key="aff2">
								<orgName type="department">Blanquerna</orgName>
								<orgName type="institution">Universitat Ramon Llull</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Tonirodriguez at CheckThat!2024: Is it Possible to Use Zero-Shot Cross-Lingual Methods for Subjectivity Detection in Low-Resources Languages? Notebook for the CheckThat! Lab Task2 at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">CBAB874EEC582B7D3A92AD295C2A8710</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:03+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Subjectivity Detection</term>
					<term>Natural Language Processing</term>
					<term>Fake News</term>
					<term>Journalism</term>
					<term>Misinformation</term>
					<term>Transformers</term>
					<term>Cross-lingual</term>
					<term>Transfer Learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Subjectivity detection is a key task within natural language processing due to the challenges generated by new forms of journalism, the proliferation of misinformation and fake news, and existing concerns about the quality and integrity of journalism. Although subjectivity detection is an existing challenge in all languages, the amount of resources available to build these types of applications varies greatly among languages. In this paper, we present our participation in the CLEF2024 CheckThat! Lab Task2 [1], where we have attempted to apply Zero-Shot Cross-Lingual transfer techniques using the datasets for the five languages provided in Task2 (English, German, Italian, Bulgarian, and Arabic). For this, we have fine-tuned two multilingual models, mDeBERTa v3 and XLM-RoBERTa, on a subset of the dataset consisting of three of the languages provided in Task2, specifically English, German, and Italian, and we have applied Zero-Shot Cross-Lingual transfer to the other two languages available in Task2, Arabic and Bulgarian.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Currently, the proliferation of news sites and the widespread use of social networks have revolutionized the way news is consumed, giving rise to new forms of journalism <ref type="bibr" target="#b1">[2]</ref>. However, these changes have introduced several challenges, including the proliferation of misinformation and fake news, the formation of "echo chambers" where news consumers limit their exposure to different points of view, and emerging concerns about the quality and integrity of journalism <ref type="bibr" target="#b2">[3]</ref>. A common element in many of the identified challenges is the need to distinguish whether a news author is sharing objective information or expressing their own opinions, desires, or biases <ref type="bibr" target="#b3">[4]</ref>  <ref type="bibr" target="#b4">[5]</ref>. The goal of Subjectivity Detection (SD) is to develop computational systems capable of implementing a binary classifier that can determine whether a text is objective or subjective.</p><p>CLEF2024 CheckThat! Lab Task2 <ref type="bibr" target="#b0">[1]</ref> provides an opportunity to work on the challenges associated with subjectivity detection. This task aims to construct a binary classifier that can identify whether a text sequence, in the form of a sentence, is subjective or objective <ref type="bibr" target="#b5">[6]</ref>. For the execution of Task2, the organizers have published five datasets in different languages (English, German, Italian, Bulgarian, and Arabic), plus an additional dataset that combines the previous five languages for the multilingual version of the task. The evaluation of the results presented will be carried out through the macro-averaged F1 between the two classes. This paper begins with the "Related Work" section, where a comprehensive review of previous research and studies relevant to the topic is conducted. 
This is followed by the "Data" section, which provides a detailed description of the structure and characteristics of the datasets provided for Task2. The "Approach" section outlines the phases and techniques employed to conduct the research. In the "Results" section, the findings obtained from the implementation of the models used are presented and analysed using the macro F1 metric. Finally, the "Conclusions" section provides a summary of the results, discusses the implications of the research, and suggests possible directions for future research.</p></div>
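Since the task ranks systems by the macro-averaged F1 between the OBJ and SUBJ classes, it may help to make the metric concrete. Below is a minimal pure-Python sketch of macro F1 (not the organizers' official scorer): it computes precision, recall, and F1 per class, then averages the per-class F1 values so that both classes count equally regardless of their frequency.

```python
def macro_f1(gold, pred, labels=("OBJ", "SUBJ")):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    f1s = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1s.append(f1)
    return sum(f1s) / len(f1s)
```

Because each class contributes equally to the average, a classifier that predicts only the majority OBJ class is penalized heavily on the SUBJ component, which matters given the class imbalance described in the Data section.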
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>According to Liu <ref type="bibr" target="#b6">[7]</ref>, Subjectivity Detection (SD) is a field of study traditionally encompassed within a broader field known as Sentiment Analysis (SA), also referred to as opinion mining. Sentiment analysis is the field of study that analyses people's opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes. Sentiment Analysis has been studied extensively over the last two decades.</p><p>Chaturvedi <ref type="bibr" target="#b7">[8]</ref> categorizes methods for subjectivity detection into two main types: traditional syntax-centered NLP methods and semantics-based NLP approaches. Syntax-centered NLP can be broadly divided into three main categories: keyword spotting, lexical affinity, and statistical methods. The major issue with these methods is that they are highly language-specific and require the existence of databases and resources for each language in which they are to be applied. To address this issue, solutions such as translating content from languages lacking these resources into resource-rich languages like English have been adopted. However, the translation of sentences can lead to the loss of lexical information, such as word sense, resulting in low accuracy <ref type="bibr" target="#b7">[8]</ref>.</p><p>On the other hand, semantic methods based on embeddings, RNNs, Convolutional Networks, and Transformers have gained significant relevance recently. They offer more accurate results than methods based on syntactic features, but they present their own challenges, as they require large datasets for each language in which we want to work. 
The creation of these datasets is complex and can generate problems such as ambiguity when classifying sentences <ref type="bibr" target="#b7">[8]</ref> or annotator bias <ref type="bibr" target="#b8">[9]</ref>. To avoid these problems, a recent paper published by F. Antici et al. <ref type="bibr" target="#b9">[10]</ref> proposes annotation guidelines with the aim of unifying criteria and avoiding the aforementioned problems while experimenting with monolingual, multilingual, and cross-lingual Transformer scenarios between the English and Italian languages.</p><p>Schumacher <ref type="bibr" target="#b10">[11]</ref>, starting with a multilingual BERT model, achieves good results for cross-language entity linking. From there, he explores Zero-Shot Cross-Lingual transfer between different languages and obtains robust results, with a slight degradation when the model is applied to a language for which fine-tuning has not been performed. He concludes that although multilingual Transformer models transfer well between languages, issues remain in disambiguating similar entities unseen in training.</p><p>The objective of this paper is to address the question of the viability of using Zero-Shot Cross-Lingual transfer for subjectivity detection. To this end, we will fine-tune two multilingual Transformer models and analyze the results obtained within the framework of the CLEF2024 CheckThat! Lab Task2 <ref type="bibr" target="#b0">[1]</ref>. To achieve this goal, we will employ DeBERTa <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref> and RoBERTa <ref type="bibr" target="#b13">[14]</ref> for the monolingual approach and their multilingual versions, MDeBERTa <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref> and XLM-RoBERTa <ref type="bibr" target="#b13">[14]</ref>, respectively, for the multilingual approach. 
These models are evolutions built upon BERT that significantly enhance the results achieved by multilingual BERT, particularly in low-resource languages <ref type="bibr" target="#b14">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Data</head><p>The six datasets provided for the execution of Task2 exhibit varying characteristics in terms of size and distribution of objective and subjective sentences. In all datasets, objective sentences are labeled with the tag "OBJ", while subjective sentences are labeled as "SUBJ". As shown in Table <ref type="table" target="#tab_0">1</ref>, the Bulgarian dataset, which is the smallest, comprises a total of 1043 texts, 729 of which are included in the training dataset. In contrast, the Italian dataset contains a total of 2280 sentences, 1613 of which are in the training dataset. Furthermore, an examination of the datasets reveals a distribution bias in favour of the "OBJ" class across all datasets, although the extent of this bias varies depending on the language. For instance, while the bias is only 55.69% in favour of "OBJ" sentences in Bulgarian, this bias increases to 76.32% and 76.37% for Italian and Arabic, respectively. The multilingual dataset, the largest among all, is composed of a subset of sentences provided in each of the other datasets across all subsets (training, validation and test). However, due to its composition, it also exhibits a bias in favour of the "OBJ" class, accounting for 69.16% of the dataset.</p></div>
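The class-share percentages quoted above follow directly from the per-split counts in Table 1. As a small stdlib sketch of that computation (the label lists below are reconstructed from the Table 1 counts, not the actual datasets):

```python
from collections import Counter

def class_shares(labels):
    """Percentage share of each label in a split, rounded to two decimals."""
    counts = Counter(labels)
    total = len(labels)
    return {lab: round(100 * n / total, 2) for lab, n in counts.items()}

# Bulgarian training split from Table 1: 406 OBJ vs. 323 SUBJ sentences.
bg_train = ["OBJ"] * 406 + ["SUBJ"] * 323
# Multilingual training split from Table 1: 3568 OBJ vs. 1591 SUBJ sentences.
multi_train = ["OBJ"] * 3568 + ["SUBJ"] * 1591
```

Running `class_shares` on these reconstructions reproduces the 55.69% (Bulgarian) and 69.16% (multilingual) OBJ shares cited in the text.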
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Approach</head><p>In our research, we adopted a dual approach. Initially, we employed a monolingual approach that leveraged Transformers, placing the focus on the English language. Subsequently, we implemented a second phase, utilizing multilingual Transformers with a dual purpose: to enhance the results obtained in the first phase with the monolingual Transformers by increasing the size of the training set, and to verify the Zero-Shot Cross-Lingual transfer capabilities of the model. This means that a model that is fine-tuned in certain languages can be applied to other languages without any specific training.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Monolingual Models</head><p>The primary objective of the monolingual phase was to improve upon the baseline results provided by Task2. The baseline is based on a two-step approach. First, Sentence-BERT <ref type="bibr" target="#b16">[16]</ref> is used to transform each sentence into a high-dimensional vector representation capable of capturing its semantic meaning. In the second step, a classifier is constructed by training a Logistic Regression model on the vectors generated in the previous step. To improve the results provided by the baseline, we utilized various Transformers such as DeBERTa v3 Large <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>, RoBERTa Large <ref type="bibr" target="#b13">[14]</ref>, and BART Large MNLI <ref type="bibr" target="#b17">[17]</ref>, which uses the entailment approach <ref type="bibr" target="#b18">[18]</ref>.</p><p>BART Large MNLI <ref type="bibr" target="#b17">[17]</ref> is a Transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder, pretrained on English. BART is pretrained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. BART is particularly effective when fine-tuned for text generation tasks (e.g., summarization, translation) but also performs well on comprehension tasks (e.g., text classification, question answering). In this study, we selected the checkpoint for bart-large after it had been trained on the MultiNLI (MNLI) dataset. Yin et al. <ref type="bibr" target="#b18">[18]</ref> proposed a method for using pre-trained NLI models as ready-made Zero-Shot sequence classifiers. The method works by posing the sequence to be classified as the NLI premise and constructing a hypothesis from each candidate label.</p></div>
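The entailment method of Yin et al. [18] turns classification into natural language inference: each candidate label is verbalized into a hypothesis and scored for entailment against the sentence. In practice the premise-hypothesis pairs are scored by an MNLI-trained model (e.g., via the Hugging Face zero-shot-classification pipeline); the sketch below shows only the input construction and label selection. The hypothesis template and label names are illustrative assumptions, and `entailment_scores` stands in for the model's output.

```python
def build_nli_inputs(sentence, candidate_labels, template="This example is {}."):
    """Pose the sentence as the NLI premise and each verbalized label as a hypothesis."""
    return [(sentence, template.format(label)) for label in candidate_labels]

def pick_label(candidate_labels, entailment_scores):
    """Choose the label whose hypothesis received the highest entailment score."""
    best = max(range(len(candidate_labels)), key=lambda i: entailment_scores[i])
    return candidate_labels[best]
```

This is why an NLI checkpoint can classify without task-specific training: the label set is supplied at inference time through the hypotheses rather than learned output heads.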
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Multilingual Models</head><p>In the second phase of our study, we utilized multilingual Transformers. Although these models have architectures and training procedures similar to their monolingual counterparts, they differ in that the corpus used for their pretraining consists of documents in many languages. The multilingual Transformer models used in this study were MDeBERTa Base and XLM-RoBERTa Base. These models use masked language modeling as a pretraining objective and are trained jointly on texts in over one hundred languages. By pretraining on vast corpora across numerous languages, these multilingual Transformers enable Zero-Shot Cross-Lingual transfer. This implies that a model fine-tuned on one language can be applied to others without any additional training. The characteristics of these models are as follows:</p><p>MDeBERTa V3 Base <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>: mDeBERTa is a multilingual version of DeBERTa which uses the same structure as DeBERTa and was trained with CC100 multilingual data. The mDeBERTa V3 base model comes with 12 layers and a hidden size of 768. It has 86M backbone parameters, with a vocabulary containing 250K tokens, which introduces 190M parameters in the Embedding layer. This model was trained using the same 2.5TB CC100 data as XLM-R.</p><p>XLM-RoBERTa: XLM-RoBERTa is a multilingual version of RoBERTa. It is pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. Following the work of XLM and RoBERTa, the XLM-RoBERTa (XLM-R) model takes multilingual pretraining one step further by massively upscaling the training data <ref type="bibr" target="#b19">[19]</ref>. Using the Common Crawl corpus, its developers created a dataset with 2.5 terabytes of text; they then trained an encoder with MLM on this dataset. Since the dataset only contains data without parallel texts (i.e., translations), the TLM objective of XLM was dropped. 
This approach outperforms XLM and multilingual BERT variants by a large margin, especially on low-resource languages <ref type="bibr" target="#b14">[15]</ref>.</p><p>The objective pursued through this cross-lingual approach is to utilize the same model across different languages, as the resulting linguistic representations generalize well across languages for various downstream tasks, such as classification in our case. To this end, we have fine-tuned the multilingual models in English, German, and Italian, and applied them to the rest of the languages available in Task2, Arabic and Bulgarian.</p></div>
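To make the masked language modeling objective mentioned above concrete, here is a deliberately simplified sketch of the token-corruption step: a fraction of tokens is replaced by a mask symbol and the model must reconstruct the originals. (Real BERT-style MLM also keeps or randomizes some of the selected tokens rather than always masking them; the 15% rate and the `[MASK]` symbol are the conventional choices, assumed here.)

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Randomly replace ~15% of tokens with a mask symbol, as in the MLM
    pretraining objective; returns the corrupted sequence and a mapping
    from masked positions to the original tokens the model must predict."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets
```

Because this objective needs only raw monolingual text, it can be applied jointly to a corpus spanning over one hundred languages, which is what makes the shared multilingual representations (and thus Zero-Shot Cross-Lingual transfer) possible without parallel data.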
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>In the initial phase of this research, we focused on the English language, applying fine-tuning to various monolingual models with the aim of achieving optimal results as measured by the macro F1 metric, as outlined in the guidelines for Task2. We selected three distinct Transformer-based models for this purpose: DeBERTa Large, RoBERTa Large, and BART Large MNLI. We used Kaggle as the platform for training these models. The results of this process are presented in Table <ref type="table" target="#tab_1">3</ref>.</p><p>The models DeBERTa v3 Large and RoBERTa Large yield very similar results for the English language, with the best result being achieved by RoBERTa Large, scoring 0.74 on the test dataset. A much larger model, BART Large MNLI, which in principle should be capable of a greater understanding of language, performs worse, likely due to the dataset size not allowing it to generalize the characteristics of subjective language. As this model does not have an equivalent multilingual model, we have discarded it for the subsequent phases of the research. In any case, all trained models significantly outperform the baseline result provided for Task2 in English.</p><p>In the second phase of the research, we fine-tuned the multilingual models equivalent to the models selected in Phase 1 on a training dataset composed of the union of the data provided in Task2 for English, Italian, and German languages. Given the increased size of the training dataset, we used the base models, which are smaller in size, instead of the large models. Therefore, we replaced DeBERTa v3 Large with MDeBERTa v3 Base, and instead of RoBERTa Large, we used XLM-RoBERTa Base. As we can observe in Table <ref type="table" target="#tab_2">4</ref>, in all cases, the MDeBERTa v3 Base model outperforms the XLM-RoBERTa Base by a wide margin. 
In the case of the English language, we narrowly missed surpassing the result obtained by RoBERTa Large in the previous phase, but we matched the result obtained by DeBERTa v3 Large with a base model. The results obtained in the German and Italian languages are noteworthy, where we achieved scores of 0.85 and 0.83 respectively, significantly surpassing the baseline provided by Task2 for these languages.</p><p>In order to ensure the reproducibility of the results obtained with both the monolingual and multilingual approaches, Table <ref type="table" target="#tab_4">6</ref> displays the models, training dataset, and hyperparameters used to train the models that achieved the best results when applied to the Final Test Dataset.</p><p>Finally, we sought to verify the Zero-Shot Cross-Lingual properties of both models by applying the models trained with the English, Italian, and German language datasets to the test datasets for the Bulgarian and Arabic languages without any specific fine-tuning for them.</p><p>We can observe in Table <ref type="table" target="#tab_3">5</ref> that for both Arabic and Bulgarian languages, the results obtained in each case are worse than the baseline provided for both languages by Task2. Therefore, we must conclude that for subjectivity detection, there is no significant transfer of learning from one language to others without having seen examples in the second language during training. Consequently, we cannot rely on this feature of multilingual models for subjectivity detection in low-resource languages.</p><p>We believe that there could be several reasons why cross-lingual transfer has not worked, which should be analyzed in greater depth in subsequent studies. 
Lauscher <ref type="bibr" target="#b20">[20]</ref> highlights the pretraining corpus size of the target language and the structural similarity between languages as the main factors for the success of cross-lingual transfer.</p><p>In the final ranking for Task2, we achieved second position out of a total of 15 participating teams in the English language, with a final Macro F1 score of 0.7372 and a SUBJ F1 score of 0.58. In Arabic, we obtained fifth position out of a total of 7 participating teams, with a Macro F1 score of 0.4551 and a SUBJ F1 score of 0.25.</p></div>
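For reproducibility, the Table 6 settings can also be written down as configuration dicts. The key names below follow Hugging Face TrainingArguments naming conventions as an assumption; Table 6 reports the values but not the exact training API used.

```python
# Training setups from Table 6 for the best-performing runs, expressed as plain
# dicts that could be fed to a Trainer-style API (key names are assumed, values
# are as reported in Table 6).
best_monolingual = {
    "model": "FacebookAI/roberta-large",
    "training_dataset": "Train_EN",
    "num_train_epochs": 5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "warmup_steps": 500,
}

best_multilingual = {
    "model": "microsoft/mdeberta-v3-base",
    "training_dataset": "Train_EN+Train_IT+Train_DE",
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "warmup_steps": 200,
}
```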
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>Our contribution to the CheckThat!2024 Lab Task2 on Subjectivity <ref type="bibr" target="#b0">[1]</ref> aimed to determine, based on the provided datasets, whether it is possible to use the Zero-Shot Cross-Lingual feature of multilingual models to detect subjectivity in low-resource languages. The conclusion we reached is that it is not possible. However, given that this is a widespread problem that applies to all languages, we believe it would be interesting to continue investigating other non-multilingual Transformer-based approaches to help detect subjectivity in low-resource languages. Although the answer to our research question was negative, during the research process, we fine-tuned an MDeBERTa v3 Base model that achieved second place for English in Task2, with a score of 0.7372. It also achieved excellent results for German and Italian, with scores of 0.85 and 0.83, respectively, although we did not actively participate in the competition for these languages. As future lines of work, we propose adding the Bulgarian and Arabic datasets, which we did not use to train the MDeBERTa v3 Base model, to see whether adding more languages improves the model. It would also be relevant to analyze the use of downsampling and oversampling techniques to mitigate the class imbalance between objective and subjective sentences present in the available datasets.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Distribution of the number of words per sentence for each of the languages considered in Task2.</figDesc><graphic coords="4,159.94,170.69,275.40,206.55" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Datasets and Distribution of classes</figDesc><table><row><cell>English:</cell><cell>Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell>532 (64.10%)</cell><cell>298 (35.90%)</cell><cell>830</cell></row><row><cell>Dev</cell><cell>106 (48.40%)</cell><cell>113 (51.60%)</cell><cell>219</cell></row><row><cell>Dev Test</cell><cell>116 (47.74%)</cell><cell>127 (52.26%)</cell><cell>243</cell></row><row><cell>Italian:</cell><cell>Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell cols="2">1231 (76.32%) 382 (23.68%)</cell><cell>1613</cell></row><row><cell>Dev</cell><cell>167 (73.57%)</cell><cell>60 (26.43%)</cell><cell>227</cell></row><row><cell>Dev Test</cell><cell>323 (73.41%)</cell><cell>117 (26.59%)</cell><cell>440</cell></row><row><cell>German:</cell><cell>Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell>492 (61.50%)</cell><cell>308 (38.50%)</cell><cell>800</cell></row><row><cell>Dev</cell><cell>123 (61.50%)</cell><cell>77 (38.50%)</cell><cell>200</cell></row><row><cell>Dev Test</cell><cell>194 (66.67%)</cell><cell>97 (33.33%)</cell><cell>291</cell></row><row><cell>Bulgarian:</cell><cell>Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell>406 (55.69%)</cell><cell>323 (44.31%)</cell><cell>729</cell></row><row><cell>Dev</cell><cell>59 (55.66%)</cell><cell>47 (44.34%)</cell><cell>106</cell></row><row><cell>Dev Test</cell><cell>116 (55.77%)</cell><cell>92 (44.23%)</cell><cell>208</cell></row><row><cell>Arabic:</cell><cell>Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell>905 (76.37%)</cell><cell>280 (23.63%)</cell><cell>1185</cell></row><row><cell>Dev</cell><cell>227 (76.43%)</cell><cell>70 (23.57%)</cell><cell>297</cell></row><row><cell>Dev 
Test</cell><cell>363 (81.57%)</cell><cell>82 (18.43%)</cell><cell>445</cell></row><row><cell cols="2">Multilingual: Objective</cell><cell>Subjective</cell><cell>Total</cell></row><row><cell>Train</cell><cell cols="3">3568 (69.16%) 1591 (30.84%) 5159</cell></row><row><cell>Dev</cell><cell>250 (50.00%)</cell><cell>250 (50.00%)</cell><cell>500</cell></row><row><cell>Dev Test</cell><cell>250 (50.00%)</cell><cell>250 (50.00%)</cell><cell>500</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3</head><label>3</label><figDesc>Results of the Monolingual Models trained in EN and applied to the Final Test dataset for EN.</figDesc><table><row><cell></cell><cell cols="2">F1 Macro SUBJ F1</cell></row><row><cell>Baseline EN</cell><cell>0.63</cell><cell>0.45</cell></row><row><cell>DeBERTa V3 Large</cell><cell>0.73</cell><cell>0.60</cell></row><row><cell>RoBERTa Large</cell><cell>0.74</cell><cell>0.59</cell></row><row><cell cols="2">BART Large MNLI 0.69</cell><cell>0.51</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 4</head><label>4</label><figDesc>Results of the Multilingual Models trained in EN+IT+DE and applied to the Final Test datasets for EN,IT,DE.</figDesc><table><row><cell></cell><cell></cell><cell>Baseline</cell><cell cols="4">MDeBERTa V3 Base XLM-RoBERTa Base</cell></row><row><cell></cell><cell cols="6">F1 Macro SUBJ F1 F1 Macro SUBJ F1 F1 Macro SUBJ F1</cell></row><row><cell cols="2">English 0.63</cell><cell>0.45</cell><cell>0.73</cell><cell>0.58</cell><cell>0.69</cell><cell>0.50</cell></row><row><cell cols="2">German 0.69</cell><cell>0.63</cell><cell>0.85</cell><cell>0.80</cell><cell>0.82</cell><cell>0.75</cell></row><row><cell>Italian</cell><cell>0.63</cell><cell>0.50</cell><cell>0.83</cell><cell>0.74</cell><cell>0.65</cell><cell>0.43</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 5</head><label>5</label><figDesc>Results of the Multilingual Models trained in EN+IT+DE and applied to the Final Test datasets for AR, BG.</figDesc><table><row><cell></cell><cell></cell><cell>Baseline</cell><cell cols="4">MDeBERTa V3 Base XLM-RoBERTa Base</cell></row><row><cell></cell><cell cols="6">F1 Macro SUBJ F1 F1 Macro SUBJ F1 F1 Macro SUBJ F1</cell></row><row><cell>Arabic</cell><cell>0.49</cell><cell>0.40</cell><cell>0.48</cell><cell>0.29</cell><cell>0.45</cell><cell>0.23</cell></row><row><cell cols="2">Bulgarian 0.75</cell><cell>0.72</cell><cell>0.69</cell><cell>0.61</cell><cell>0.64</cell><cell>0.53</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 6</head><label>6</label><figDesc>Hyperparameters for the best performing models</figDesc><table><row><cell cols="2">Hyperparameter Best Monolingual Model</cell><cell>Best Multilingual Model</cell></row><row><cell>Model</cell><cell>FacebookAI/roberta-large</cell><cell>microsoft/mdeberta-v3-base</cell></row><row><cell>Training Dataset</cell><cell>Train_EN</cell><cell>Train_EN+Train_IT+Train_DE</cell></row><row><cell>Num Train Epochs</cell><cell>5</cell><cell>3</cell></row><row><cell>Train Batch Size</cell><cell>8</cell><cell>16</cell></row><row><cell>Eval Batch Size</cell><cell>8</cell><cell>8</cell></row><row><cell>Learning Rate</cell><cell>5e-5</cell><cell>2e-5</cell></row><row><cell>Weight Decay</cell><cell>0.01</cell><cell>0.01</cell></row><row><cell>Warmup Steps</cell><cell>500</cell><cell>200</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research was made possible through the funding of Project TED2021-130810B-C22 by the Ministry of Science and Innovation of the Government of Spain.</p><p>The authors express their gratitude to the Smart Society Research Group at La Salle Engineering, Universitat Ramon Llull, the Digilab Research Group at Blanquerna, Universitat Ramon Llull and the anonymous reviewers for their insightful comments. They also extend their thanks to the organizers of the CheckThat!2024 Lab Task2 <ref type="bibr" target="#b0">[1]</ref> for making this lab possible.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of the CLEF-2024 CheckThat! lab task 2 on subjectivity in news articles</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dimitrov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Galassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Pachov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Koychev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Siegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hasanain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Suwaileh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zaghouani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CLEF 2024</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>García Seco de Herrera</surname></persName>
		</editor>
		<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Networked Communication. People are the Message</title>
		<author>
			<persName><forename type="first">G</forename><surname>Cardoso</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
			<publisher>Editora Mundos Sociais</publisher>
			<pubPlace>Lisboa</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Nielsen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ganter</surname></persName>
		</author>
		<idno type="DOI">10.1093/oso/9780190908850.001.0001</idno>
		<title level="m">The Power of Platforms: Shaping Media and Society</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Journalistic power: Constructing the &quot;truth&quot; and the economics of objectivity</title>
		<author>
			<persName><forename type="first">G</forename><surname>Canella</surname></persName>
		</author>
		<idno type="DOI">10.1080/17512786.2021.1914708</idno>
		<ptr target="https://doi.org/10.1080/17512786.2021.1914708" />
	</analytic>
	<monogr>
		<title level="j">Journalism Practice</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="209" to="225" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Evolving journalism norms: objective, interpretive and fact-checking journalism</title>
		<author>
			<persName><forename type="first">J</forename><surname>Birks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Routledge companion to political journalism</title>
				<meeting><address><addrLine>London</addrLine></address></meeting>
		<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="62" to="71" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The CLEF-2024 CheckThat! Lab: Check-worthiness, subjectivity, persuasion, roles, authorities, and adversarial robustness</title>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chakraborty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elsayed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Przybyła</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haouari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hasanain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Suwaileh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Information Retrieval</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Goharian</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Tonellotto</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Lipani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>McDonald</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Ounis</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham, Switzerland</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Nature</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="449" to="458" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<title level="m">Sentiment Analysis and Opinion Mining</title>
				<imprint>
			<publisher>Morgan and Claypool Publishers</publisher>
			<date type="published" when="2012-05">May 2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Distinguishing between facts and opinions for sentiment analysis: Survey and challenges</title>
		<author>
			<persName><forename type="first">I</forename><surname>Chaturvedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cambria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Welsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Herrera</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:46764901" />
	</analytic>
	<monogr>
		<title level="j">Inf. Fusion</title>
		<imprint>
			<biblScope unit="volume">44</biblScope>
			<biblScope unit="page" from="65" to="77" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Are we modeling the task or the annotator? an investigation of annotator bias in natural language understanding datasets</title>
		<author>
			<persName><forename type="first">M</forename><surname>Geva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Berant</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D19-1107</idno>
		<ptr target="https://aclanthology.org/D19-1107" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Inui</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">X</forename><surname>Wan</surname></persName>
		</editor>
		<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics<address><addrLine>Hong Kong, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1161" to="1166" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A corpus for sentence-level subjectivity detection on English news articles</title>
		<author>
			<persName><forename type="first">F</forename><surname>Antici</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Galassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Korre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Muti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fedotova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2024.lrec-main.25" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M.-Y</forename><surname>Kan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Hoste</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Lenci</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Sakti</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Xue</surname></persName>
		</editor>
		<meeting>the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)<address><addrLine>Torino, Italia</addrLine></address></meeting>
		<imprint>
			<publisher>ELRA and ICCL</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="273" to="285" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Cross-lingual transfer in zero-shot cross-language entity linking</title>
		<author>
			<persName><forename type="first">E</forename><surname>Schumacher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mayfield</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dredze</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.findings-acl.52</idno>
		<ptr target="https://aclanthology.org/2021.findings-acl.52" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Zong</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="583" to="595" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">DeBERTa: Decoding-enhanced BERT with disentangled attention</title>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=XPZIaotutsD" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2111.09543</idno>
		<title level="m">DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.11692</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:198953378" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Natural Language Processing with Transformers</title>
		<author>
			<persName><forename type="first">L</forename><surname>Tunstall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Von Werra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>O&apos;Reilly Media, Inc</publisher>
			<pubPlace>Sebastopol, CA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Sentence-BERT: Sentence embeddings using Siamese BERT-networks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D19-1410</idno>
		<ptr target="https://aclanthology.org/D19-1410" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Inui</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">X</forename><surname>Wan</surname></persName>
		</editor>
		<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics<address><addrLine>Hong Kong, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="3982" to="3992" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ghazvininejad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.703</idno>
		<ptr target="https://aclanthology.org/2020.acl-main.703" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="7871" to="7880" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roth</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.00161</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:202540839" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Unsupervised cross-lingual representation learning at scale</title>
		<author>
			<persName><forename type="first">A</forename><surname>Conneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Khandelwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wenzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guzmán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.747</idno>
		<ptr target="https://aclanthology.org/2020.acl-main.747" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Chai</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Schluter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Tetreault</surname></persName>
		</editor>
		<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="8440" to="8451" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">From zero to hero: On the limitations of zero-shot language transfer with multilingual transformers</title>
		<author>
			<persName><forename type="first">A</forename><surname>Lauscher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ravishankar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Vulić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Glavaš</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.emnlp-main.363</idno>
		<ptr target="https://doi.org/10.18653/v1/2020.emnlp-main.363" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics</title>
				<meeting>the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
