<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">UNITOR @ Sardistance2020: Combining Transformer-based Architectures and Transfer Learning for Robust Stance Detection</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Simone</forename><surname>Giorgioni</surname></persName>
							<email>simone.giorgioni@alumni.uniroma2.eu</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Enterprise Engineering</orgName>
								<orgName type="institution">University of Roma Tor Vergata</orgName>
								<address>
									<addrLine>Via del Politecnico 1</addrLine>
									<postCode>00133</postCode>
									<settlement>Roma</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marcello</forename><surname>Politi</surname></persName>
							<email>marcello.politi@alumni.uniroma2.eu</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Enterprise Engineering</orgName>
								<orgName type="institution">University of Roma Tor Vergata</orgName>
								<address>
									<addrLine>Via del Politecnico 1</addrLine>
									<postCode>00133</postCode>
									<settlement>Roma</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Samir</forename><surname>Salman</surname></persName>
							<email>samir.salman@alumni.uniroma2.eu</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Enterprise Engineering</orgName>
								<orgName type="institution">University of Roma Tor Vergata</orgName>
								<address>
									<addrLine>Via del Politecnico 1</addrLine>
									<postCode>00133</postCode>
									<settlement>Roma</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Enterprise Engineering</orgName>
								<orgName type="institution">University of Roma Tor Vergata</orgName>
								<address>
									<addrLine>Via del Politecnico 1</addrLine>
									<postCode>00133</postCode>
									<settlement>Roma</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roberto</forename><surname>Basili</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Enterprise Engineering</orgName>
								<orgName type="institution">University of Roma Tor Vergata</orgName>
								<address>
									<addrLine>Via del Politecnico 1</addrLine>
									<postCode>00133</postCode>
									<settlement>Roma</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">UNITOR @ Sardistance2020: Combining Transformer-based Architectures and Transfer Learning for Robust Stance Detection</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">59675724C7DF6D38C6D399F05249860A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:05+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>English. This paper describes the UNITOR system that participated in the Stance Detection in Italian tweets (SardiStance) task within the context of EVALITA 2020. UNITOR implements a Transformer-based architecture whose accuracy is improved by adopting a Transfer Learning technique. In particular, this work investigates the possible contribution of three auxiliary tasks related to Stance Detection, i.e., Sentiment Detection, Hate Speech Detection and Irony Detection. Moreover, UNITOR relies on an additional dataset automatically downloaded and labeled through distant supervision. The UNITOR system ranked first in Task A within the competition. This confirms the effectiveness of Transformer-based architectures and the beneficial impact of the adopted strategies.</p><p>Italiano. Questo lavoro descrive UNITOR, uno dei sistemi partecipanti allo Stance Detection in Italian tweets (SardiStance) task. UNITOR implementa un'architettura neurale basata su Transformer, la cui accuratezza viene migliorata applicando un metodo di Transfer Learning, che sfrutta le informazioni di tre task ausiliari, ovvero Sentiment Detection, Hate Speech Detection e Irony Detection. Inoltre, l'addestramento di UNITOR può contare su un insieme di dati scaricati ed etichettati automaticamente applicando un semplice metodo di Distant Supervision. Il sistema si è classificato al primo posto nella competizione, confermando l'efficacia delle architetture basate su Transformer e il contributo delle strategie adottate.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Stance detection aims at detecting whether the author of a text is in favor of a target topic or against it <ref type="bibr" target="#b8">(Krejzl et al., 2017)</ref>. In this task, a text pair is generally considered: one text expresses the topic, while the other one reflects the author's judgment. In a possible variant of such a setting, the topic is implicit within an entire document collection over which the stance detection is applied.</p><p>In this work, we will consider this last setting, as defined in the Stance Detection in Italian Tweets (SardiStance) task <ref type="bibr" target="#b5">(Cignarella et al., 2020)</ref> within EVALITA 2020 <ref type="bibr" target="#b2">(Basile et al., 2020)</ref>. A set of texts (here tweets) is provided, almost all concerning the same topic, i.e., the Sardines Movement<ref type="foot" target="#foot_0">1</ref>. The goal is to recognize whether each tweet is for or against such a target (or neither), only exploiting textual information. According to the task definition, this corresponds to the so-called Task A. This is a quite challenging problem, since it requires at the same time discovering whether a text refers to the target topic and what the author's orientation is, relying only on short messages written in a very conversational style.</p><p>We thus present the UNITOR system participating in SardiStance Task A. The system is based on a Transformer-based architecture for text classification <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref> that is directly pre-trained over a large-scale document collection written in Italian, namely UmBERTo. In a nutshell, the adopted architecture, which has been demonstrated to achieve state-of-the-art results in many NLP tasks <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref>, takes a message as input and associates it with one of the target classes indicating the stance. 
Moreover, due to the task complexity and the small size of the dataset, we adopted a Transfer Learning approach <ref type="bibr" target="#b12">(Pan and Yang, 2010)</ref> in order to improve the generalization capabilities of the neural network. Our main assumption is that Stance Detection is tied to other tasks involving emotion and subjectivity analysis (such as Sentiment Analysis or Irony Detection), even though important differences do exist among them. As a simplified example, let us consider a message such as "I like the Sardines Movement": it clearly expresses a positive sentiment while also being in favour of the target topic. However, a message such as "I like the EVALITA campaign." is positive as well, but it does not express any support or opposition to the Sardines (and it should be associated with the None class). We thus speculate that an automatic system trained over an auxiliary task (e.g., Sentiment Classification) is beneficial, but the transfer process must be carefully designed in order to avoid catastrophic forgetting or interference problems <ref type="bibr" target="#b11">(Mccloskey and Cohen, 1989)</ref>.</p><p>In this work, we investigate the possible contribution of three auxiliary tasks involving the recognition of emotions according to different settings, i.e., Sentiment Detection and Classification, Hate Speech Detection and Irony Detection. We adopt three different classifiers (one for each auxiliary task) and use them to add information to the tweets provided in the SardiStance dataset. As an example, when considering the auxiliary task of Hate Detection, the corresponding classifier will augment each input tweet by indicating whether it expresses hate or not. After this step, the final classifier is expected to learn the association between messages and the stance categories, "being aware" (with some unavoidable noise) of whether the message expresses some sort of hate, irony and, more generally, sentiment. 
Finally, we investigate the possibility of augmenting the training material by automatically downloading messages and labeling them through distant supervision <ref type="bibr" target="#b7">(Go et al., 2009)</ref>. We first selected a few hashtags clearly in favour (or not) of the target topic in order to download and label a set of messages. Then, in order to add a set of neutral messages, we selected a set of news titles concerning the Sardines Movement.</p><p>The UNITOR system ranked first in the competition, suggesting that the combination of Transformer-based learning with the adopted strategies of Transfer Learning and Data Augmentation is beneficial. In the rest of the paper, Sec. 2 describes UNITOR. In Sec. 3, the evaluations are reported, while Sec. 4 derives the conclusions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Transformer-based Architectures and Transfer Learning for Stance Detection</head><p>The UNITOR system implements a Transformer-based architecture described in Section 2.1. The adopted auxiliary tasks are described in Section 2.2, while our Transfer Learning strategy is presented in Section 2.3. Finally, an automatic strategy for Data Augmentation is presented in Section 2.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">UNITOR as a Transformer-based Architecture</head><p>The approach proposed in <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref>, namely Bidirectional Encoder Representations from Transformers (BERT), provides a very effective model to pre-train a deep and complex neural network over large-scale collections of non-annotated texts and to apply it to a large variety of NLP tasks. The building block of BERT is the Transformer element <ref type="bibr" target="#b14">(Vaswani et al., 2017)</ref>, an attention-based mechanism that learns contextual relations between words in a text. BERT provides a sentence embedding (as well as the contextualized lexical embeddings of the words in the sentence) through a pre-training stage aiming at the acquisition of an expressive and robust language and text model. The Transformer reads the entire input sequence of words at once and is optimized through two pre-training tasks. The first pre-training objective is masked language modeling <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref>. In addition, a Next Sentence Prediction task is used to jointly pre-train text embeddings able to soundly represent discourse-level information. This last objective operates on text-pair representations and aims at modeling relational information, e.g., between consecutive sentences in a text. On top of the produced embeddings, BERT applies a fine-tuning stage devoted to adapting the entire architecture to the targeted task.</p><p>The fine-tuning process of BERT for sentence classification (adopted here) operates on single texts or text pairs, which can be given in input to BERT in analogy with the next sentence prediction task. The special token [CLS] is used as the first element of each input sequence, and the embedding produced by BERT for it is used as input to a linear classifier customized for the target classification task. 
While the BERT architecture is pre-trained on large-scale corpora, its application to new tasks is generally obtained by customizing the final classifier to the targeted problem and fine-tuning all the network parameters for a few epochs, to avoid catastrophic forgetting. In <ref type="bibr" target="#b10">(Liu et al., 2019b)</ref>, RoBERTa is proposed as a variant of BERT which modifies some key hyperparameters, including removing the next-sentence pre-training objective and training on more data, with much larger mini-batches and learning rates. This allows RoBERTa to improve on the masked language modeling objective compared with BERT and leads to better downstream task performance.</p><p>UNITOR is based on a RoBERTa architecture pre-trained over Italian texts: we adopted UmBERTo<ref type="foot" target="#foot_1">2</ref>, which is pre-trained over a subset of the OSCAR corpus, made of 11 billion tokens. These architectures achieve state-of-the-art results in a wide range of NLP tasks. However, they also rely on large-scale annotated datasets composed of (possibly hundreds of) thousands of examples. In order to improve the quality of this architecture in the SardiStance task, which provides a quite limited dataset, we adopted a simple Transfer Learning strategy relying on the following three auxiliary tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Supporting UNITOR through Auxiliary Tasks</head><p>In this work, we speculate that the complexity of the Stance Detection task can be reduced whenever the system to be trained is already aware of whether input messages express some sort of Sentiment, Irony or Hate. In order to expose UNITOR to such information, we trained specific classifiers over dedicated corpora made available in previous editions of EVALITA, as follows.</p><p>Sentiment Detection and Classification. This task consists in the automatic detection of subjectivity (and its possible positive or negative polarity) in texts <ref type="bibr" target="#b13">(Pang and Lee, 2008)</ref>. Even though Stance Detection is clearly different from a traditional Sentiment Analysis task, we speculate that they are nevertheless related. As an example, we can suppose that the presence of stance is more probable in messages expressing subjectivity. We thus considered the setting proposed in SENTIPOLC 2016 <ref type="bibr" target="#b0">(Barbieri et al., 2016)</ref>, where a dataset of 8,000 tweets is made available. For each message, the presence of subjectivity is made explicit and, where present, its positive or negative polarity. The labeling provided in the dataset was slightly modified and mapped to a classification problem over three classes: all objective tweets were labeled with the special tag &lt;neutrale&gt;, the subjective and positive messages with &lt;positivo&gt;, and the negative ones with &lt;negativo&gt;<ref type="foot" target="#foot_2">3</ref>.</p><p>Irony Detection. We speculate that a robust detection of stance requires the recognition of irony, which can even reverse the output of the classification task. For example, a seemingly favourable stance can be expressed through an ironic message, such as "Le Sardine sono il futuro passato dell'Italia"<ref type="foot" target="#foot_3">4</ref>. 
The objective of Irony Detection is to detect whether a given message is ironic or not. We used the dataset provided in IronITA 2018 <ref type="bibr" target="#b4">(Cignarella et al., 2018)</ref>, where 4,800 labeled messages are made available. We adopted the original binary classification task, mapping messages to the &lt;ironico&gt; and &lt;non ironico&gt; labels.</p><p>Hate Speech Detection. Being against a topic can often be expressed through messages also expressing hate. We thus introduce the Hate Speech Detection task, which involves the automatic recognition of hateful contents. We considered the setting proposed in HaSpeeDe 2018 <ref type="bibr" target="#b4">(Bosco et al., 2018)</ref>, where a dataset of 3,000 messages is made available. We adopted the original binary classification task: we mapped messages expressing hate to the &lt;odio&gt; label and the remaining ones to &lt;non odio&gt;.</p></div>
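The three label mappings described above can be sketched as plain functions. This is a minimal sketch: the boolean annotation fields (subjective, positive, negative, ironic, hateful) are hypothetical simplifications of the actual SENTIPOLC, IronITA and HaSpeeDe formats.

```python
# Hedged sketch of the auxiliary label mappings described above.
# The boolean input fields are illustrative simplifications of the
# SENTIPOLC / IronITA / HaSpeeDe annotation schemes.

def sentipolc_tag(subjective, positive, negative):
    """Collapse SENTIPOLC annotations into the three pseudo-tokens;
    the few mixed-polarity messages are discarded (None)."""
    if positive and negative:
        return None  # mixed polarity: discarded upstream
    if subjective and positive:
        return "positivo"
    if subjective and negative:
        return "negativo"
    return "neutrale"

def ironita_tag(ironic):
    """Keep the original IronITA binary labels."""
    return "ironico" if ironic else "non ironico"

def haspeede_tag(hateful):
    """Keep the original HaSpeeDe binary labels."""
    return "odio" if hateful else "non odio"
```

The resulting tags are exactly the pseudo-tokens later used to augment the SardiStance tweets.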
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Transferring Auxiliary Tasks in the Transformer-based Learning</head><p>In order to transfer the information from each auxiliary task into UNITOR, we first trained a specific UmBERTo-based sentence classifier on each of the datasets described in the previous section. In each case, the standard parameters proposed in <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref> are used to fine-tune the model<ref type="foot" target="#foot_4">5</ref>. After these three training steps, the entire SardiStance dataset is processed by each of the three classifiers, and the resulting labels are used to "augment" the input messages. In particular, these labels generate a sort of new sentence, which is paired with the corresponding message. The following example shows how a tweet<ref type="foot" target="#foot_5">6</ref> against the movement is given in input to UNITOR:</p><formula xml:id="formula_0">"[CLS] negativo ironico odio [SEP]</formula><p>#elezioniregionali Le Sardine aiuteranno a salvare il Paese! #mafammilpiacere Sono proprio dei bei perdigiorno falliti! [SEP]"</p><p>Consistently with <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref>, the first pseudo-token [CLS] is added to generate the embedding used as input to the final linear classifier. Then, the pseudo-sentence "negativo ironico odio" suggests that the message expresses negative polarity and hate through the adoption of irony. Finally, between the [SEP] pseudo-tokens, the original message is reported. This particular schema resembles the classification of text pairs used in relational learning tasks, such as Textual Entailment <ref type="bibr" target="#b6">(Devlin et al., 2019)</ref>. 
The output of the auxiliary classifiers defines a sort of hypothesis, i.e., the author aims at expressing a negative sentiment through an ironic message which also expresses hate, while the original message is the direct consequence, i.e., the "implied" message<ref type="foot" target="#foot_6">7</ref>. The UNITOR model is thus an UmBERTo-based classifier trained over text pairs, where the first element encodes the information derived from the auxiliary tasks and the second one is the original message. Even though this labeling process can introduce noise (due to incorrectly classified messages), the augmented input is expected to simplify the final training process by explicitly providing information about sentiment, hate and irony.</p></div>
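The pairing schema above can be sketched as follows. This is only an illustrative sketch: the function name is an assumption, and the [CLS]/[SEP] pseudo-tokens are added later by a BERT-style tokenizer when it encodes a text pair, so the code only builds the auxiliary pseudo-sentence and pairs it with the tweet.

```python
def build_stance_input(tweet, sentiment, irony, hate):
    """Pair a pseudo-sentence of auxiliary labels with the original tweet.
    A BERT-style tokenizer then encodes the pair as
    "[CLS] sentiment irony hate [SEP] tweet [SEP]".
    (Hypothetical helper, not the authors' exact code.)"""
    aux_sentence = " ".join([sentiment, irony, hate])
    return aux_sentence, tweet

# The example from the paper:
pair = build_stance_input(
    "#elezioniregionali Le Sardine aiuteranno a salvare il Paese! "
    "#mafammilpiacere Sono proprio dei bei perdigiorno falliti!",
    sentiment="negativo", irony="ironico", hate="odio")
# e.g. passing both elements of `pair` to a HuggingFace tokenizer
# produces the text-pair encoding used for fine-tuning
```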
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Distant Supervision for Stance Detection</head><p>In order to compensate for the limited amount of available data (especially considering the complexity of the task), we augmented the training material by labeling additional messages via Distant Supervision <ref type="bibr" target="#b7">(Go et al., 2009)</ref>. We speculate that a tweet containing a hashtag such as #vivalesardine (in English: #ILikeSardines) is in favour of the Sardines, while a tweet containing, for example, #sardinefritte (in English: #friedSardines) is against the target. Hence, we downloaded 3,200 tweets from the TWITA corpus <ref type="bibr" target="#b1">(Basile and Nissim, 2013)</ref> and labeled them via Distant Supervision.</p><p>In particular, the following subsets were derived: 1,500 tweets against the movement, as they contain #gatticonsalvini, and 1,000 tweets in favour, as they contain #nessunotocchilesardine, #iostoconlesardine, #unmaredisardine, #vivalesardine or #forzasardine. Finally, to enlarge the subset of messages without stance, 700 neutral statements were downloaded; these are actually news titles, derived by querying "sardine" in Google News. In the experimental evaluations discussed in the next section, this dataset of "silver" data is simply added to the training material. To avoid over-fitting, we removed 90% of the occurrences of the hashtags used as queries from the new data.</p></div>
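The distant-supervision labeling and the hashtag removal can be sketched as follows. The hashtag sets come from the text above, but the function names and the token-level implementation are assumptions, not the authors' code.

```python
import random

# Query hashtags from the paper (used for distant supervision).
FAVOUR_TAGS = {"#nessunotocchilesardine", "#iostoconlesardine",
               "#unmaredisardine", "#vivalesardine", "#forzasardine"}
AGAINST_TAGS = {"#gatticonsalvini"}

def distant_label(tweet):
    """Assign a silver stance label based on the query hashtags, if any."""
    tokens = {t.lower() for t in tweet.split()}
    if tokens & FAVOUR_TAGS:
        return "Favour"
    if tokens & AGAINST_TAGS:
        return "Against"
    return None  # neutral news titles are collected separately

def strip_query_hashtags(tweet, rng, drop_prob=0.9):
    """Remove ~90% of the query-hashtag occurrences to avoid over-fitting:
    each occurrence is kept only when rng.random() exceeds drop_prob."""
    query_tags = FAVOUR_TAGS | AGAINST_TAGS
    kept = [t for t in tweet.split()
            if t.lower() not in query_tags or rng.random() > drop_prob]
    return " ".join(kept)
```

Deterministic removal (e.g., keeping every tenth occurrence) would work equally well; the random variant is shown because the paper only states the 90% proportion.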
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Results and Discussion</head><p>UNITOR participated in Task A - Textual Stance Detection <ref type="bibr" target="#b5">(Cignarella et al., 2020)</ref>, where the available dataset is composed of 2,132 tweets concerning the Sardines Movement: 1,028 tweets are against the movement (label Against), 589 tweets are in favour of it (label Favour) and 515 tweets do not express any stance about the target topic (label None).</p><p>As discussed in Section 2, UNITOR is based on the UmBERTo pre-trained model, which relies on the RoBERTa architecture. For parameter tuning, we adopted a 10-fold cross validation, so that the training material is divided into 10 folds, each split according to a 90%-10% proportion. The model is trained using a standard Cross-entropy Loss and the Adam optimizer, initialized with a learning rate of 2 · 10^-5 that is linearly decreased during the training process. We trained the model for 5 epochs, using a batch size of 32 elements. At test time, an Ensemble of such classifiers is used: each message is classified by all 10 models trained on the different folds, and the label suggested by the highest number of classifiers is selected. In Task A, we submitted two constrained runs, i.e., systems considering only tweets from the competition, and two unconstrained ones, where additional tweets were acquired and labeled by applying the approach presented in Section 2.4. All models are implemented using PyTorch<ref type="foot" target="#foot_7">8</ref> and experiments were run on Google Colab<ref type="foot" target="#foot_8">9</ref>.</p><p>Results are reported in Table <ref type="table" target="#tab_0">1</ref> in terms of Precision, Recall and F1 scores obtained by the different models with respect to each label. 
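The test-time majority voting over the 10 fold models can be sketched as follows. This is a minimal sketch: the stand-in classifiers are toy functions, whereas in the actual system each would be a fine-tuned UmBERTo model from one fold.

```python
from collections import Counter

def ensemble_predict(message, classifiers):
    """Classify a message with every fold model and return the label
    suggested by the highest number of classifiers (majority vote)."""
    votes = Counter(clf(message) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Toy stand-ins for the 10 fold-specific models:
fold_models = ([lambda m: "Against"] * 6 +
               [lambda m: "Favour"] * 3 +
               [lambda m: "None"] * 1)
```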
The final rank considers the average F1 (F1-avg) over the Favour and Against classes.</p><p>First of all, the high complexity of this task is confirmed by the results obtained by the strong Baseline method (the last row). It is a Support Vector Machine trained over a simple Bag-of-Words model <ref type="bibr" target="#b5">(Cignarella et al., 2020)</ref> and achieves an average F1 of 57.84%, being competitive with many systems participating in the task and ranking 13th over 22 submissions. One important result is obtained by the straight application of the UmBERTo model over the original messages (next to last row in Table 1): this Transformer-based architecture, empowered with the Ensemble technique, achieves an average F1 of 65.69%, so that a system directly applying an Ensemble of UmBERTo-based models would have ranked 6th in the competition.</p><p>We thus trained UmBERTo by adopting the Transfer Learning approach presented in Section 2.3 in the constrained setting. The adoption of all three auxiliary tasks led to the constrained submission called UNITOR_c_2. Moreover, we considered training UmBERTo with one auxiliary task at a time. When considering only the Hate Speech Detection task, better results were obtained over the development set than with the other tasks taken individually, i.e., Sentiment Detection and Irony Detection (the results of this tuning stage are not reported here for lack of space). Such a variant, called UNITOR_c_1, considers tweets enriched only with the information derived by the hate classifier, and it generally shows higher precision with respect to the Against class. This suggests that a tweet expressing hate is more likely to be in opposition to the Sardines Movement. The two constrained models ranked 3rd and 2nd in the competition, respectively. These results are impressive, as both outperformed the standard UmBERTo by about 2% of absolute F1. Moreover, they confirm the beneficial impact of Hate Speech Detection as an auxiliary task. Finally, we augmented the training dataset with the additional data presented in Section 2.4. We extended the training material used to train UNITOR_c_2 in order to obtain the unconstrained submission called UNITOR_u_2; it is worth noticing that all three auxiliary tasks were used in this submission. This led to a performance drop, i.e., a 66.06% average F1, which is lower than that of the best opponent system (66.21% F1). It seems that the noise added both by the auxiliary tasks and by the additional data negatively impacted the overall quality. On the contrary, when only the Hate Speech Detection task is considered (i.e., UNITOR_u_1), the additional data are positively capitalized by the model, which achieves the best average F1 score in the competition, i.e., 68.53%. These results suggest that the combination of Transformer-based learning with the adopted strategies of Transfer Learning and Data Augmentation is highly beneficial when only Hate is considered.</p><p>From an error analysis, it seems that a significant number of incorrect classifications occurred in longer and more complex messages, where the topic of the stance is neither clearly explicit nor captured by the UmBERTo model, such as in "#carfagna: "io per i liberali che non si affidano a Salvini" e "dalle sardine buone idee". Auto-scacco in due mosse. Con la Polverini poi..." 11 . This message is labeled Against, while the system assigns the label None. Here, it is very challenging to understand the connection between the "good ideas of the sardines" and the very colloquial expression "Auto-scacco", which can be translated as "She messed herself up". The same appears in the tweet "Ho finalmente capito chi mi ricordava Mattia Santori, quello delle sardine: Lodo Guenzi. (e infatti in quanto a democristianità stiamo là)" 12 , which is again labeled Against but classified as None. Clearly, the system is not able to link the movement to its leader, nor to the negative opinion about belonging to the Christian Democrat Party. Another example is the tweet "Dopo avere ascoltato @luigidimaio mi viene in mente una sola parola: grazie. Fiducia nelle sue scelte e immenso rispetto per i grandi risultati ottenuti. Ora un nuovo inizio, con un nuovo entusiasmo. Andiamo verso gli #statigenerali con serietà e maturità. Forza @mov5stelle!" 13 . Here the system incorrectly assigns the Favour label because the tweet is in favour of a different movement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>In this work, we presented the results obtained by the UNITOR system, which participated in the SardiStance task. UNITOR ranked first in Task A, both for the constrained and the unconstrained runs. These results confirm the beneficial impact of Transformer-based architectures for text classification also in the Stance Detection task. Moreover, we demonstrated the beneficial impact of Hate Speech Detection as an auxiliary task in a Transfer Learning setting. Finally, we empirically showed that the adoption of Distant Supervision is useful to reduce data sparseness. Future work will apply the above approaches to Task B within SardiStance. Moreover, we will investigate multitask learning approaches <ref type="bibr" target="#b9">(Liu et al., 2019a)</ref> to capitalize on information from auxiliary tasks in a more principled way.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Results obtained by UNITOR at the SardiStance task. Best results for each measure are in bold. In the system names, "c" and "u" refer to constrained and unconstrained runs.</figDesc><table><row><cell>Rk</cell><cell>System</cell><cell>F1-avg</cell><cell>F1 Against</cell><cell>F1 Favor</cell><cell>F1 None</cell><cell>Rec Against</cell><cell>Rec Favor</cell><cell>Rec None</cell><cell>Prec Against</cell><cell>Prec Favor</cell><cell>Prec None</cell></row><row><cell>1</cell><cell>UNITOR_u_1</cell><cell>68.53%</cell><cell>78.66%</cell><cell>58.40%</cell><cell>39.10%</cell><cell>76.01%</cell><cell>57.65%</cell><cell>45.35%</cell><cell>81.50%</cell><cell>59.16%</cell><cell>34.36%</cell></row><row><cell>2</cell><cell>UNITOR_c_1</cell><cell>68.01%</cell><cell>78.81%</cell><cell>57.21%</cell><cell>39.79%</cell><cell>74.66%</cell><cell>63.78%</cell><cell>43.60%</cell><cell>83.43%</cell><cell>51.87%</cell><cell>36.59%</cell></row><row><cell>3</cell><cell>UNITOR_c_2</cell><cell>67.93%</cell><cell>79.39%</cell><cell>56.47%</cell><cell>36.72%</cell><cell>77.09%</cell><cell>61.22%</cell><cell>37.79%</cell><cell>81.83%</cell><cell>52.40%</cell><cell>35.71%</cell></row><row><cell>4</cell><cell>Opponent_c_1</cell><cell>66.21%</cell><cell>75.80%</cell><cell>56.63%</cell><cell>42.13%</cell><cell>68.60%</cell><cell>64.29%</cell><cell>52.91%</cell><cell>84.69%</cell><cell>50.60%</cell><cell>35.00%</cell></row><row><cell>5</cell><cell>UNITOR_u_2</cell><cell>66.06%</cell><cell>76.89%</cell><cell>55.22%</cell><cell>37.02%</cell><cell>72.64%</cell><cell>56.63%</cell><cell>44.77%</cell><cell>81.67%</cell><cell>53.88%</cell><cell>31.56%</cell></row><row><cell>6</cell><cell>UmBERTo</cell><cell>65.69%</cell><cell>77.41%</cell><cell>53.97%</cell><cell>35.93%</cell><cell>74.12%</cell><cell>57.14%</cell><cell>40.11%</cell><cell>81.00%</cell><cell>51.14%</cell><cell>32.54%</cell></row><row><cell>13</cell><cell>Baseline</cell><cell>57.84%</cell><cell>71.58%</cell><cell>44.09%</cell><cell>27.64%</cell><cell>68.06%</cell><cell>49.49%</cell><cell>29.65%</cell><cell>75.49%</cell><cell>39.75%</cell><cell>25.89%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://en.wikipedia.org/wiki/Sardines_movement</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://huggingface.co/Musixmatch/ umberto-commoncrawl-cased-v1</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">We discarded the few available messages with mixed polarity, to simplify the final classification task.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">In English: "Sardines are the future past of Italy"</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">The number of epochs was tuned over a development set made of 10% of the corresponding dataset and the best epoch was selected by maximizing the classification accuracy.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">In English: "#regionalelections The Sardines will help to save the country! #please They're just a bunch of losers!"</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_6">We investigate different ways to encode this information, even using complex sentences, but negligible differences in the tuning process were measured, so we applied the simplest schema.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_7">https://pytorch.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_8">http://colab.research.google.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_9">In English: "#carfagna: "come with me, liberals who do not rely on Salvini" and "good ideas come from the Sardines movement". She messed herself up with two moves. Not to mention Polverini..."</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_10">In English: "I finally understood who Mattia Santori, the one from the Sardines movement, reminded me of: Lodo Guenzi. (in fact, as far as Christian Democrats are concerned, they are pretty much the same.)"</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of the EVALITA 2016 sentiment polarity classification task</title>
		<author>
			<persName><forename type="first">Francesco</forename><surname>Barbieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Malvina</forename><surname>Nissim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nicole</forename><surname>Novielli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of EVALITA 2016</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>EVALITA 2016<address><addrLine>Napoli, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-12">December 5-7, 2016</date>
			<biblScope unit="volume">1749</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Sentiment analysis on italian tweets</title>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Malvina</forename><surname>Nissim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</title>
				<meeting>the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis<address><addrLine>Atlanta</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="100" to="107" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">EVALITA 2020: Overview of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop</title>
				<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<meeting>Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop<address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m">Overview of the EVALITA 2018 hate speech detection task</title>
				<editor>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Felice</forename><surname>Dell'Orletta</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Fabio</forename><surname>Poletto</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Manuela</forename><surname>Sanguinetti</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maurizio</forename><surname>Tesconi</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note>EVALITA@CLiC-it</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Overview of the EVALITA 2018 task on irony detection in Italian tweets (IronITA)</title>
		<author>
			<persName><forename type="first">Alessandra</forename><forename type="middle">Teresa</forename><surname>Cignarella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Simona</forename><surname>Frenda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
				<meeting><address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">2263</biblScope>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">SardiStance@EVALITA2020: Overview of the Task on Stance Detection in Italian Tweets</title>
		<author>
			<persName><forename type="first">Alessandra</forename><forename type="middle">Teresa</forename><surname>Cignarella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mirko</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cristina</forename><surname>Bosco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Viviana</forename><surname>Patti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paolo</forename><surname>Rosso</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
				<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<meeting>the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian<address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">Jacob</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ming-Wei</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kenton</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kristina</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL 2019</title>
				<meeting>NAACL 2019<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019-06">2019. June</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Twitter sentiment classification using distant supervision</title>
		<author>
			<persName><forename type="first">Alec</forename><surname>Go</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richa</forename><surname>Bhayani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lei</forename><surname>Huang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Stance detection in online discussions</title>
		<author>
			<persName><forename type="first">Peter</forename><surname>Krejzl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Barbora</forename><surname>Hourová</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Josef</forename><surname>Steinberger</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Multi-task deep neural networks for natural language understanding</title>
		<author>
			<persName><forename type="first">Xiaodong</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pengcheng</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Weizhu</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jianfeng</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACL</title>
				<meeting>ACL<address><addrLine>Florence, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019-07">2019a. July</date>
			<biblScope unit="page" from="4487" to="4496" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Yinhan</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Myle</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Naman</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jingfei</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mandar</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Danqi</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Omer</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mike</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Luke</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Veselin</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno>CoRR, abs/1907.11692</idno>
		<imprint>
			<date type="published" when="2019">2019b</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Catastrophic interference in connectionist networks: The sequential learning problem</title>
		<author>
			<persName><forename type="first">Michael</forename><surname>Mccloskey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Neil</forename><forename type="middle">J</forename><surname>Cohen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Psychology of Learning and Motivation</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="104" to="169" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A Survey on Transfer Learning</title>
		<author>
			<persName><forename type="first">Sinno</forename><forename type="middle">Jialin</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Qiang</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page" from="1345" to="1359" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Opinion mining and sentiment analysis</title>
		<author>
			<persName><forename type="first">Bo</forename><surname>Pang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lillian</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Found. Trends Inf. Retr</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="1" to="135" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">Ashish</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Noam</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Niki</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jakob</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Llion</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aidan</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Łukasz</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Illia</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">I</forename><surname>Guyon</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">U</forename><forename type="middle">V</forename><surname>Luxburg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Fergus</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Vishwanathan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Garnett</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
