<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">DeepReading @ SardiStance: Combining Textual, Social and Emotional Features</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">María</forename><forename type="middle">S</forename><surname>Espinosa</surname></persName>
							<email>mespinosa@lsi.uned.es</email>
							<affiliation key="aff0">
								<orgName type="laboratory">NLP &amp; IR Group</orgName>
								<orgName type="institution">UNED</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rodrigo</forename><surname>Agerri</surname></persName>
							<email>rodrigo.agerri@ehu.eus</email>
							<affiliation key="aff1">
								<orgName type="department">HiTZ Center</orgName>
								<orgName type="institution">Ixa University of the Basque Country UPV/EHU</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alvaro</forename><surname>Rodrigo</surname></persName>
							<email>alvarory@lsi.uned.es</email>
							<affiliation key="aff2">
								<orgName type="laboratory">NLP &amp; IR Group</orgName>
								<orgName type="institution">UNED</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roberto</forename><surname>Centeno</surname></persName>
							<email>rcenteno@lsi.uned.es</email>
							<affiliation key="aff3">
								<orgName type="laboratory">NLP &amp; IR Group</orgName>
								<orgName type="institution">UNED</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">DeepReading @ SardiStance: Combining Textual, Social and Emotional Features</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">05571E0AACC9FDD9593D4E256015F52E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we describe our participation in the SardiStance shared task held at EVALITA 2020. We developed a set of classifiers that combine text features, obtained from the best performing systems based on large pre-trained language models, with user profile features, such as psychological traits and social media user interactions. The classification algorithms chosen for our models were various monolingual and multilingual Transformer models for text-only classification, and XGBoost for the non-textual features. The textual and contextual models were combined by a weighted-voting ensemble learning system. Our approach obtained the best score in Task B, on Contextual Stance Detection.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>One of the most important research topics in the field of Natural Language Processing (NLP) is automatic information extraction from textual data. The recent rise of social media has completely changed the way in which people communicate their ideas and has thus led to the emergence of new research problems regarding the automatic analysis of online contents, such as sentiment analysis, emotion recognition, or fake news detection. Stance detection (usually considered a subproblem of sentiment analysis) is part of the aforementioned family of research problems <ref type="bibr" target="#b7">(Küçük and Can, 2020)</ref>.</p><p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p><p>While there are various formulations of the stance detection task, for SardiStance 2020 the aim is to detect the stance (AGAINST, FAVOR or NEUTRAL) conveyed by a given tweet with respect to a specific, previously given topic <ref type="bibr" target="#b10">(Mohammad et al., 2016)</ref>, namely, the Sardines movement in Italy.</p><p>Thus, we address the problem of automatic stance detection in tweets written in Italian for the SardiStance 2020 shared task <ref type="bibr" target="#b3">(Cignarella et al., 2020)</ref>, organized within EVALITA 2020 <ref type="bibr" target="#b1">(Basile et al., 2020)</ref>. In this paper we describe the participation of three teams within the framework of the DeepReading project<ref type="foot" target="#foot_0">1</ref> : (1) the Ixa Group, (2) the UNED Group, and (3) the DeepReading Group. While Ixa focused on developing text classifiers based on textual information only (Task A), UNED was more interested in exploring how to use the available contextual information (Task B). 
Finally, DeepReading combines the Ixa and UNED systems into one.</p><p>In this sense, the main idea behind our model is to exploit textual information, based on fine-tuning large pre-trained language models for text classification, together with contextual information drawn from several feature categories, such as psychological traits of the user, social media data, and network-based features. As a result of our joint effort, we submitted 4 and 5 runs, respectively, to Tasks A and B. The official results show that our systems obtained the 3rd position among the constrained runs submitted to Task A, which considered only textual information for prediction, and the 1st position out of 13 participants for Task B, which considered textual and contextual information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Systems Description</head><p>In this section we first describe the text classification systems developed for Task A and then the contextual features used to train XGBoost classifiers for Task B. We also include a description of the strategies used to combine the classifiers from both tasks, which resulted in the winner system for Task B.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Task A: Textual Stance Detection</head><p>The main objective of our participation in Task A was to benchmark the performance, on the stance detection task for Italian, of large pre-trained language models based on the transformer architecture <ref type="bibr" target="#b15">(Vaswani et al., 2017)</ref>. This would help us identify the best performing models, which would then be leveraged to generate features for Task B (Contextual Stance Detection).</p><p>As for many other Natural Language Processing (NLP) tasks, the current best performing systems for text classification are based on large pre-trained language models, which allow us to build rich representations of text based on contextual word embeddings. Deep learning methods in NLP represent words as continuous vectors in a low-dimensional space, called word embeddings. The first approaches generated static word embeddings <ref type="bibr" target="#b9">(Mikolov et al., 2013;</ref><ref type="bibr" target="#b1">Bojanowski et al., 2017)</ref>, namely, they provided a unique vector-based representation for a given word independently of the context in which the word occurs. This means that polysemy cannot be represented.</p><p>In order to address this problem, contextual word embeddings were proposed. The idea is to generate word representations according to the context in which the word occurs. Currently there are many approaches to generate such contextual word representations, but we will focus on publicly available multilingual and monolingual pre-trained models for Italian.</p><p>There are several multilingual versions of these models. For instance, the multilingual version of <ref type="bibr">BERT (Devlin et al., 2019)</ref> was trained for the top 100 languages with the largest Wikipedias. 
More recently, XLM-RoBERTa <ref type="bibr" target="#b4">(Conneau et al., 2019)</ref> distributes a multilingual model which covers 104 languages, trained on 2.5 TB of Common Crawl data. Italian is included in both multilingual models.</p><p>These multilingual models perform very well in tasks involving high-resourced languages such as English or Spanish, but their performance drops when applied to languages not so well represented in the language model <ref type="bibr" target="#b0">(Agerri et al., 2020)</ref>. Although this is still an open issue, a number of reasons can be found in the literature. First, each language has to share the quota of substrings and parameters with the rest of the languages represented in the pre-trained multilingual model. As the quota of substrings partially depends on corpus size, this means that larger languages such as English or Spanish are better represented than other languages such as Italian. Moreover, multilingual models also seem to behave better for structurally similar languages <ref type="bibr" target="#b6">(Karthikeyan et al., 2020)</ref>.</p><p>We have benchmarked four monolingual pre-trained language models for Italian: AlBERTo, GilBERTo, UmBERTo and Italian BERT XXL, with the aim of comparing them with the previously mentioned multilingual pre-trained models, namely, mBERT and XLM-RoBERTa.</p><p>AlBERTo is a BERT base pre-trained lowercased model containing a vocabulary of 128k terms from 200M Italian tweets <ref type="bibr" target="#b13">(Polignano et al., 2019)</ref>.</p><p>The Italian BERT XXL models<ref type="foot" target="#foot_1">2</ref> are also based on the BERT base architecture. 
The training data contains the Italian Wikipedia, various parts of the OPUS corpus and the OSCAR corpus for Italian <ref type="bibr" target="#b12">(Ortiz Suárez et al., 2019)</ref>, for a total of 81GB of Italian text.</p><p>GilBERTo<ref type="foot" target="#foot_2">3</ref> is based on the RoBERTa base <ref type="bibr" target="#b8">(Liu et al., 2019)</ref> architecture, an improved, optimized version of BERT which discards the next sentence prediction task. The model was trained on the Italian OSCAR corpus (Ortiz <ref type="bibr" target="#b12">Suárez et al., 2019)</ref>, which contains 71GB of text. The vocabulary used consisted of 32k BPE subwords tokenized by the SentencePiece tokenizer<ref type="foot" target="#foot_3">4</ref> .</p><p>UmBERTo<ref type="foot" target="#foot_4">5</ref> also leverages the RoBERTa base architecture, the OSCAR corpus for Italian and the SentencePiece tokenizer, but it adds Whole Word Masking to the training process. The idea is to mask an entire word, instead of subwords, if at least one of the subword tokens generated by SentencePiece was originally selected for masking.</p></div>
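The Whole Word Masking idea described above can be illustrated with a short sketch. This is a minimal illustration, not UmBERTo's actual pre-training code: it assumes the SentencePiece convention in which a leading "▁" marks the start of a new word, and takes the initial subword-level selection as a given set of indices.

```python
def whole_word_mask(subwords, selected):
    """Expand a subword-level mask selection to whole words.

    `subwords` follow the SentencePiece convention where a leading
    "▁" marks a word start; if any subword of a word appears in
    `selected` (a set of indices), the entire word is masked.
    """
    # Group subword indices into words using the "▁" word-start marker.
    words, current = [], []
    for i, tok in enumerate(subwords):
        if tok.startswith("▁") and current:
            words.append(current)
            current = []
        current.append(i)
    if current:
        words.append(current)
    # Mask every subword of any word with at least one selected subword.
    masked = list(subwords)
    for word in words:
        if any(i in selected for i in word):
            for i in word:
                masked[i] = "[MASK]"
    return masked
```

For example, selecting only the trailing subword of "▁sardin a" masks both pieces of the word, while the neighbouring words are left untouched.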
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Task B: Contextual Stance Detection</head><p>In this task, we use several sets of features with the purpose of modelling the user's behaviour when writing a tweet. We obtain such features from both the text and the social network. Our hypothesis is that the stance of a user regarding a particular tweet is highly correlated with the user's own writing style, characterized in terms of psychological and emotional features. In addition, we explore the concept of "homophily", namely, the tendency of individuals to associate and bond with similar individuals, previously studied by <ref type="bibr">DellaPosta et al. (2015)</ref>. In order to test these hypotheses, we have evaluated the different models explained below.</p><p>The complete set of features extracted from the data is depicted in Table <ref type="table" target="#tab_0">1</ref>. The features used in the model can be divided into five main types: psychological, emotional, Twitter-based, network-based, and language model features.</p><p>Psychological features. These features were extracted using a third-party API developed by Symanto. Each tweet was sent to the API in order to retrieve the personality traits and communication styles obtained from the analysis of the tweet contents. The personality traits value would be either "emotional" or "rational" depending on the analysis of the user's text. 
The value returned by the API (https://symanto-research.github.io/symanto-docs/) when the communication styles are requested is a collection of traits, such as self-revealing, which means sharing one's own experience and opinion; fact-oriented, which implies focusing on factual information, objective observations or statements; information-seeking, that is, posing questions; and action-seeking, aiming to trigger someone's action by giving recommendations, requests or advice.</p><p>Emotional features. In order to retrieve the emotion values from the tweets, we used Russell's circumplex model of affect <ref type="bibr" target="#b14">(Russell, 1980)</ref>. Russell argues that emotions can be conceptualized in a two-dimensional continuous space where the axes correspond to the degree of arousal and valence (or pleasure). These two dimensions form a Cartesian space that can be configured in a circular order in which the different combinations of valence and arousal correspond to one of four discrete emotion regions: tired, tense, excited, and pleased.</p><p>The values for the degree of arousal and valence of the tweets were obtained using an adaptation to Italian of the Affective Norms for English Words (ANEW) <ref type="bibr" target="#b2">(Bradley and Lang, 1999)</ref>. This database was developed from translations of the 1,034 English words present in the ANEW dictionary and from words taken from Italian semantic norms <ref type="bibr" target="#b11">(Montefinese et al., 2014)</ref>.</p><p>Twitter features. Exploring how users behave in the social network could offer some insights into their stance tendency. The collection of Twitter data for each user contained four features: the number of statuses published by the user, the number of users followed by the user, the number of users following the user, and the creation date of the user's Twitter account.</p><p>Network features. 
Using the FRIEND.csv data provided, we built a network consisting of 669,817 nodes (users) and 2,847,197 edges (relationships) in order to represent the follow network of the users. From that network, we extracted a sub-graph containing the users of known stance from the training data and the users involved in testing, in order to calculate the mean distance of each user to the rest of the known-stance users using the following formula:</p><formula xml:id="formula_0">d_T(n) = \frac{1}{|T|} \sum_{i=1}^{|T|} \frac{1}{d^2_{n \to i}}</formula><p>where |T| is the total number of users of a given stance (AGAINST, FAVOR, NONE) and d^2_{n \to i} corresponds to the squared distance (in users) from node n to node i. From this calculation we obtained 3 values per user: mean distance to users against (d_against), mean distance to users in favor (d_favor), and mean distance to neutral users (d_none).</p><p>Language model features. In order to incorporate the language model results into the rest of the features of the system, we chose the best performing model at the development phase among those described in Section 2.1, which was UmBERTo. Since such language models use a very large number of features for learning, the strategy to incorporate the language model without greatly unbalancing the number of features per category consisted of extracting the probabilities assigned by the model to each class for each tweet. In this way, the language model is represented by 3 of the 18 features of the model, a size balanced with respect to the rest of the features.</p></div>
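The network feature above can be sketched in a few lines. This is a minimal illustration of the formula, assuming distances are BFS hop counts in the follow graph and that unreachable users contribute zero (1/∞²); the exact distance computation in the submitted system may differ.

```python
from collections import deque

def mean_inverse_square_distance(adj, node, targets):
    """Compute d_T(n) = (sum_i 1/d(n,i)^2) / |T| over target users T.

    `adj` is an adjacency dict of the follow graph, `node` the user n,
    and `targets` the list of users with a given known stance.
    Distances are BFS hop counts; unreachable targets add 0.
    """
    # Single BFS from `node` gives hop distances to all reachable users.
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = 0.0
    for t in targets:
        if t == node:
            continue  # a user's distance to itself is not a signal
        d = dist.get(t)
        if d:  # reachable and at distance > 0
            total += 1.0 / (d * d)
    return total / len(targets) if targets else 0.0
```

Computing this once per stance class (AGAINST, FAVOR, NONE) yields the three per-user features d_against, d_favor and d_none of Table 1.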
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Task A</head><p>As we use the base version of every transformer model, we can fine-tune them on a single GPU with 12GB of RAM. Hyperparameter tuning (batch size, maximum sequence length, learning rate and number of epochs) was performed on the development set. For mBERT, AlBERTo, Italian BERT XXL and UmBERTo the best configuration was: maximum sequence length 256, batch size 32, learning rate 5e-5, and 5 epochs. For GilBERTo we used the same values except the number of epochs, which was increased to 10. Finally, the best performing hyperparameters for XLM-RoBERTa were the following: maximum sequence length 256, batch size 16, learning rate 2e-5, and 10 epochs.</p><p>While the monolingual models clearly outperformed both mBERT and XLM-RoBERTa on the development data, we decided to submit the three best monolingual runs and the best multilingual one. Table <ref type="table" target="#tab_1">2</ref> reports the official results obtained by each of the models and their position in the ranking of constrained runs for Task A released by the task organizers. Our submission based on Italian BERT XXL was clearly the best of our four runs, although its performance was around 1.5 F1 points lower than the winner system for Task A. Furthermore, the ranking obtained on the test set does not correspond to the results obtained during the development phase, where UmBERTo outperformed the other monolingual models by more than 3 points in F1 score.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Task B</head><p>We presented a total of five models to Task B, which consisted of different combinations of the features listed in Table <ref type="table" target="#tab_0">1</ref>.</p><p>Models 1, 2, and 3. During the training and development phases, several configurations were tested on models 1, 2, and 3, including training with different classifiers, such as the Random Forest, Decision Tree and XGBoost classifiers. The best performing classifier was XGBoost, configured for multi-class classification and taking into account class weights in order to deal with the imbalance present in the data. XGBoost is an efficient and scalable implementation of the gradient boosting framework <ref type="bibr" target="#b5">(Friedman, 2001)</ref>. With regards to the set of features, the first approach to the task considered only psychological, emotion, and Twitter features. For the second model, network features were added to the feature set. Finally, model 3 considered the probabilities of each class (AGAINST, FAVOR, NONE) predicted by the UmBERTo language model as three additional features for training.</p><p>Models 4 and 5. These two models were constructed using voting-based ensemble learning. The voting system for model 4 considered the predictions of models 1, 2, and 3 as well as the predictions of the best performing language models on the development data.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the confusion matrices obtained from the released gold test data for each of the five runs submitted to Task B. As can be noticed, the performance of each model is increasingly better from the first to the fifth, as new features are added to them. The biggest increase, especially with respect to false positives in the AGAINST class, takes place from model 1 to model 2, that is, with the inclusion of network features into the model. 
This indicates that considering contextual information for stance detection, such as the stance of those who are part of the friendship network of the user, can help determine their stance more accurately. Furthermore, we can see that predictions from model 3 also showed a large increase in true positives for each of the classes. This increase is related to the inclusion of the language model features into model 2, which demonstrates the importance of textual data in stance detection tasks. </p></div>
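The paper states that XGBoost was configured with class weights to handle the imbalanced stance labels, without giving the exact scheme. A plausible sketch, assuming the common "balanced" heuristic (weight inversely proportional to class frequency, as in scikit-learn), is to compute per-sample weights that could then be passed to a classifier's `sample_weight` argument:

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Per-sample weights inversely proportional to class frequency.

    Uses the 'balanced' heuristic n / (k * count(c)), so a perfectly
    balanced dataset yields weight 1.0 for every sample, while rare
    classes (e.g. NONE in SardiStance) are up-weighted.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]
```

Under this assumption, training would look like `clf.fit(X, y, sample_weight=balanced_sample_weights(y))`; the actual weighting used by the submitted system is not specified.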
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions and Future Work</head><p>In this paper we have shown the benefits of exploiting information from different and heterogeneous sources. For our participation in the SardiStance 2020 shared task we have experimented with classifiers trained on the textual content of the tweets as well as on features based on social networks. This combination of features has allowed us to obtain the best overall results in the task.</p><p>As future work, we plan to further explore the contribution of network information. In addition, we plan to develop new, more diverse models and study how best to combine them.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Confusion matrices for models 1 to 5 on test data.</figDesc><graphic coords="5,307.28,305.24,221.10,127.59" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Complete set of features extracted from the data.</figDesc><table><row><cell>Category</cell><cell>Feature name</cell><cell>Description</cell></row><row><cell></cell><cell>pers pred</cell><cell>personality prediction</cell></row><row><cell>Psychological features</cell><cell>self pred info pred action pred</cell><cell>self-revealing prediction information-seeking prediction action-seeking prediction</cell></row><row><cell></cell><cell>fact pred</cell><cell>fact-oriented prediction</cell></row><row><cell>Emotion features</cell><cell>arousal valence russell</cell><cell>mean arousal value mean valence value emotion value on Russell's model</cell></row><row><cell></cell><cell>statuses count</cell><cell>number of tweets posted by user</cell></row><row><cell>Twitter</cell><cell>friends count</cell><cell>number of following users</cell></row><row><cell>features</cell><cell>followers count</cell><cell>number of follower users</cell></row><row><cell></cell><cell>created at</cell><cell>account creation date</cell></row><row><cell>Network features</cell><cell>d favor d against d none</cell><cell>mean distance to users in favor mean distance to users against mean distance to neutral users</cell></row><row><cell>Language</cell><cell>p favor</cell><cell>prob. of tweet being in favor</cell></row><row><cell>model</cell><cell>p against</cell><cell>prob. of tweet being against</cell></row><row><cell>features</cell><cell>p none</cell><cell>prob. of tweet being neutral</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Official Results for Task A.</figDesc><table><row><cell>Team</cell><cell>Model</cell><cell cols="5">Rank F1 avg F1 Against F1 Favour F1 None</cell></row><row><cell cols="2">DeepReading Italian BERT XXL</cell><cell>3</cell><cell>66.21</cell><cell>75.80</cell><cell>56.63</cell><cell>42.13</cell></row><row><cell>Ixa</cell><cell>UmBERTo</cell><cell>4</cell><cell>64.73</cell><cell>76.16</cell><cell>53.30</cell><cell>38.88</cell></row><row><cell>Ixa</cell><cell>GilBERTo</cell><cell>6</cell><cell>61.71</cell><cell>75.43</cell><cell>48.00</cell><cell>36.75</cell></row><row><cell>DeepReading</cell><cell>XLM-RoBERTa</cell><cell>8</cell><cell>60.04</cell><cell>69.66</cell><cell>50.42</cell><cell>39.16</cell></row><row><cell>-</cell><cell>baseline</cell><cell cols="2">12-13 57.84</cell><cell>71.58</cell><cell>44.09</cell><cell>27.64</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://ixa2.si.ehu.es/deepreading/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/dbmdz/berts</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://github.com/idb-ita/GilBERTo</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://github.com/google/sentencepiece</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://github.com/musixmatchresearch/umberto</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work has been partially funded by the Spanish Ministry of Science, Innovation and Universities (DeepReading RTI2018-096846-B-C21, MCIU/AEI/FEDER, UE), and DeepText (KK-2020/00088), funded by the Basque Government. Rodrigo Agerri is additionally funded by the RYC-2017-23647 fellowship and acknowledges the donation of a Titan V GPU by the NVIDIA Corporation. Maria S. Espinosa is also funded by the European Social Fund through the Youth Employment Initiative (YEI 2019).</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The language models considered were UmBERTo, GilBERTo, and Italian BERT XXL, described in Section 2.1. The most common predicted value among the 6 systems was chosen as the final prediction of model 4. In case of a tie, the final value was randomly selected.</p><p>On the other hand, model 5 used weighted voting ensemble learning, in which each system was weighted by the F1 value it obtained on the development data. The model therefore chose the final prediction from the weighted predictions of the systems.</p><p>Table <ref type="table">3</ref> shows the official results obtained by each model and their position in the ranking for Task B on Contextual Stance Detection. As can be noted, model 5 ranked first in this task, obtaining an average F1 of 0.7445. Models 3 and 4 also obtained promising results on the official test set, ranking third and fourth, respectively, just 0.0079 below the system which obtained the second best result. Model 2 had a slightly worse performance, ranking seventh out of a total of 13, but still 0.0604 above the baseline. Finally, model 1 had the lowest performance, ranking last in the task.</p></div>
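The voting schemes of models 4 and 5 can be sketched as follows. This is a minimal illustration under the assumptions stated in the text: with unit weights it reduces to the plain majority vote of model 4 (random tie-break), and with dev-set F1 scores as weights it becomes the weighted vote of model 5; the system names are hypothetical.

```python
from collections import defaultdict
import random

def weighted_vote(predictions, weights, rng=None):
    """Combine per-system label predictions by weighted voting.

    `predictions` maps system name -> predicted label; `weights` maps
    system name -> weight (e.g. its F1 on the development data).
    Ties between top-scoring labels are broken at random.
    """
    rng = rng or random.Random(0)
    scores = defaultdict(float)
    for system, label in predictions.items():
        scores[label] += weights.get(system, 1.0)
    best = max(scores.values())
    winners = sorted(label for label, s in scores.items() if s == best)
    return winners[0] if len(winners) == 1 else rng.choice(winners)
```

With six systems voting per tweet, the weighted variant lets a strong language model overrule two weaker contextual classifiers, which is consistent with model 5 outperforming the plain majority vote of model 4.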
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Discussion</head></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Give your text representation models some love: the case for basque</title>
		<author>
			<persName><surname>Agerri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC 2020</title>
				<imprint>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="4781" to="4788" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">EVALITA 2020: Overview of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
		<author>
			<persName><surname>Basile</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EVALITA</title>
		<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2017">2020. 2020. 2020. 2017. 2017</date>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="135" to="146" />
		</imprint>
	</monogr>
	<note>Enriching word vectors with subword information</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Affective norms for english words (anew): Instruction manual and affective ratings</title>
		<author>
			<persName><forename type="first">Margaret</forename><forename type="middle">M</forename><surname>Bradley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Peter</forename><forename type="middle">J</forename><surname>Lang</surname></persName>
		</author>
		<idno>C-1</idno>
		<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
		<respStmt>
			<orgName>University of Florida</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical report</note>
	<note>the center for research in psychophysiology</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">SardiStance@EVALITA2020: Overview of the Task on Stance Detection in Italian Tweets</title>
		<author>
			<persName><surname>Cignarella</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</title>
		<editor>
			<persName><forename type="first">Valerio</forename><surname>Basile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Danilo</forename><surname>Croce</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Maria</forename><surname>Di Maro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Lucia</forename><forename type="middle">C</forename><surname>Passaro</surname></persName>
		</editor>
		<meeting>the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian<address><addrLine>EVALITA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020. 2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">BERT: pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><surname>Conneau</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.02116</idno>
	</analytic>
	<monogr>
		<title level="m">NAACL-HLT 2019</title>
				<imprint>
			<date type="published" when="2019">2019. 2019. 2019. 2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
	<note>Unsupervised cross-lingual representation learning at scale</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Greedy function approximation: a gradient boosting machine</title>
		<author>
			<persName><forename type="first">Jerome</forename><forename type="middle">H</forename><surname>Friedman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Annals of statistics</title>
		<imprint>
			<biblScope unit="page" from="1189" to="1232" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Crosslingual ability of multilingual bert: An empirical study</title>
		<author>
			<persName><surname>Karthikeyan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations (ICLR)</title>
				<imprint>
			<date type="published" when="2020">2020. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Stance detection: A survey</title>
		<author>
			<persName><forename type="first">Dilek</forename><surname>Küçük</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fazli</forename><surname>Can</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="37" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Yinhan</forename><surname>Liu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.11692</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Distributed representations of words and phrases and their compositionality</title>
		<author>
			<persName><forename type="first">Tomas</forename><surname>Mikolov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="3111" to="3119" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">SemEval-2016 task 6: Detecting stance in tweets</title>
		<author>
			<persName><forename type="first">Saif</forename><surname>Mohammad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SemEval</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="31" to="41" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">The adaptation of the Affective Norms for English Words (ANEW) for Italian</title>
		<author>
			<persName><forename type="first">Maria</forename><surname>Montefinese</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Behavior research methods</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="887" to="903" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Asynchronous pipelines for processing huge corpora on medium to low resource infrastructures</title>
		<author>
			<persName><forename type="first">Pedro</forename><forename type="middle">Javier</forename><surname>Ortiz Suárez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019</title>
				<meeting>the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019<address><addrLine>Cardiff</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019-07">July 2019</date>
			<biblScope unit="page" from="9" to="16" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets</title>
		<author>
			<persName><forename type="first">Marco</forename><surname>Polignano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</title>
				<meeting>the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2481</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A circumplex model of affect</title>
		<author>
			<persName><forename type="first">James</forename><forename type="middle">A</forename><surname>Russell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of personality and social psychology</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">1161</biblScope>
			<date type="published" when="1980">1980</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">Ashish</forename><surname>Vaswani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in neural information processing systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
