UNITOR @ Sardistance2020: Combining Transformer-based Architectures and Transfer Learning for Robust Stance Detection

Simone Giorgioni, Marcello Politi, Samir Salman, Danilo Croce and Roberto Basili
Department of Enterprise Engineering, University of Roma, Tor Vergata
Via del Politecnico 1, 00133 Roma, Italy
{simone.giorgioni,marcello.politi,samir.salman}@alumni.uniroma2.eu
{croce,basili}@info.uniroma2.it

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

English. This paper describes the UNITOR system that participated in the Stance Detection in Italian Tweets (SardiStance) task within the context of EVALITA 2020. UNITOR implements a Transformer-based architecture whose accuracy is improved by adopting a Transfer Learning technique. In particular, this work investigates the possible contribution of three auxiliary tasks related to Stance Detection, i.e., Sentiment Detection, Hate Speech Detection and Irony Detection. Moreover, UNITOR relies on an additional dataset automatically downloaded and labeled through distant supervision. The UNITOR system ranked first in Task A within the competition. This confirms the effectiveness of Transformer-based architectures and the beneficial impact of the adopted strategies.

Italiano. Questo lavoro descrive UNITOR, uno dei sistemi partecipanti allo Stance Detection in Italian Tweets (SardiStance) task. UNITOR implementa un'architettura neurale basata su Transformer, la cui accuratezza viene migliorata applicando un metodo di Transfer Learning, che sfrutta le informazioni di tre task ausiliari, ovvero Sentiment Detection, Hate Speech Detection e Irony Detection. Inoltre, l'addestramento di UNITOR può contare su un insieme di dati scaricati ed etichettati automaticamente applicando un semplice metodo di Distant Supervision. Il sistema si è classificato al primo posto nella competizione, confermando l'efficacia delle architetture basate su Transformer e il contributo delle strategie adottate.

1 Introduction

Stance detection aims at detecting whether the author of a text is in favor of a target topic or against it (Krejzl et al., 2017). In this task, a text pair is generally considered: one text expresses the topic, while the other one reflects the author's judgments. In a possible variant of such a setting, the topic is implicit within an entire document collection over which stance detection is applied.

In this work, we consider this last setting, as defined in the Stance Detection in Italian Tweets (SardiStance) task (Cignarella et al., 2020) within EVALITA 2020 (Basile et al., 2020). A set of texts (here tweets) is provided, almost all concerning the same topic, i.e., the Sardines Movement (https://en.wikipedia.org/wiki/Sardines_movement). The goal is to recognize whether each tweet is for or against (or neither) such a target, only exploiting textual information. According to the task definition, this corresponds to the so-called Task A. This is a quite challenging problem, since it requires at the same time discovering whether a text refers to the target topic and what the author's orientation is, only relying on short messages written in a very conversational style.

We thus present the UNITOR system participating in SardiStance Task A. The system is based on a Transformer-based architecture for text classification (Devlin et al., 2019) that is pre-trained over a large-scale document collection written in Italian, namely UmBERTo. In a nutshell, the adopted architecture, which has been demonstrated to achieve state-of-the-art results in many NLP tasks (Devlin et al., 2019), takes in input a message and associates it to one of the target classes indicating the stance. Moreover, due to the task complexity and the small size of the dataset, in order to improve the generalization capabilities of the neural network, we adopted a Transfer Learning approach (Pan and Yang, 2010). Our main assumption is that Stance Detection is tied to other tasks involving emotion and subjectivity analysis (such as Sentiment Analysis or Irony Detection), even though important differences do exist among them. As a simplified example, let us consider a message such as "I like the Sardines Movement": it clearly expresses a positive sentiment, while also being in favour of the target topic. However, a message such as "I like the EVALITA campaign." is positive as well, but it does not express any support or opposition to the Sardines (and it should be associated to the None class). We thus speculate that an automatic system trained over an auxiliary task (e.g., Sentiment Classification) is beneficial, but the transfer process must be carefully designed in order to avoid catastrophic forgetting or interference problems (Mccloskey and Cohen, 1989).

In this work, we investigate the possible contribution of three auxiliary tasks involving the recognition of emotions according to different settings, i.e., Sentiment Detection and Classification, Hate Speech Detection and Irony Detection. We adopt three different classifiers (one for each auxiliary task) and use them to add information to the tweets provided in the SardiStance dataset. As an example, when considering the auxiliary task involving Hate Detection, the corresponding classifier will augment each input tweet by expressing whether the tweet conveys hate or not. After this step, the final classifier is expected to learn the association between messages and the stance categories, "being aware" (with some unavoidable noise) of whether the message expresses some sort of hate, irony and, more generally, sentiment. Finally, we investigate the possibility of augmenting the training material by automatically downloading messages and labeling them through distant supervision (Go et al., 2009). We first selected a few hashtags clearly in favour (or not) of the target topic to download and label a set of messages. Then, in order to add a set of neutral messages, we selected a set of news titles concerning the Sardines Movement.

The UNITOR system ranked first in the competition, suggesting that the combination of Transformer-based learning with the adopted strategies of Transfer Learning and Data Augmentation is beneficial. In the rest of the paper, Sec. 2 describes UNITOR. In Sec. 3, the evaluations are reported, while Sec. 4 derives the conclusions.

2 Transformer-based architectures and Transfer Learning for Stance Detection

The UNITOR system implements a Transformer-based architecture described in Section 2.1. The adopted auxiliary tasks are described in Section 2.2, while our Transfer Learning strategy is in Section 2.3. Finally, an automatic strategy for Data Augmentation is presented in Section 2.4.

2.1 UNITOR as a Transformer-based Architecture

The approach proposed in (Devlin et al., 2019), namely Bidirectional Encoder Representations from Transformers (BERT), provides a very effective model to pre-train a deep and complex neural network over large-scale collections of non-annotated texts and to apply it to a large variety of NLP tasks. The building block of BERT is the Transformer element (Vaswani et al., 2017), an attention-based mechanism that learns contextual relations between words in a text. BERT provides a sentence embedding (as well as the contextualized lexical embeddings of the words in the sentence) through a pre-training stage aiming at the acquisition of an expressive and robust language and text model. The Transformer reads the entire input sequence of words at once and is optimized through two pre-training tasks. The first pre-training objective is masked language modeling (Devlin et al., 2019). In addition, a Next Sentence Prediction task is used to jointly pre-train text embeddings able to soundly represent discourse-level information. This last objective operates on text-pair representations and aims at modeling relational information, e.g., between consecutive sentences in a text. On top of the produced embeddings, BERT applies a fine-tuning stage devoted to adapting the entire architecture to the targeted task.

The fine-tuning process of BERT for sentence classification (here adopted) operates on single texts or text pairs, which can be given in input to BERT in analogy with the next sentence prediction task. The special token [CLS] is used as the first element of each input sequence, and the embedding produced by BERT is used in input to a linear classifier customized for the target classification task. While the BERT architecture is pre-trained on large-scale corpora, its application to new tasks is generally obtained by customizing the final classifier to the targeted problem and fine-tuning all the network parameters for a few epochs, to avoid catastrophic forgetting. In (Liu et al., 2019b), RoBERTa is proposed as a variant of BERT which modifies some key hyperparameters, including removing the next-sentence pre-training objective, and which is trained on more data, with much larger mini-batches and learning rates. This allows RoBERTa to improve on the masked language modeling objective compared with BERT and leads to better downstream task performances.
UNITOR is based on a RoBERTa architecture pre-trained over Italian texts: we adopted UmBERTo (https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1), which is pre-trained over a subset of the OSCAR corpus, made of 11 billion tokens. These architectures achieved state-of-the-art results in a wide range of NLP tasks. However, they also rely on large-scale annotated datasets composed of (possibly hundreds of) thousands of examples. In order to improve the quality of this architecture in the SardiStance task, which offers a quite limited dataset, we adopted a simple Transfer Learning strategy relying on the following three auxiliary tasks.

2.2 Supporting UNITOR through Auxiliary Tasks

In this work, we speculate that the complexity of the Stance Detection task can be simplified whenever the system to be trained is already aware of whether input messages express some sort of Sentiment, Hate or Irony. In order to derive this information, we trained specific classifiers over dedicated corpora made available in previous editions of EVALITA, as follows.

Sentiment Detection and Classification. This task consists in the automatic detection of subjectivity (and of the eventual positive or negative polarity) in texts (Pang and Lee, 2008). Even though Stance Detection is clearly different from a traditional Sentiment Analysis task, we speculate that they are nevertheless related. As an example, we can suppose that the presence of stance is more probable in messages expressing subjectivity. We thus considered the setting proposed in SENTIPOLC 2016 (Barbieri et al., 2016), where a dataset of 8,000 tweets is made available. For each message, the presence of subjectivity is made explicit and, eventually, the positive or negative polarity. The labeling provided in the dataset was slightly modified and mapped to a classification problem over three classes: all objective tweets were labeled with a dedicated objective tag, the subjective and positive messages with a positive label, and the negative ones with a negative label. (We discarded the few available messages with mixed polarity, to simplify the final classification task.)

Irony Detection. We speculate that a robust detection of stance requires the recognition of irony, which can even reverse the output of the classification task. For example, a false stance can be expressed through an ironic message, such as "Le Sardine sono il futuro passato dell'Italia" (in English: "Sardines are the future past of Italy"). The objective of Irony Detection is to detect whether a given message is ironic or not. We used the dataset provided in IronITA 2018 (Cignarella et al., 2018), where a dataset of 4,800 labeled messages is made available. We adopted the original binary classification task, mapping each message to an ironic or non-ironic label.

Hate Speech Detection. Being against a topic can often be expressed through messages also conveying hate. We thus also introduce the Hate Speech Detection task, which involves the automatic recognition of hateful contents. We considered the setting proposed in HaSpeeDe 2018 (Bosco et al., 2018), where a dataset of 3,000 messages is made available. We adopted the original binary classification task, mapping each message to a hateful or non-hateful label.
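The three-class mapping of the SENTIPOLC annotations described above can be rendered as a minimal sketch. This is an illustration, not the authors' code: the field names follow the SENTIPOLC annotation scheme (subj, opos, oneg), the concrete label strings are our assumption, and the handling of subjective tweets carrying no overall polarity is not specified in the paper (here they are simply discarded, like mixed-polarity ones).

```python
# Hypothetical sketch of the SENTIPOLC-to-three-classes mapping (Section 2.2).
# subj/opos/oneg are the SENTIPOLC subjectivity and overall-polarity flags;
# the label strings and the treatment of polarity-less subjective tweets are
# our assumptions, not taken from the paper.

def map_sentipolc(subj, opos, oneg):
    if not subj:
        return "oggettivo"   # objective tweets: dedicated tag
    if opos and oneg:
        return None          # mixed polarity: discarded (as stated in the paper)
    if opos:
        return "positivo"
    if oneg:
        return "negativo"
    return None              # subjective but no polarity: discarded (assumption)

assert map_sentipolc(0, 0, 0) == "oggettivo"
assert map_sentipolc(1, 1, 0) == "positivo"
assert map_sentipolc(1, 0, 1) == "negativo"
assert map_sentipolc(1, 1, 1) is None
```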
2.3 Transferring auxiliary tasks in the Transformer-based learning

In order to transfer the information from each auxiliary task into UNITOR, we first trained a specific UmBERTo-based sentence classifier on each of the datasets described in the previous section. In each case, the standard parameters proposed in (Devlin et al., 2019) are used to fine-tune the model (the number of epochs was tuned over a development set made of 10% of the corresponding dataset, and the best epoch was selected by maximizing the classification accuracy). After these three training steps, the entire SardiStance dataset is processed by each of the three classifiers and the resulting labels are used to "augment" the input messages. In particular, these labels generate a sort of new sentence, which is paired with the corresponding message. The following example shows how a tweet against the movement is given in input to UNITOR:

"[CLS] negativo ironico odio [SEP] #elezioniregionali Le Sardine aiuteranno a salvare il Paese! #mafammilpiacere Sono proprio dei bei perdigiorno falliti! [SEP]"

(in English: "#regionalelections The Sardines will help to save the country! #please They're just a bunch of losers!")

Consistently with (Devlin et al., 2019), the first pseudo-token [CLS] is added to generate the embedding used in input to the final linear classifier. Then, the pseudo-sentence "negativo ironico odio" suggests that the message expresses negative polarity and hate through the adoption of irony. Finally, between the [SEP] pseudo-tokens, the original message is reported. This particular schema resembles the classification of text pairs used in relational learning tasks, such as Textual Entailment (Devlin et al., 2019). The output of the auxiliary classifiers defines a sort of hypothesis, i.e., the author aims at expressing a negative sentiment through an ironic message which also expresses hate, while the original message is the direct consequence, i.e., the "implied" message. (We investigated different ways to encode this information, even using complex sentences, but negligible differences were measured in the tuning process, so we applied the simplest schema.) The UNITOR model is thus an UmBERTo-based classifier trained over text pairs, where the first element encodes the information derived from the auxiliary tasks and the second one is the original message. Even though this labeling process can introduce noise (due to incorrectly classified messages), the augmented input is expected to simplify the final training process, by explicitly providing information about sentiment, hate and irony.
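The input-augmentation schema of Section 2.3 can be sketched as follows. The function name is ours, and in a real pipeline the [CLS]/[SEP] tokens would typically be added by the tokenizer when encoding the two texts as a pair; the sketch only shows how the auxiliary labels form the pseudo-sentence paired with the tweet.

```python
# Sketch of the input-augmentation schema of Section 2.3 (names are ours).
# The three auxiliary classifiers produce one label each; the labels form a
# pseudo-sentence that is paired with the original tweet, BERT-style.

def build_augmented_input(tweet, sentiment, irony, hate):
    """Compose '[CLS] <auxiliary labels> [SEP] <tweet> [SEP]'.

    sentiment/irony/hate are the (possibly noisy) labels predicted by the
    auxiliary classifiers, e.g. "negativo", "ironico", "odio".
    """
    pseudo_sentence = " ".join([sentiment, irony, hate])
    return f"[CLS] {pseudo_sentence} [SEP] {tweet} [SEP]"

example = build_augmented_input(
    "#elezioniregionali Le Sardine aiuteranno a salvare il Paese!",
    sentiment="negativo", irony="ironico", hate="odio")
assert example.startswith("[CLS] negativo ironico odio [SEP] ")
```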
2.4 Distant Supervision for Stance Detection

In order to compensate for the limited amount of available data (especially considering the complexity of the task), we augmented the training material by labeling additional messages via Distant Supervision (Go et al., 2009). We speculate that a tweet containing a hashtag such as #vivalesardine (in English: #ILikeSardines) is in favour of the Sardines, while a tweet containing, for example, #sardinefritte (in English: #friedSardines) is against our target. Hence, we downloaded 3,200 tweets from the TWITA corpus (Basile and Nissim, 2013) and labeled them via Distant Supervision. In particular, the following subsets were derived: 1,500 tweets against the movement, as they contain #gatticonsalvini, and 1,000 tweets in favour of it, as they contain #nessunotocchilesardine, #iostoconlesardine, #unmaredisardine, #vivalesardine or #forzasardine. Finally, to enlarge the subset of messages without stance, 700 neutral statements were downloaded; these are actually news titles, derived by querying "sardine" in Google News. In the experimental evaluations discussed in the next section, this dataset of "silver" data is simply added to the training material. To avoid over-fitting, we removed 90% of the occurrences of the hashtags used as queries from the new data.
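The distant-supervision labeling of Section 2.4 amounts to a simple hashtag lookup. In the sketch below the hashtag lists are the ones reported in the paper, while the matching logic (plain whitespace tokenization, precedence when both lists match, unlabeled otherwise) is our simplification:

```python
# Sketch of the distant-supervision labeling of Section 2.4. The hashtag lists
# are taken from the paper; the matching logic is our simplification.

FAVOUR_TAGS = {"#nessunotocchilesardine", "#iostoconlesardine",
               "#unmaredisardine", "#vivalesardine", "#forzasardine"}
AGAINST_TAGS = {"#gatticonsalvini"}

def distant_label(tweet):
    tokens = {t.lower() for t in tweet.split()}
    if tokens & AGAINST_TAGS:
        return "Against"
    if tokens & FAVOUR_TAGS:
        return "Favour"
    return None  # no stance-bearing hashtag: not used as silver data

assert distant_label("Che piazza! #iostoconlesardine") == "Favour"
assert distant_label("#gatticonsalvini sempre") == "Against"
assert distant_label("Oggi piove") is None
```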
3 Results and Discussion

UNITOR participated in Task A - Textual Stance Detection (Cignarella et al., 2020), where the available dataset is composed of 2,132 tweets concerning the Sardines Movement: 1,028 tweets are against the movement (label Against), 589 tweets are in favour of it (label Favour) and 515 tweets do not express any stance about the target topic (label None).

As discussed in Section 2, UNITOR is based on the UmBERTo pre-trained model, which relies on the RoBERTa architecture. For parameter tuning, we adopted a 10-fold cross-validation, so that the training material is divided into 10 folds, each split according to a 90%-10% proportion. The model is trained using a standard Cross-entropy Loss and an Adam optimizer initialized with a learning rate set to 2·10^-5 and linearly decreased during the training process. We trained the model for 5 epochs, using a batch size of 32 elements. At test time, an Ensemble of such classifiers is used: each message is in fact classified using all 10 models trained on the different folds, and the label suggested by the highest number of classifiers is selected. In Task A, we submitted two constrained runs, i.e., systems considering only tweets from the competition, and two unconstrained ones, where additional tweets were acquired and labeled by applying the approach presented in Section 2.4. All models are implemented using PyTorch (https://pytorch.org/) and experiments were run on Google Colab (http://colab.research.google.com/).

Results are reported in Table 1 in terms of Precision, Recall and F1 scores obtained by the different models with respect to each label. The final rank considers the average F1 (F1-avg) between the Favour and Against classes.
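The test-time majority voting over the 10 fold models can be sketched as follows (a minimal illustration; tie-breaking is not specified in the paper, and here it simply follows Counter's ordering):

```python
# Sketch of the 10-model ensemble used at test time (Section 3): the label
# predicted by the highest number of fold models is selected. Tie-breaking is
# not specified in the paper; Counter.most_common keeps first-seen order.
from collections import Counter

def ensemble_vote(predictions):
    """predictions: the labels assigned to one tweet by the 10 fold models."""
    return Counter(predictions).most_common(1)[0][0]

votes = ["Against"] * 6 + ["None"] * 3 + ["Favour"]
assert ensemble_vote(votes) == "Against"
```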
Rk  System        F1-avg   F1-Against  F1-Favour  F1-None   Rec-Against  Rec-Favour  Rec-None   Prec-Against  Prec-Favour  Prec-None
1   UNITOR_u_1    68.53%   78.66%      58.40%     39.10%    76.01%       57.65%      45.35%     81.50%        59.16%       34.36%
2   UNITOR_c_1    68.01%   78.81%      57.21%     39.79%    74.66%       63.78%      43.60%     83.43%        51.87%       36.59%
3   UNITOR_c_2    67.93%   79.39%      56.47%     36.72%    77.09%       61.22%      37.79%     81.83%        52.40%       35.71%
4   Opponent_c_1  66.21%   75.80%      56.63%     42.13%    68.60%       64.29%      52.91%     84.69%        50.60%       35.00%
5   UNITOR_u_2    66.06%   76.89%      55.22%     37.02%    72.64%       56.63%      44.77%     81.67%        53.88%       31.56%
6   UmBERTo       65.69%   77.41%      53.97%     35.93%    74.12%       57.14%      40.11%     81.00%        51.14%       32.54%
13  Baseline      57.84%   71.58%      44.09%     27.64%    68.06%       49.49%      29.65%     75.49%        39.75%       25.89%

Table 1: Results obtained by UNITOR at the SardiStance task. In the system names, "c" and "u" refer to constrained and unconstrained runs.

First of all, the high complexity of this task is confirmed by the results obtained by the strong Baseline method (the last row). It is a Support Vector Machine trained over a simple Bag-of-Words model (Cignarella et al., 2020) and achieves an average F1 of 57.84%, being competitive with many systems participating in the task and ranking 13th over 22 submissions. One important result is obtained by the straight application of the UmBERTo model over the original messages (next-to-last row in Table 1). In fact, this Transformer-based architecture, empowered with the Ensemble technique, achieves an average F1 of 65.69%: a system directly applying an Ensemble of UmBERTo-based models would have ranked 6th in the competition. We thus trained UmBERTo by adopting the Transfer Learning approach presented in Section 2.3 in the constrained setting.
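As a worked check of the ranking measure, the per-class F1 can be recomputed from the precision and recall reported in Table 1; for the winning UNITOR_u_1 run, the average of the Against and Favour F1 reproduces the official 68.53%:

```python
# Recomputing the ranking measure from Table 1 (UNITOR_u_1 row): F1-avg is
# the mean of the per-class F1 of Against and Favour (None is excluded).

def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

f1_against = f1(81.50, 76.01)   # reported as 78.66%
f1_favour = f1(59.16, 57.65)    # reported as 58.40%
f1_avg = (f1_against + f1_favour) / 2
assert round(f1_avg, 2) == 68.53
```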
The adoption of all three auxiliary tasks led to the constrained submission called UNITOR_c_2. Moreover, we considered training UmBERTo with one auxiliary task at a time. When considering only the Hate Speech Detection task, better results were obtained over the development set than with the adoption of the other tasks taken individually, i.e., Sentiment Detection and Irony Detection. (The results of this tuning stage are not reported here for lack of space.) This variant, called UNITOR_c_1, considers tweets enriched only with the information derived by the hate classifier, and it generally shows higher precision with respect to the Against class. This suggests that a tweet expressing hate is more likely in opposition to the Sardines Movement. These constrained models, UNITOR_c_2 and UNITOR_c_1, ranked 3rd and 2nd in the competition, respectively. These results are impressive, as both outperformed the standard UmBERTo by about 2% of absolute F1. Moreover, they confirm the beneficial impact of Hate Speech Detection as an auxiliary task.

Finally, we augmented the training dataset by using the additional data presented in Section 2.4. We extended the training material used to train UNITOR_c_2 in order to obtain the unconstrained submission called UNITOR_u_2; it is worth noticing that all three auxiliary tasks were used in this submission. This led to a performance drop, i.e., a 66.06% average F1, which is lower than that of the best opponent system, which achieved a 66.21% F1. It seems that the noise added both by the auxiliary tasks and by the additional data negatively impacted the overall quality. On the contrary, when only the Hate Speech Detection task is considered (i.e., UNITOR_u_1), the additional data are positively capitalized on by the model, achieving the best average F1 score in the competition, i.e., 68.53%. These results suggest that the combination of Transformer-based learning with the adopted strategies of Transfer Learning and Data Augmentation is highly beneficial when only Hate is considered.

From an error analysis, it seems that a significant number of incorrect classifications occurred in longer and more complex messages, where the topic of the stance is not clearly explicit nor captured by the UmBERTo model, such as in "#carfagna: 'io per i liberali che non si affidano a Salvini' e 'dalle sardine buone idee'. Auto-scacco in due mosse. Con la Polverini poi..." (in English: "#carfagna: 'come with me liberals who do not rely on Salvini' and 'from the Sardines movement good ideas'. She messed herself up in two moves. Not to mention Polverini..."). This message is considered to be Against, while the system assigns the label None. Here, it is very challenging to understand the connection between the "good ideas of the sardines" and the very colloquial expression "Auto-scacco", which can be translated as "She messed herself up". The same appears in the tweet "Ho finalmente capito chi mi ricordava Mattia Santori, quello delle sardine: Lodo Guenzi. (e infatti in quanto a democristianità stiamo là)" (in English: "I finally understood who reminded me of Mattia Santori, the one with the Sardines movement: Lodo Guenzi. (in fact, as far as Christian Democrats are concerned, they are pretty much the same)"), which again is labeled Against but classified as None. Clearly the system is not able to link the movement to its leader, nor to the negative opinion about belonging to the Christian Democrat Party. Another example is the tweet "Dopo avere ascoltato @luigidimaio mi viene in mente una sola parola: grazie. Fiducia nelle sue scelte e immenso rispetto per i grandi risultati ottenuti. Ora un nuovo inizio, con un nuovo entusiasmo. Andiamo verso gli #statigenerali con serietà e maturità. Forza @mov5stelle!" (in English: "After listening to @luigidimaio only one word comes to my mind: thank you. Trust in his choices and immense respect for the great results obtained. Now a new start, with new enthusiasm. Let's move towards the #statigenerali with seriousness and maturity. Forza @mov5stelle!"). Here the system incorrectly assigns the Favour label, because the tweet is in favour of a different movement.
4 Conclusion

In this work we presented the results obtained by the UNITOR system, which participated in the SardiStance task. UNITOR ranked first in Task A, both for the constrained and the unconstrained runs. These results confirm the beneficial impact of Transformer-based architectures for text classification also in the Stance Detection task. Moreover, we demonstrated the beneficial impact of Hate Speech Detection as an auxiliary task in a Transfer Learning setting. Finally, we empirically showed that the adoption of Distant Supervision is useful to reduce data sparseness. Future work will apply the above approaches to Task B within SardiStance. Moreover, we will investigate multi-task learning approaches (Liu et al., 2019a) to capitalize on information from the auxiliary tasks in a more principled way.

References

Francesco Barbieri, Valerio Basile, Danilo Croce, Malvina Nissim, Nicole Novielli, and Viviana Patti. 2016. Overview of the EVALITA 2016 sentiment polarity classification task. In Proceedings of EVALITA 2016, Napoli, Italy, December 5-7, 2016, volume 1749 of CEUR Workshop Proceedings.

Valerio Basile and Malvina Nissim. 2013. Sentiment analysis on Italian tweets. In Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 100-107, Atlanta.

Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro. 2020. EVALITA 2020: Overview of the 7th evaluation campaign of natural language processing and speech tools for Italian. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020). CEUR-WS.org.

Cristina Bosco, Felice Dell'Orletta, Fabio Poletto, M. Sanguinetti, and M. Tesconi. 2018. Overview of the EVALITA 2018 hate speech detection task. In EVALITA@CLiC-it.

Alessandra Teresa Cignarella, Simona Frenda, Valerio Basile, Cristina Bosco, Viviana Patti, Paolo Rosso, et al. 2018. Overview of the EVALITA 2018 task on irony detection in Italian tweets (IronITA). In Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2018), volume 2263, pages 1-6.

Alessandra Teresa Cignarella, Mirko Lai, Cristina Bosco, Viviana Patti, and Paolo Rosso. 2020. SardiStance@EVALITA2020: Overview of the task on stance detection in Italian tweets. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). CEUR-WS.org.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL 2019, pages 4171-4186, Minneapolis, Minnesota, June.

Alec Go, Richa Bhayani, and Lei Huang. 2009. Twitter sentiment classification using distant supervision. Technical report.

Peter Krejzl, Barbora Hourová, and Josef Steinberger. 2017. Stance detection in online discussions.

Xiaodong Liu, Pengcheng He, Weizhu Chen, and Jianfeng Gao. 2019a. Multi-task deep neural networks for natural language understanding. In Proceedings of ACL, pages 4487-4496, Florence, Italy, July.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019b. RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.

Michael Mccloskey and Neil J. Cohen. 1989. Catastrophic interference in connectionist networks: The sequential learning problem. The Psychology of Learning and Motivation, 24:104-169.

S.J. Pan and Q. Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345-1359.

Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30, pages 5998-6008.