<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Emotion Hunters at EMit: Categorical Emotion Detection combining BERT and ChatGPT models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gianluca Calò</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Massafra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Berardina De Carolis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Corrado Loglisci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Emotion detection in text plays a crucial role in various applications, such as customer feedback analysis, social media monitoring, and the analysis of the verbal part of human communication. Deep learning techniques have shown promising results in accurately recognizing and classifying emotions in textual data. This paper describes the approach of the Emotion Hunters team to categorical emotion detection. After a pre-processing phase, a model fine-tuned from AlBERTo, together with the ChatGPT APIs, was used to address the challenge. The results show that our approach performed better on the out-of-domain test set than on the in-domain one, thus showing a good generalization capability.</p>
      </abstract>
      <kwd-group>
        <kwd>Emotion Detection</kwd>
        <kwd>BERT</kwd>
        <kwd>ChatGPT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Emotion detection in texts has gained significant importance in recent years due to the pervasive presence of digital communication platforms and the wealth of user-generated content. The ability of software to understand and analyze the emotions expressed in a text has numerous applications across various domains, including marketing, customer service, mental health, and the social sciences. Emotion detection involves the use of Natural Language Processing (NLP) techniques and machine learning algorithms to automatically identify and classify the emotions conveyed in textual content. By accurately detecting and interpreting emotions, we can gain a deeper understanding of human experiences, opinions, and attitudes.</p>
      <p>Emotion detection for the Italian language presents distinctive features, ranging from morphological to lexical viewpoints. Italian has many words formed of particles in two or three units (e.g., verbal groups), which are difficult to label. Words used to express the same idea can belong to different grammatical categories (they can be nouns or verbs), and they can be associated with general or specific semantic categories. Italian is also a very rich language, with many words holding more than one meaning, which may mislead an automatic emotion detector. Moreover, while many linguistic resources and annotated texts have been generated for wide-coverage languages, such as English, Chinese, and Arabic, the same cannot be said for other less-resourced Indo-European languages, such as Italian.</p>
      <p>
        The EMit (Emotions in Italian) Subtask A [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] at EVALITA 2023 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] aims at detecting emotions in social media messages about TV shows, music videos, and advertisements. Given a message, a system has to decide which emotions are expressed in it, or whether the message is neutral. According to the annotation of the dataset, the problem to address is a multilabel classification one: given a message, the system must classify it and return all the labels denoting the emotions it contains. In particular, in Subtask A, a message can be classified as neutral or as expressing one or more emotions from the following set of 10 labels: anger, anticipation, disgust, fear, joy, sadness, surprise, trust, neutral [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], plus the additional label love.
      </p>
      <p>
        Our team, the EmotionHunters, addressed this challenge with a two-step model. After a pre-processing phase, we fine-tuned a model based on AlBERTo [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and tested it on the validation set, where its performance exceeded the baseline by about 16%, reaching a weighted F1-score of 0.56. However, when we ran the model on the provided in-domain test set, we noticed that in some cases it made no prediction, and that neutral predictions accounted for a high share of the results. Since ChatGPT had become very popular after the beginning of the call for this challenge, for these two cases we integrated the ChatGPT 3.5 APIs [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which improved the predictions of the model by 1%. The proposed model was tested on the two test sets provided by the challenge: an in-domain set, including tweets of the same textual genre and subjects as the training set, and an out-of-domain set, including social data of different genres and subjects. Our approach showed a better performance on the out-of-domain test set, showing that it is able to generalize with respect to the topic.
      </p>
      <p>EVALITA 2023: 8th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, Sep 7 – 8, Parma, IT. g.calo26@studenti.uniba.it (G. Calò); f.massafra.7@studenti.uniba.it (F. Massafra); berardina.decarolis@uniba.it (B. De Carolis); corrado.loglisci@uniba.it (C. Loglisci). © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.</p>
      <p>This is for us an interesting result, because we want to apply the model in different domains, such as conversational experiences with intelligent agents. Within the challenge deadline, we did not have the time to train a model based only on LLMs like ChatGPT; this is part of our future work.</p>
      <sec id="sec-1-1">
        <title>2. Description of the system</title>
        <p>In the following, we first describe the pipeline of text pre-processing and then provide details on the classification algorithms used and configured for the task at hand.</p>
        <sec id="sec-1-1-1">
          <title>2.1. Analysis of the Dataset</title>
          <p>The provided training dataset consists of a collection of 5966 labeled tweets, each identified with one or more labels related to the annotated emotions (among the 10 emotions mentioned above) (see Figure 2). The class distribution is not homogeneous: the trust and neutral classes are predominant, while the fear class is the least frequent (see Figure 1).</p>
          <p>
            To augment the number of sentences of the fear class, we integrated the training dataset with sentences taken from the MultiEmotions-It dataset [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]; moreover, using the affective dictionary proposed in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], we replaced affective terms with synonyms. In this way, the fear class was upsampled to 400 sentences.
          </p>
        </sec>
        <sec id="sec-1-1-2">
          <title>2.2. Pre-processing</title>
          <p>
            A pipeline of preliminary operations is performed in order to first clean and standardize the input messages and then prepare them in a format suitable for the selected learning algorithms [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]. More precisely, we remove the symbols used in social media communication (emoticons, URLs, links, mentions) without discarding their content; instead, we assign them to semantic categories denoted as tags. For instance, mentions such as "@NickName" are converted into tokens with the uniform and generic tag &lt;mention&gt;. Each emoticon is converted into the textual description of its meaning, taken from a predefined collection of emoticon-description pairs we made for the purpose of this work. For instance, the grinning-face-with-big-eyes emoticon would be converted into the description &lt;faccina con un gran sorriso e occhi spalancati&gt; ("face with a big smile and wide-open eyes", in Italian). Also, hashtagged words are split into single tokens, which are then enclosed between the open and close tags &lt;hashtag&gt; and &lt;/hashtag&gt;. For instance, the hashtag "#Sanremo2020" would be converted into the tagged sequence &lt;hashtag&gt; Sanremo 2020 &lt;/hashtag&gt;, composed of two single tokens. The rationale behind these operations is to make the tokens and symbols expressing emotions explicit and, jointly, to augment the features describing the original text, so that the learning process can work on multiple sources of information and better capture the emotive content.
          </p>
          <p>Next, we perform a conversion operation to represent the pre-processed tweets in the input format of the selected learning algorithms, that is, BERT models and variants (as explained in the following). All the tokens produced for the pre-processed tweets were indexed and used to create a dictionary for the input vectors of the learning process. Considering the typical length of a tweet, usually much shorter than the maximum admissible input size, we chose an input length of 128 (elements of the vectors) and prepared an attention mask to decrease the importance of the elements inserted into the input data as padding. Each input vector is in binary code, and each element represents the presence/absence of the corresponding indexed token.</p>
        </sec>
      </sec>
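        <p>The tagging and vectorization steps described above can be sketched in Python as follows. This is a minimal illustration, not the team's actual code: the emoticon dictionary holds only two sample entries, the helper names are ours, and square-bracket tags stand in for the angle-bracket tags of the paper.</p>
        <preformat>
```python
import re

# Illustrative emoticon-to-description pairs; the paper's collection is
# larger and uses Italian descriptions of each emoticon's meaning.
EMOTICON_DESCRIPTIONS = {
    ":D": "faccina con un gran sorriso e occhi spalancati",
    ":(": "faccina triste",
}

def preprocess(tweet):
    """Clean a tweet, turning social-media symbols into semantic tags."""
    # Mentions like "@NickName" become one uniform, generic tag.
    tweet = re.sub(r"@\w+", "[mention]", tweet)

    # Hashtags are split into single tokens between open/close tags,
    # e.g. "#Sanremo2020" becomes "[hashtag] Sanremo 2020 [/hashtag]".
    def split_hashtag(match):
        parts = re.findall(r"[A-Z]?[a-z]+|\d+|[A-Z]+(?![a-z])", match.group(1))
        return "[hashtag] " + " ".join(parts) + " [/hashtag]"
    tweet = re.sub(r"#(\w+)", split_hashtag, tweet)

    # Emoticons become the textual description of their meaning.
    for emoticon, description in EMOTICON_DESCRIPTIONS.items():
        tweet = tweet.replace(emoticon, description)
    return tweet

def encode(tokens, vocab, max_len=128):
    """Index tokens and pad/truncate to the fixed input length of 128,
    with an attention mask that down-weights the padding positions."""
    ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens][:max_len]
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    return ids + [0] * (max_len - len(ids)), mask

vocab = {}
cleaned = preprocess("@NickName #Sanremo2020 :D")
ids, mask = encode(cleaned.split(), vocab)
print(cleaned)
print(len(ids), sum(mask))
```
        </preformat>
        <p>The attention mask marks the real tokens with 1 and the padding positions with 0, mirroring the mask the paper prepares for the 128-element input vectors.</p>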
      <sec id="sec-1-2">
        <title>2.3. Selected Models</title>
        <p>The experimentation is based on BERT models (and variants), which have achieved state-of-the-art results in text classification tasks. BERT utilizes the special tokens [CLS] and [SEP] to indicate the beginning of the input sequence and the separation between sentences.</p>
        <p>In this specific case, the contextualized embedding associated with the [CLS] token is used as the embedding for the entire sentence. Thanks to the multi-head attention mechanism, it can capture the semantics of the entire sentence effectively.</p>
        <p>
          Several instances of pre-trained BERT models that contain at least some Italian text in their training corpus were considered. For each model, a fully connected layer was added to perform multi-label classification and fine-tune the model on the specific dataset. The models considered are:
• dbmdz/bert-base-italian-xxl-cased [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]: the MDZ Digital Library team released "Italian BERT cased XXL", a BERT version pre-trained on two corpora. The first corpus consists of texts obtained from a Wikipedia dump and various texts from the OPUS collection (http://opus.nlpl.eu/), with a total corpus size of approximately 13 GB and over 2 billion tokens. The second corpus is the Italian part of the OSCAR corpus (https://traces1.inria.fr/oscar/), with a final corpus size of approximately 81 GB and over 13 billion tokens. The "cased" version was chosen as it aligns better with the chosen pre-processing method, as previously done in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
• AlBERTo-Base, Italian Twitter lower-cased [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]: a BERT model trained on a corpus of 200 million Italian tweets.
• UmBERTo-Commoncrawl-Cased [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]: a RoBERTa model (a variant of BERT) trained on an Italian sub-corpus of OSCAR. It uses a deduplicated version of the Italian corpus, which consists of 70 GB of raw text data, 210 million sentences, and 11 billion words. The sentences were filtered, shuffled at the line level, and utilized for NLP research.
• MilaNLProc/feel-it-italian-emotion [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]: an adapted version of UmBERTo for classification on the FEEL-IT dataset.
• bert-base-multilingual-uncased [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]: a multilingual uncased BERT model.
        </p>
        <p>A significant contribution can be attributed to the translation of emojis into their Italian descriptions, which can be discriminative in emotion recognition. As a resource, we used the emoji list in [15], whose CLDR Short Names were translated into Italian.</p>
        <p>The best trials were achieved with the AlBERTo model, with learning rates ranging from 2e-05 to 3e-05, a patience value of 3 allowing for training for 4 epochs, a batch size of 16, and the "transform" pre-processing strategy.</p>
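        <p>The classification setup described above (a fully connected layer on top of the encoder's [CLS] embedding, with one independent sigmoid per label) can be illustrated with a toy sketch. The 8-dimensional embedding and random weights below are placeholders, not the trained model: the real [CLS] output is 768-dimensional in BERT-base models.</p>
        <preformat>
```python
import math
import random

LABELS = ["anger", "anticipation", "disgust", "fear", "joy",
          "love", "neutral", "sadness", "surprise", "trust"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_head(cls_embedding, weights, biases, threshold=0.5):
    # One logit per label; each label gets an independent sigmoid
    # (multi-label classification, not a softmax over classes).
    predicted = []
    for label, w, b in zip(LABELS, weights, biases):
        logit = sum(x * wi for x, wi in zip(cls_embedding, w)) + b
        if sigmoid(logit) >= threshold:
            predicted.append(label)
    return predicted

# Toy stand-ins for the frozen encoder's [CLS] embedding and the
# trainable weights of the added fully connected layer.
random.seed(0)
dim = 8
cls_emb = [random.uniform(-1.0, 1.0) for _ in range(dim)]
weights = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in LABELS]
biases = [0.0] * len(LABELS)
print(multilabel_head(cls_emb, weights, biases))
```
        </preformat>
        <p>Because each label is thresholded independently, a sentence can receive zero, one, or several emotion labels, which is exactly the multilabel behaviour the task requires.</p>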
      </sec>
      <sec id="sec-1-3">
        <title>2.4. Training the Models</title>
        <p>The challenge provides two baselines and the corresponding code to reproduce their execution. The first baseline uses count vectors to represent the documents based on token frequency, while the second baseline uses TF-IDF vectors. Both implementations limit the vector dimensions to 5000. In the emotion recognition task, the TF-IDF baseline performs better, achieving an F1 score of 41.48%.</p>
        <p>For both the baselines and the experiments conducted in this work, a seed was fixed to ensure reproducibility.</p>
        <p>
          Taking into consideration the work presented in GoEmotions [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], we decided to freeze the layers of the pre-trained BERT model and train only the additional classification layers.
        </p>
        <p>Various preliminary experiments were conducted by manually modifying the model's hyperparameters, such as the number of epochs, batch size, and learning rate, to understand which approach was better for this task. To improve the results, the Optuna library was adopted to systematically test different combinations of hyperparameters, and the MLflow library was used to track intermediate (F1) and final results (F1 and metrics for each class).</p>
        <p>The hyperparameter search space was defined as follows:
• Learning rate: between 2e-05 and 5e-05
• Epsilon (AdamW optimizer): between 1e-8 and 1e-6
• Hidden dropout probability: between 0.1 and 0.3
• Patience for early stopping: a discrete interval between 1 and 5
• Batch size: 16, 32, or 64</p>
        <p>
          The numerous trials conducted with Optuna allowed us to observe that the models pre-trained consistently on an Italian text corpus performed better than those pre-trained on a multilingual corpus including Italian. In general, trainings exceeding the fourth epoch often resulted in a degradation of performance in terms of F1 score, and the ideal batch size was found to be 16. On average, the performance of all models benefited from the pre-processing step, as this strategy likely retains more informative content and renders typical social media expressions in standard Italian [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. AlBERTo and dbmdz had similar performance; however, we selected AlBERTo, with an average F1 score of 0.562, also because it performed better in classifying fear, which was one of the most problematic classes due to the limited number of examples (see Figure 3).
        </p>
        <sec id="sec-1-3-1">
          <title>2.5. Prediction</title>
          <p>During the prediction phase on the test dataset, using the model fine-tuned from AlBERTo, we noticed a high number of neutral examples, and in some cases the model was unable to determine the emotion, leaving the result field empty. We noticed this problem only on the test set and not on the validation set. Therefore, to address this issue, in addition to using our trained model, we integrated the ChatGPT APIs to obtain additional results for the neutral or undetermined sentences. Each sentence in the test set is pre-processed and given as input to the fine-tuned model, with a classification threshold set at 0.5. At the end of the prediction phase, every sentence that did not receive any label, or that was classified as neutral, is passed to a Python program that utilizes the ChatGPT 3.5 APIs. Below, we show the prompt used and some of the examples provided to ChatGPT. Due to the limitations of the free APIs, the number of tokens and the amount of input examples are limited.</p>
          <preformat>prompt = "You are an emotion recognition tool for tweets and your task is to " \
    "analyze them and give a comma-separated list of the emotions that you " \
    "might think are expressed in the current tweet, and you should use only " \
    "the emotions from this list ['anger', 'anticipation', 'disgust', 'fear', " \
    "'joy', 'love', 'neutral', 'sadness', 'surprise', 'trust']"

examples = """ \
Here some examples:
Input "Io ancora non ho capito se la voce mentre cantano sia modificata o meno #IlCantanteMascherato" Output: neutral
Input "RT @user ... zoccola enorme #chilhavisto" Output: disgust
"""</preformat>
        </sec>
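        <p>The two-step prediction routine above can be sketched as follows. This is an illustrative sketch, not the team's program: <code>ask_chatgpt</code> is a hypothetical placeholder for the actual call to the ChatGPT 3.5 APIs with the prompt and examples shown above.</p>
        <preformat>
```python
LABELS = ["anger", "anticipation", "disgust", "fear", "joy",
          "love", "neutral", "sadness", "surprise", "trust"]

def threshold_labels(scores, threshold=0.5):
    # Keep every emotion whose sigmoid score reaches the 0.5 threshold.
    return [lab for lab, s in zip(LABELS, scores) if s >= threshold]

def ask_chatgpt(sentence):
    # Placeholder fallback: in the paper, a Python program sends the
    # prompt and few-shot examples to the ChatGPT 3.5 APIs and parses
    # the comma-separated emotions in the reply.
    return ["joy"]

def predict(sentence, scores):
    labels = threshold_labels(scores)
    # Sentences with no label, or labelled only as neutral, are handed
    # to the ChatGPT-based fallback.
    if not labels or labels == ["neutral"]:
        labels = ask_chatgpt(sentence)
    return labels

print(predict("che bello!", [0.1] * 10))
print(predict("ho paura", [0, 0, 0, 0.9, 0, 0, 0, 0, 0, 0]))
```
        </preformat>
        <p>Only the undetermined and neutral cases trigger the fallback, so the fine-tuned model remains the primary classifier and the API budget of the free tier is spent sparingly.</p>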
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Results</title>
      <p>The results in the following Tables 1 and 2 suggest that our approach, even if it is not the best model of the challenge, generalizes quite well to a domain different from the training one. This is a good result for us, since we aim at applying the model in contexts different from social media analysis. In particular, we are working on the multimodal analysis of human communication with conversational agents, in which the analysis of the textual part of verbal communication can be very important to fully understand the emotional state of the user. We are currently exploring the performance of models based on LLMs by fine-tuning the most popular ones on a dataset of texts denoting emotion expressions not taken from tweets, which is more in line with our final goal.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Araque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Frenda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          , V. Patti, EMit at EVALITA 2023:
          <article-title>Overview of the Categorical Emotion Detection in Italian Social Media Task, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</article-title>
          .
          <source>Final Workshop (EVALITA</source>
          <year>2023</year>
          ), CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          , G. Venturi, EVALITA 2023:
          <article-title>Overview of the 8th evaluation campaign of natural language processing and speech tools for italian, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</article-title>
          .
          <source>Final Workshop (EVALITA</source>
          <year>2023</year>
          ), CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Plutchik</surname>
          </string-name>
          ,
          <article-title>The nature of emotions: Human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice</article-title>
          ,
          <source>American Scientist</source>
          <volume>89</volume>
          (
          <year>2001</year>
          )
          <fpage>344</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Basile</surname>
          </string-name>
          , M. de Gemmis, G. Semeraro,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <article-title>Alberto: Italian BERT language understanding model for NLP challenging tasks based on tweets</article-title>
          ,
          <source>in: Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it), November 13-15</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] OpenAI, GPT 3.5 API large language model,
          <year>2023</year>
          . URL: https://chat.openai.com/chat.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] Emoji list, last consulted May
          <year>2023</year>
          . URL: https://unicode.org/emoji/charts/full-emoji-list.html.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <article-title>Multiemotions-it: a new dataset for opinion polarity and emotion analysis for italian</article-title>
          ,
          <source>in: Proceedings of the Seventh Italian Conference on Computational Linguistics</source>
          , CLiC-it 2020, Bologna, Italy, March 1-3,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Passaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <article-title>Evaluating context selection strategies to build emotive vector space models</article-title>
          , in: N.
          <string-name>
            <surname>Calzolari</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Choukri</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Declerck</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Goggi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Grobelnik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Maegaard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Mariani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Mazo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Moreno</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Odijk</surname>
          </string-name>
          , S. Piperidis (Eds.),
          <source>Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC</source>
          <year>2016</year>
          , Portorož, Slovenia, May 23-28,
          <year>2016</year>
          , European Language Resources Association (ELRA),
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Catelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Esposito</surname>
          </string-name>
          ,
          <article-title>An effective bert-based pipeline for twitter sentiment analysis: A case study in italian</article-title>
          ,
          <source>Sensors</source>
          <volume>21</volume>
          (
          <year>2021</year>
          )
          <fpage>133</fpage>
          . doi:10.3390/s21010133.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <article-title>[9] MDZ Digital Library team, "Italian BERT XXL models"</article-title>
          ,
          <source>Hugging Face</source>
          ,
          <year>2020</year>
          . URL: https://huggingface.co/dbmdz/bert-base-italian-xxl-cased.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Parisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Francia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Magnani</surname>
          </string-name>
          ,
          <article-title>Umberto: an italian language model trained with whole word masking</article-title>
          ,
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovy</surname>
          </string-name>
          , et al.,
          <article-title>Feel-it: Emotion and sentiment classification for the italian language</article-title>
          ,
          <source>in: Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          , Association for Computational Linguistics,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , Bert:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          , arXiv preprint arXiv:1810.04805 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Demszky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Movshovitz-Attias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cowen</surname>
          </string-name>
          , G. Nemade,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ravi</surname>
          </string-name>
          ,
          <article-title>Goemotions: A dataset of fine-grained emotions</article-title>
          , arXiv preprint arXiv:2005.00547 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Catelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Esposito</surname>
          </string-name>
          ,
          <article-title>An effective bert-based pipeline for twitter sentiment analysis: A case study in italian</article-title>
          ,
          <source>Sensors</source>
          <volume>21</volume>
          (
          <year>2021</year>
          ) 133.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>