<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the EVALITA 2018 Italian Emoji Prediction (ITAMoji) Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Ronzano</string-name>
          <email>francesco.ronzano@upf.edu</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Barbieri</string-name>
          <email>francesco.barbieri@upf.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Endang Wahyu Pamungkas</string-name>
          <email>pamungka@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viviana Patti</string-name>
          <email>patti@di.unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Chiusaroli</string-name>
          <email>f.chiusaroli@unimc.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Turin</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Humanities, Università di Macerata</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universitat Pompeu Fabra</institution>
          ,
          <addr-line>Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Universitat Pompeu Fabra, Spain, Hospital del Mar Medical Research Center</institution>
          ,
          <addr-line>Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>English. The Italian Emoji Prediction task (ITAmoji) was proposed for the first time at the EVALITA 2018 evaluation campaign, following the success of its twin, the Multilingual Emoji Prediction Task, organized in the context of SemEval-2018 to challenge the research community to automatically model the semantics of emojis in Twitter. Participants were invited to submit systems designed to predict, given an Italian tweet, its most likely associated emoji, selected from a wide and heterogeneous emoji space. Five teams submitted twelve runs to ITAmoji. We present the data sets, the evaluation methodology, including different metrics, and the approaches of the participating systems. We also present a comparison between the performance of automatic systems and humans solving the same task. Data and further information about this task can be found at: https://sites.google.com/view/itamoji/.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italiano. Il task italiano per la predizione
degli emoji in Twitter (ITAmoji) viene
proposto nell’ambito della campagna di
valutazione di Evalita 2018 per la prima volta,
dopo il successo del task gemello, il
Multilingual Emoji Prediction Task, proposto
a Semeval-2018 per stimolare la
comunità di ricerca a costruire modelli
computazionali della semantica delle emoji in
Twitter. I partecipanti sono stati invitati
a costruire sistemi disegnati per predire
l’emoji piú probabile dato un tweet in
italiano, selezionandola in uno spazio
ampio e eterogeneo di emoji. In ITAmoji
sono stati valutati i risultati di dodici
sistemi di predizione di emoji messi a punto
da cinque gruppi di lavoro.
Presentiamo qui i dataset, la metodologia di
valutazione (che include diverse metriche) e
gli approcci dei sistemi che hanno
partecipato. Presentiamo inoltre una riflessione
sui risultati ottenuti in tale task da sistemi
automatici e umani.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>During the last decade the use of emoji has
increasingly pervaded social media platforms by
providing users with a rich set of pictograms
useful to visually complement and enrich the
expressiveness of short text messages. Nowadays this
novel, visual way of communication represents a
de facto standard in a wide range of social media
platforms, including fully-fledged portals for
user-generated content like Twitter, Facebook and
Instagram as well as instant-messaging services like
WhatsApp. As a consequence, the ability to
effectively interpret and model the semantics of
emojis has become essential
when we analyze social media content.</p>
      <p>
        Although the study of this new form of language has
received growing attention over the last few years, the body of
investigations that deal with emojis is still small, especially
when we consider their characterization from a
Natural Language Processing (NLP) perspective.
While there are notable exceptions that study
the semantics of emojis and their usage
        <xref ref-type="bibr" rid="ref14 ref16 ref16 ref2 ref3 ref3 ref4 ref4 ref6 ref7">(Barbieri et al., 2016a; Barbieri et al., 2018b; Aoki
and Uchida, 2011; Eisner et al., 2016; Ljubešić
and Fišer, 2016)</xref>
        , reflecting also on their
informative behaviour
        <xref ref-type="bibr" rid="ref1 ref11 ref12 ref13 ref17 ref18 ref5 ref8">(Donato and Paggio, 2017; Donato
and Paggio, 2018)</xref>
        , or their sentiment
        <xref ref-type="bibr" rid="ref21">(Novak et
al., 2015)</xref>
        , the interplay between text-based
messages and emojis has been explored by only a
small number of studies. Among these
is the analysis of emoji predictability
by
        <xref ref-type="bibr" rid="ref5">(Barbieri et al., 2017)</xref>
        , which proposed a neural
model to predict the most likely emoji to appear
in a text message (tweet). The task proved to be
hard, as emojis encode multiple meanings
        <xref ref-type="bibr" rid="ref3 ref4">(Barbieri et al., 2016b)</xref>
        . Relatedly, in the context
of the International Workshop on Semantic
Evaluation (SemEval-2018), the Multilingual Emoji
Prediction Task
        <xref ref-type="bibr" rid="ref6 ref7">(Barbieri et al., 2018a)</xref>
        was
organized to challenge the research
community to automatically model the semantics of
emojis occurring in English and Spanish Twitter
messages. The task was very successful, with 49
teams participating in the English subtask and 22
in the Spanish subtask. This motivated us to
propose the shared task for the Italian language as well,
in the context of the Evalita 2018 evaluation
campaign
        <xref ref-type="bibr" rid="ref8">(Caselli et al., 2018)</xref>
        , with the twofold aim of
widening the setting for cross-language comparisons
of emoji prediction in Twitter and experimenting
with novel metrics to better assess the quality of
the automatic predictions.
      </p>
      <p>
        In general, exciting and highly relevant avenues
for research remain to be explored with respect to
emoji understanding, since emojis are often
an essential component of social media texts:
ignoring or misinterpreting them may lead to
misunderstandings of the intended
meaning of a message
        <xref ref-type="bibr" rid="ref19">(Miller et al., 2016)</xref>
        . The
ambiguity of emojis also raises interesting
questions in application domains. Consider, for instance,
a human-computer interaction setting: how can
we teach an artificial agent to correctly interpret
and recognize the use of emojis in spontaneous
conversation? The main motivation behind this question
is that an AI system able to predict emojis could
contribute notably to better natural language
understanding
        <xref ref-type="bibr" rid="ref21">(Novak et al., 2015)</xref>
        and thus to other
Natural Language Processing tasks such as
generating emoji-enriched social media content,
enhancing emotion/sentiment analysis systems,
improving retrieval of social network material, and
ultimately improving user profiling.
      </p>
      <p>In the following, we describe the main elements
of the shared task (Section 3), after a
brief summary of previous projects reflecting
on the semantics of emojis in Italian (Section 2).
Then, we cover the data collection, curation and
release process (Section 4). In Section 5 we
detail the evaluation metrics, describe the
participants' results and propose a first comparison
with the performance of humans solving the same
task. We conclude the paper with some reflections
on the outcomes of the proposed task.</p>
    </sec>
    <sec id="sec-3">
      <title>Emojis and Italian</title>
      <p>
        We can observe a growing interest in the
semantics of emojis in relation to Italian. In
particular, some interesting projects carried out
in recent years address
the issue in a translation framework, investigating
the possibility of translating Italian literary
texts into the universal visual language of emoji
        <xref ref-type="bibr" rid="ref20 ref9">(Chiusaroli, 2015; Monti et al., 2016)</xref>
        . In
particular, the Emojitaliano project was launched as a
translation project of the Italian novel Pinocchio
in emoji
        <xref ref-type="bibr" rid="ref10">(Chiusaroli, 2017)</xref>
        on Twitter. An
original approach based on crowdsourcing was
adopted, involving in the translation task the
Twitter community known as Scritture Brevi.
      </p>
      <p>
        <bold>The Twitter community #scritturebrevi.</bold> The
community (#scritturebrevi, @FChiusaroli,
10,151 followers in November 2018) had
previously been involved in experiments in creative
writing, including in emojis: with the hashtag
#inemoticon, experiments in mixed translation
(words and emojis) were carried out on Twitter,
exploring the semantic versatility of emojis
and their value in rebus writing. Translating
the whole Pinocchio book was a more complex
and engaging task, especially for its focus on
developing a common code base, in terms of
glossary and grammar, which is absolutely new
with respect to previous projects. The translation
of Pinocchio started in February 2016. Every day,
for 28 weeks, sentences taken from Pinocchio
were tweeted, and the followers were invited to
suggest their translations into emoji; at the end of
each day, the official version of the translation
was validated and published. An online tool,
Emojitalianobot, was developed to
help the community memorize the semantic
values assigned to each emoji during the
collective translation process. From its very beginning
on Twitter, the project was an instant success,
becoming a viral web phenomenon thanks to
the Scritture Brevi community. Therefore, it
was a natural choice to involve the same Twitter
community in reflecting on the semantics of emoji
from a different perspective, i.e. the one we
propose in the context of the ITAmoji shared task,
thus helping us to understand how good humans
are at predicting emojis (see Section 5.5.2).
      </p>
    </sec>
    <sec id="sec-4">
      <title>Task Description</title>
      <p>We invited participants to submit systems
designed to predict, given a tweet in Italian, its most
likely associated emoji, based only on the text of
the tweet. As for the experimental setting, for
simplicity we considered tweets including
only one emoji (possibly repeated). After
removing the emoji from the tweet, we asked users
to predict it. We challenged systems to predict
emojis from a wide and heterogeneous emoji
space. In particular, we selected the tweets that
included one of the 25 emojis that occur
most frequently in the Twitter data we collected
(see Table 1). Therefore, the task can be seen
as a multi-class classification task where systems
should predict one of 25 possible emojis from the
text of a given tweet. Each participant was
allowed to submit up to three system runs.
Participants were allowed to use additional data to train
their systems, such as lexicons and pre-trained word
embeddings. To enable a finer-grained
evaluation of results, we
encouraged participants to submit, for each tweet,
not only the most likely emoji predicted but also
the complete ranking, from the most likely to the least
likely emoji to be associated with the text of the
tweet.</p>
      <p>[Example tweet text: “Innamorato sempre di più [URL]” (“More and more in love [URL]”).]</p>
    </sec>
    <sec id="sec-5">
      <title>Task Data</title>
      <p>
        The data for this task were retrieved from Twitter
using two different approaches:
(i) collecting the Twitter stream of (geolocalized)
Italian tweets from October 2015 to February 2018;
and (ii) retrieving tweets from the followers of the
most popular Italian newspapers' accounts. We
randomly selected 275,000 tweets from these
collections, choosing tweets that contained one and
only one emoji among the 25 most frequent emojis listed
in Table 1. We split our data into two sets
consisting of 250,000 training samples and 25,000 test
samples.
      </p>
      <p>[Table 1: the 25 emojis considered, with the percentage of tweets in the training and test sets.]</p>
      <p>
        The evaluation of the emoji prediction systems
is based on the classic precision and
recall metrics over each emoji. The final ranking of
the participating teams of ITAmoji 2018 relies on
the Macro F1 score computed with respect to the
most likely emoji predicted, given the text of each
tweet of the test set, in line with the proposal in the
twin task at SemEval-2018 for English and Spanish
        <xref ref-type="bibr" rid="ref6 ref7">(Barbieri et al., 2018a)</xref>
        . In this way we intend to
encourage systems to perform well overall, which
would inherently mean a better sensitivity to the
use of emojis in general, rather than, for instance,
overfitting a model to do well on the three or four
most common emojis of the test data.
      </p>
      <p>In general, identifying a coherent and
effective approach to compare the performance of
distinct emoji prediction systems is not an easy
task. We often have the clear impression that the
semantics of some sets of emojis can be similar;
it would therefore be interesting to have a way to
compare and evaluate at a finer-grained level the
emoji prediction quality of two distinct systems
when they both fail to predict the right emoji to
associate with a tweet. In such cases, it can
be important to distinguish between a system
that places the right prediction among the most
likely emojis to be associated with that tweet and
one that considers the right prediction an
emoji that is unlikely to be associated with that tweet.
To capture this aspect, we gave ITAmoji
participants the possibility to submit, as emoji
predictions, the ordered ranking of the 25 emojis
considered in ITAmoji. Systems providing the ranked
list of emoji predictions were also compared by
considering the following additional
emoji-rank-based metrics: Accuracy@5/10/15/20 and
Coverage Error. All the submissions we received
provided the ranked list of 25 emojis as
predictions; as a consequence, it was possible to compute
the emoji-rank-based metrics for all of
them.</p>
      <p>A detailed description of all the evaluation
metrics we considered to compare the quality of emoji
prediction approaches is given below. The
following three standard metrics are computed by
considering only the emoji predicted as the most
likely one to be associated with the text of a tweet:</p>
      <list list-type="bullet">
        <list-item><p>Macro F1: compute the F1 score for each
label (emoji), and find their unweighted mean
(used to determine the final ranking of
the participating teams);</p></list-item>
        <list-item><p>Micro F1: compute the F1 score globally by
counting the total true positives, false
negatives and false positives across all labels
(emojis);</p></list-item>
        <list-item><p>Weighted F1: compute the F1 score for
each label (emoji), and find their average,
weighted by support (the number of true
instances for each label).</p></list-item>
      </list>
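      <p>To make the difference between the three F1 variants concrete, the following minimal sketch computes them in plain Python on a small, made-up example. It is not the official ITAmoji scorer: the function name <monospace>f1_scores</monospace>, the label names and the toy predictions are all hypothetical, and the code simply follows the definitions given above for the single-label case.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def f1_scores(gold, pred):
    """Return (macro, micro, weighted) F1 for single-label predictions.

    Illustrative helper, not the official task scorer.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label and was missed

    def f1(t, false_pos, false_neg):
        denom = 2 * t + false_pos + false_neg
        return 2 * t / denom if denom else 0.0

    per_label = {label: f1(tp[label], fp[label], fn[label]) for label in labels}
    support = Counter(gold)  # number of true instances per label

    macro = sum(per_label.values()) / len(labels)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    weighted = sum(per_label[label] * support[label] for label in labels) / len(gold)
    return macro, micro, weighted


# Hypothetical toy data: a skewed label distribution and a system that
# always predicts the most frequent label.
gold = ["red_heart"] * 8 + ["rose"] * 2
pred = ["red_heart"] * 10
macro, micro, weighted = f1_scores(gold, pred)
print(round(macro, 3), round(micro, 3), round(weighted, 3))
```
]]></preformat>
      <p>On this toy input the always-majority system scores much lower on Macro F1 than on Micro F1 or Weighted F1, which illustrates why an unweighted mean over labels penalizes systems that ignore rare emojis.</p>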
      <sec id="sec-5-1">
        <p>Regarding the emoji-rank-based metrics, we considered:</p>
        <list list-type="bullet">
          <list-item><p>Coverage error: compute how far we need
to go through the ranked scores of labels
(emojis) to cover all true labels;</p></list-item>
          <list-item><p>Accuracy@n: the accuracy value
computed by considering as right predictions the
ones in which the right label (emoji) is among
the top n most likely ones.</p></list-item>
        </list>
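        <p>The two emoji-rank-based metrics can be sketched as follows for the single-label setting of this task. This is an illustrative sketch rather than the official evaluation code: the function names and the three-label toy rankings are hypothetical, and the code assumes that every submitted ranking contains the gold emoji (as is the case when the complete ranking of all labels is submitted). With exactly one true label per tweet, the coverage error reduces to the average 1-based position of the gold emoji in the ranking.</p>
        <preformat><![CDATA[
```python
def accuracy_at_n(gold, rankings, n):
    """Fraction of tweets whose gold emoji is among the top-n ranked labels."""
    hits = sum(1 for g, ranking in zip(gold, rankings) if g in ranking[:n])
    return hits / len(gold)


def coverage_error(gold, rankings):
    """Average 1-based rank position of the gold emoji.

    Assumes one true label per tweet and that each ranking contains it.
    """
    return sum(ranking.index(g) + 1 for g, ranking in zip(gold, rankings)) / len(gold)


# Hypothetical toy example with three labels instead of 25.
gold = ["rose", "sun", "red_heart"]
rankings = [
    ["red_heart", "rose", "sun"],   # gold emoji at position 2
    ["sun", "red_heart", "rose"],   # gold emoji at position 1
    ["rose", "sun", "red_heart"],   # gold emoji at position 3
]
print(accuracy_at_n(gold, rankings, 1))
print(accuracy_at_n(gold, rankings, 2))
print(coverage_error(gold, rankings))
```
]]></preformat>
        <p>A lower coverage error is better (the minimum is 1), whereas a higher Accuracy@n is better.</p>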
      </sec>
      <sec id="sec-5-2">
        <title>Baseline</title>
        <p>
          To compare the performance of the
ITAmoji participating systems with baseline
approaches, we considered three different baselines:
          <list list-type="bullet">
            <list-item><p>Majority baseline: for each tweet text we
predict the ordered list of the 25 most likely emojis
sorted by their frequency in the training set; that
is, we always predict the red heart as first choice
and the rose emoji as last choice.</p></list-item>
            <list-item><p>Weighted random baseline: for each
tweet text we predict the ordered list of the 25
most likely emojis, where the first prediction is
randomly selected according to the
label frequency in the training set (in order to keep the
same label distribution) and the rest of the
predictions (from the second to the last one) are
the remaining emojis sorted by
label frequency.</p></list-item>
            <list-item><p>FastText baseline: for each tweet text
we predict the ordered list of the 25 most likely
emojis by relying on fastText with basic
parameters<sup>1</sup> and pretrained embeddings with 300
dimensions
          <xref ref-type="bibr" rid="ref16 ref3 ref4">(Barbieri et al., 2016a)</xref>
          .</p></list-item>
          </list>
        </p>
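        <p>The first two baselines can be sketched in a few lines of plain Python. The sketch below is only an interpretation of the description above, not the organizers' code: the function names, the toy training labels and the use of a seeded <monospace>random.Random</monospace> instance are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
import random
from collections import Counter


def majority_ranking(train_labels):
    """Labels sorted from most to least frequent in the training set."""
    return [label for label, _ in Counter(train_labels).most_common()]


def weighted_random_ranking(train_labels, rng):
    """First label sampled by training frequency, the rest in frequency order."""
    counts = Counter(train_labels)
    by_freq = [label for label, _ in counts.most_common()]
    weights = [counts[label] for label in by_freq]
    first = rng.choices(by_freq, weights=weights, k=1)[0]
    return [first] + [label for label in by_freq if label != first]


# Hypothetical toy training labels (three emojis instead of 25).
train = ["red_heart"] * 5 + ["joy"] * 3 + ["rose"] * 1
rng = random.Random(0)  # seeded only to make the sketch reproducible

print(majority_ranking(train))
print(weighted_random_ranking(train, rng))
```
]]></preformat>
        <p>The majority baseline returns the same ranking for every tweet, while the weighted random baseline changes only the first position from tweet to tweet.</p>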
      </sec>
      <sec id="sec-5-3">
        <title>Participating Systems and Results</title>
        <p>
          We received 12 submissions in total from 5
different teams. The main approaches and features of
participating teams are described below.
FBK_FLEXED_BICEPS
          <xref ref-type="bibr" rid="ref1">(Andrei et al., 2018)</xref>
          This system exploits a recurrent neural network
architecture, the Bidirectional Long Short-Term
Memory (Bi-LSTM), together with user-based features.
The output of the Bi-LSTM network, which takes the word
sequence as input, is concatenated with the distribution
of emojis in the user's tweet history. Finally, a softmax
activation is used
to get the probability distribution over the 25 emoji
labels.
        </p>
        <sec id="sec-5-3-1">
          <p><sup>1</sup> https://fasttext.cc/</p>
          <p>
            GW2017
            <xref ref-type="bibr" rid="ref1 ref11 ref13 ref17 ref18 ref8">(Mauro and Xileny, 2018)</xref>
            This
system is based on an ensemble of two models, a Bi-LSTM
and LightGBM<sup>2</sup>. The first model uses two
different word2vec models distinguished by creation time,
while the second model exploits several surface
features extracted from the tweet text (e.g., number of
words, number of characters).
          </p>
          <p>
            CIML-UNIPI
            <xref ref-type="bibr" rid="ref11">(Daniele et al., 2018)</xref>
            This system
is based on an ensemble of 13 models (12
based on TreeESNs and one on an LSTM over
characters). The TreeESN-based models are built by
varying the number of reservoir units, activation
function, readout and parser.
sentim
            <xref ref-type="bibr" rid="ref15">(Jacob, 2018)</xref>
            This system relies on a
convolutional neural network (CNN) architecture
that uses character embeddings as input. Nine layers
of residual dilated convolutions with skip
connections are applied, followed by a ReLU activation
to increase nonlinearity.
          </p>
          <p>
            UNIBA
            <xref ref-type="bibr" rid="ref1 ref11 ref13 ref17 ref18 ref8">(Lucia and Daniela, 2018)</xref>
            This system
is built using an ensemble classifier based on
WEKA<sup>3</sup> and scikit-learn<sup>4</sup>. It
exploits several kinds of features: micro-blogging-based,
sentiment-based, and semantic-based
features.
          </p>
          <p>Table 2 shows the official results of the
ITAmoji 2018 task, ordered by decreasing Macro
F1. The best performing system was proposed by
the FBK_FLEXED_BICEPS team, which achieved
0.365312 in Macro F1. Overall, we can see that
systems that exploit neural network
architectures obtained good performance in this task,
especially when relying on a Bi-LSTM model. Table
3 shows the performance of ITAmoji systems with
respect to the emoji-rank-based metrics.</p>
        </sec>
      </sec>
      <sec id="sec-5-4">
        <title>Analysis</title>
        <p>From Table 2 we can notice that the ranking
order of the 5 system runs that obtained the best
Macro F1 is substantially preserved when we
consider Micro F1 or Weighted F1. However, when
we consider Micro F1 rather than Macro F1,
the differences among the scores obtained by the
top-performing systems tend to be substantially
smaller: for instance, the Macro F1 of the best
system is greater by a factor of 1.64 than that of
the fifth system, while the Micro F1 of the best
system is greater by a factor of 1.18 than that
of the fifth system (ranked by Micro F1). This
can be explained by the tendency of
Micro F1 to favour systems that overfit
their prediction model to do well on the most
common emojis of the test data over
systems that perform well across all emojis; this
confirms our choice of Macro F1 as the
official metric to rank ITAmoji 2018 participating
systems.</p>
        <p><sup>2</sup> https://github.com/Microsoft/LightGBM</p>
        <p><sup>3</sup> https://www.cs.waikato.ac.nz/ml/weka/</p>
        <p><sup>4</sup> http://scikit-learn.org/stable/</p>
        <p>From Table 3 we can see that the order of the
five best performing systems in terms of Macro
F1 is substantially preserved when we consider
the emoji-rank-based metrics Coverage Error and
Accuracy@5 (except for a swap between the
fourth and fifth best performing approaches).</p>
        <p>If we consider the performance of our three
baseline systems (described in Section 5.2), we can
notice from Table 2 that, as expected, FastText is
the best performing baseline approach: a
prediction system based on FastText embeddings would have
ranked eighth by Macro F1 in ITAmoji 2018.</p>
        <p>Table 6 shows the highest F1 score for each
emoji / label across all ITAmoji 2018 team
submissions. We can notice that even though specific
emojis like [emoji], [emoji], [emoji], or [emoji] are characterized by a small
percentage of training samples (about 1%),
prediction systems manage to obtain high F1
scores on them. In contrast, when we consider emojis like
[emoji] or [emoji], even though more training samples are
available than for the previous set of
emojis (more than 2%), we observe that the
prediction systems do not manage to get high F1
scores. This can be explained by the
variability of the contexts of use that characterizes the
latter set of emojis, which makes them difficult for systems
to learn to predict.</p>
        <p>To conclude our analysis, we note that
the three runs that obtained the highest Macro F1
scores exploited, in addition to the
text of a tweet, the way the author of that tweet
used emojis in previous tweets. This
highlights that the choice of an emoji strongly depends
on the preferences and writing style of each
individual, both of which are relevant inputs to model
in order to improve emoji prediction quality.</p>
      </sec>
      <sec id="sec-5-5">
        <title>Emoji prediction by humans</title>
        <p>In this section we present a preliminary discussion
of the results of two experiments designed to
evaluate how humans perform when they
are asked to identify the most likely emoji(s)
to associate with the text of an Italian tweet. The
final purpose here is to explore whether humans are
better than automated systems in the emoji
prediction task from text, or vice versa. In an attempt to
consider a uniform set of emojis in our
experimental settings, in both human emoji prediction
experiments described in the rest of this section
we decided to focus only on the 15 emojis shown
in Table 4. This group of emojis includes all the
yellow-face emojis considered in the ITAmoji task
(Table 1).</p>
        <p>[Table 2: results computed on the emoji predicted as the most likely one to be associated with the text of a tweet. Team runs are ranked by
Macro F1. The table also shows the performance of the three baselines considered in ITAmoji, ranked
with respect to their Macro F1.]</p>
        <p>[Table 3: emoji-rank-based metrics (Coverage Error and Accuracy@n). Team runs are ranked by Macro F1. The table also shows the performance of the three baselines
considered in ITAmoji, ranked with respect to their Macro F1.]</p>
        <p>We selected 1,005 tweets, each with one face emoji,
from the ITAmoji test set and set up a collaborative
annotation task in Figure Eight (F8)<sup>5</sup> by asking
an</p>
        <sec id="sec-5-5-1">
          <p><sup>5</sup> https://www.figure-eight.com/</p>
          <p>of each tweet<sup>6</sup>. The set of 1,005 tweets to annotate
was perfectly balanced across the 15 face emojis
considered. A total of 64 annotators from the F8
platform provided 6,150 evaluations, each identifying
the 3 most likely face emojis to associate with the
text of a tweet.</p>
          <p><sup>6</sup> Instructions provided to annotators (in Italian):
http://bit.ly/itaMoji</p>
          <p>The Macro F1 of F8 annotators is 24.74. On
the same set of 1,005 tweets, the emoji prediction
performance of human annotators was better than
9 out of 12 systems submitted to ITAmoji.
However, the best performing system submitted to
ITAmoji obtained a Macro F1 of 40.48 on those
tweets, suggesting that computational models can
perform better than humans in this task.</p>
        </sec>
      </sec>
      <sec id="sec-5-6">
        <title>Twitter human annotation</title>
        <p>Thanks to the support and collaboration of the
#scritturebrevi Twitter community, we replicated
the human annotation experiment carried out in F8
in a “crowdsourcing in the wild” setting. From the
end of July to the beginning of September 2018,
we posted 485 tweets on the Scritture Brevi
Twitter account (@FChiusaroli), most of them selected
from the same portion of the ITAmoji test set
considered in our F8 experiment (see Section 5.5.1).
Members of the Scritture Brevi Twitter
community were invited to participate in a sort of Twitter
crowdsourcing game with the slogan #ITAmoji che
passione and the hashtag #ITAmoji. Every day a set
of tweets without emoji was posted on the
Scritture Brevi Twitter account, and ITAmojiers had
to post as a reply the most likely face emoji they
would associate with the text of the posted tweet<sup>7</sup>.
The game went viral: we managed to involve
more than one hundred users, with an average
of 5.4 valid predictions/replies per tweet.
When the #ITAmoji che passione game
ended, we were able to identify, for each tweet
posted on #scritturebrevi (485 tweets in total), the
most likely face emoji that the Twitter community
would associate with it. In general, the emoji prediction
performance of people from the Scritture Brevi
Twitter community was better than 8 out of 12 systems
submitted to ITAmoji (always on the same set of
485 tweets annotated by that community).</p>
      </sec>
      <sec id="sec-5-7">
        <title>Comparing human and automated emoji predictions</title>
        <p>In the two experiments just described, we asked
humans to identify the face emoji(s) they would
associate with the text of a tweet by exploiting
different approaches to collect data: a controlled
collaborative annotation environment in the case of F8
(Section 5.5.1) and a “crowdsourcing in the wild”
setting in the case of the Scritture Brevi Twitter
community (Section 5.5.2). In Table 5 we
compare the emoji prediction performance of human
annotators (from both F8 and the Scritture Brevi
Twitter community) with the performance of the emoji
prediction systems submitted to ITAmoji. To
perform this comparison we consider the set of 428
tweets of the ITAmoji test set annotated by both F8 and
the Scritture Brevi Twitter community.</p>
        <p><sup>7</sup> The announcement of the “#ITAmoji che passione” game
was published on the Scritture Brevi blog and linked to
every posted tweet: https://www.scritturebrevi.it/2018/07/16/itamoji-che-passione/</p>
        <p>We can notice that human predictions, both
from F8 and Scritture Brevi, outperform most
of the automated systems. Moreover, F8
predictions obtain a higher Macro F1 (24.46) than the
Scritture Brevi Twitter community (22.94). This trend
may be related to the fact that F8, in contrast to
the #scritturebrevi Twitter community, represents
a controlled annotation environment.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>Given the widespread diffusion of emojis
as visual devices useful to provide an additional
layer of meaning to social media messages, on the one
hand, and the unquestionable role of Twitter as one
of the most important social media platforms, on
the other, this year at Evalita 2018 we proposed
ITAmoji, the Italian Emoji Prediction task.</p>
      <p>
        The results of automated systems are in line with those
obtained in the twin shared task proposed for
English and Spanish at SemEval-2018
        <xref ref-type="bibr" rid="ref6 ref7">(Barbieri et
al., 2018a)</xref>
        . The introduction of new experimental
emoji-rank-based metrics in ITAmoji allowed us
to perform a finer-grained evaluation of the
systems’ emoji prediction quality. Moreover,
comparing the performance of humans and systems in the
emoji prediction task confirms, in an Italian
setting too, the outcomes of a similar experiment
proposed for English
        <xref ref-type="bibr" rid="ref5">(Barbieri et al., 2017)</xref>
        ,
suggesting that computational models are able to better
capture the underlying semantics of emojis.
      </p>
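      <table-wrap>
        <label>Table 5</label>
        <caption>
          <p>Emoji prediction approaches, compared by considering the set of 428 tweets with face emojis that are part of the ITAmoji test
set and have been annotated by both the Figure Eight platform and the #scritturebrevi community. Emoji prediction
approaches are ranked by decreasing Macro F1.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Team</th><th>Run</th></tr>
          </thead>
          <tbody>
            <tr><td>FBK_FLEXED_BICEPS</td><td></td></tr>
            <tr><td>FBK_FLEXED_BICEPS</td><td></td></tr>
            <tr><td>FBK_FLEXED_BICEPS</td><td></td></tr>
            <tr><td>Figure Eight predictions</td><td></td></tr>
            <tr><td>CIML-UNIPI</td><td>run1</td></tr>
            <tr><td>Scritture Brevi predictions</td><td></td></tr>
            <tr><td>GW2017</td><td>gw2017_p</td></tr>
            <tr><td>GW2017</td><td>gw2017_e</td></tr>
            <tr><td>CIML-UNIPI</td><td>run2</td></tr>
            <tr><td>sentim</td><td>Sentim_Test_Run_2</td></tr>
            <tr><td>sentim</td><td>Sentim_Test_Run_3</td></tr>
            <tr><td>GW2017</td><td>gw2017_pe</td></tr>
            <tr><td>UNIBA</td><td>itamoji_uniba_run1</td></tr>
            <tr><td>sentim</td><td>Sentim_Test_Run_1</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>[Table caption fragment: ... respectively show, for each emoji, the number and percentage of test samples present in the test dataset.]</p>
      <p>Emoji labels: red heart, face with tears of joy, kiss mark, face savoring food, rose, sun,
smiling face with heart eyes, face blowing a kiss, blue heart, smiling face with smiling eyes,
grinning face, winking face, beaming face with smiling eyes, sparkles, thumbs up,
rolling on the floor laughing, smiling face with sunglasses, flexed biceps, thinking face,
two hearts, loudly crying face, top arrow, grinning face with sweat, winking face with tongue,
face screaming in fear.</p>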
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Catalin</given-names>
            <surname>Coman</surname>
          </string-name>
          <string-name>
            <surname>Andrei</surname>
          </string-name>
          , Nechaev Yaroslav, and
          <string-name>
            <given-names>Zara</given-names>
            <surname>Giacomo</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Predicting emoji exploiting multimodal data: FBK participation in ITAmoji task</article-title>
          .
          <source>In Tommaso Caselli</source>
          , Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of 6th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA</source>
          <year>2018</year>
          ), Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Sho</given-names>
            <surname>Aoki</surname>
          </string-name>
          and
          <string-name>
            <given-names>Osamu</given-names>
            <surname>Uchida</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>A method for automatically generating the emotional vectors of emoticons using weblog articles</article-title>
          .
          <source>In Proc. 10th WSEAS Int. Conf. on Applied Computer and Applied Computational Science</source>
          , Stevens Point, Wisconsin, USA, pages
          <fpage>132</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , German Kruszewski, Francesco Ronzano, and
          <string-name>
            <given-names>Horacio</given-names>
            <surname>Saggion</surname>
          </string-name>
          . 2016a.
          <article-title>How cosmopolitan are emojis?: Exploring emojis usage and meaning over different languages with distributional semantics</article-title>
          .
          <source>In Proceedings of the 2016 ACM on Multimedia Conference</source>
          , pages
          <fpage>531</fpage>
          -
          <lpage>535</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Francesco Ronzano, and
          <string-name>
            <given-names>Horacio</given-names>
            <surname>Saggion</surname>
          </string-name>
          . 2016b.
          <article-title>What does this emoji mean? a vector space skip-gram model for Twitter emojis</article-title>
          .
          <source>In Proc. of LREC</source>
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Miguel Ballesteros, and
          <string-name>
            <given-names>Horacio</given-names>
            <surname>Saggion</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Are emojis predictable?</article-title>
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers</source>
          , pages
          <fpage>105</fpage>
          -
          <lpage>111</lpage>
          , Valencia, Spain, April. ACL.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Jose Camacho-Collados, Francesco Ronzano, Luis Espinosa Anke, Miguel Ballesteros, Valerio Basile, Viviana Patti, and
          <string-name>
            <given-names>Horacio</given-names>
            <surname>Saggion</surname>
          </string-name>
          . 2018a.
          <article-title>SemEval 2018 Task 2: Multilingual emoji prediction</article-title>
          .
          <source>In Proceedings of The 12th International Workshop on Semantic Evaluation</source>
          , pages
          <fpage>24</fpage>
          -
          <lpage>33</lpage>
          . ACL.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Luis Marujo,
          <string-name>
            <given-names>William</given-names>
            <surname>Brendel</surname>
          </string-name>
          , Pradeep Karuturi, and
          <string-name>
            <given-names>Horacio</given-names>
            <surname>Saggion</surname>
          </string-name>
          . 2018b.
          <article-title>Exploring Emoji Usage and Prediction Through a Temporal Variation Lens</article-title>
          .
          <source>In 1st International Workshop on Emoji Understanding and Applications in Social Media (at ICWSM</source>
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Tommaso</given-names>
            <surname>Caselli</surname>
          </string-name>
          , Nicole Novielli, Viviana Patti, and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Rosso</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>EVALITA 2018: Overview of the 6th evaluation campaign of natural language processing and speech tools for Italian</article-title>
          .
          In Tommaso Caselli, Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Francesca</given-names>
            <surname>Chiusaroli</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>La scrittura in emoji tra dizionario e traduzione</article-title>
          .
          <source>In Proceedings of the 2nd Italian Conference on Computational Linguistics (CLiC-it 2015)</source>
          , Trento, Italy, December 3-4, 2015. Accademia University Press.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>F.</given-names>
            <surname>Chiusaroli</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Pinocchio in emojitaliano</article-title>
          .
          <source>Apice Libri.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Di Sarli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Claudio</given-names>
            <surname>Gallicchio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alessio</given-names>
            <surname>Micheli</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>ITAmoji 2018: Emoji prediction via tree echo state networks</article-title>
          .
          In Tommaso Caselli, Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Giulia</given-names>
            <surname>Donato</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrizia</given-names>
            <surname>Paggio</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Investigating redundancy in emoji use: Study on a Twitter based corpus</article-title>
          .
          <source>In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</source>
          , pages
          <fpage>118</fpage>
          -
          <lpage>126</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Giulia</given-names>
            <surname>Donato</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrizia</given-names>
            <surname>Paggio</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Classifying the Informative Behaviour of Emoji in Microblogs</article-title>
          .
          <source>In Proc. of the 11th International Conference on Language Resources and Evaluation (LREC 2018)</source>
          , Miyazaki, Japan, May 7-12, 2018. ELRA.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Ben</given-names>
            <surname>Eisner</surname>
          </string-name>
          , Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, and
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Riedel</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>emoji2vec: Learning emoji representations from their description</article-title>
          .
          <source>arXiv preprint arXiv:1609.08359</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Anderson</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Fully convolutional networks for text classification</article-title>
          . In Tommaso Caselli, Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Nikola</given-names>
            <surname>Ljubešić</surname>
          </string-name>
          and
          <string-name>
            <given-names>Darja</given-names>
            <surname>Fišer</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A global analysis of emoji usage</article-title>
          .
          <source>In Proceedings of the 10th Web as Corpus Workshop</source>
          , pages
          <fpage>82</fpage>
          -
          <lpage>89</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Lucia</given-names>
            <surname>Siciliani</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniela</given-names>
            <surname>Girardi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The UNIBA system at the EVALITA 2018 Italian emoji prediction task</article-title>
          .
          In Tommaso Caselli, Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Mauro</given-names>
            <surname>Bennici</surname>
          </string-name>
          and
          <string-name>
            <given-names>Xileny</given-names>
            <surname>Seijas Portocarrero</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>The validity of word vectors over the time for the EVALITA 2018 emoji prediction task (ITAmoji)</article-title>
          .
          In Tommaso Caselli, Nicole Novielli, Viviana Patti, and Paolo Rosso, editors,
          <source>Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018)</source>
          , Turin, Italy. CEUR.org.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Hannah</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Thebault-Spieker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shuo</given-names>
            <surname>Chang</surname>
          </string-name>
          , Isaac Johnson, Loren Terveen, and
          <string-name>
            <given-names>Brent</given-names>
            <surname>Hecht</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>“Blissfully happy” or “ready to fight”: Varying interpretations of emoji</article-title>
          .
          <source>Proc. of ICWSM'16.</source>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Johanna</given-names>
            <surname>Monti</surname>
          </string-name>
          , Federico Sangati, Francesca Chiusaroli,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Benjamin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Sina</given-names>
            <surname>Mansour</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Emojitalianobot and emojiworldbot - new online tools and digital environments for translation into emoji</article-title>
          .
          <source>In Proc. CLiC-it 2016</source>
          , Napoli, Italy, December 5-7, 2016, volume
          <volume>1749</volume>
          of CEUR Workshop Proceedings.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Petra</given-names>
            <surname>Kralj Novak</surname>
          </string-name>
          , Jasmina Smailović,
          <string-name>
            <given-names>Borut</given-names>
            <surname>Sluban</surname>
          </string-name>
          , and Igor Mozetič.
          <year>2015</year>
          .
          <article-title>Sentiment of emojis</article-title>
          .
          <source>PLoS ONE</source>
          ,
          <volume>10</volume>
          (
          <issue>12</issue>
          ):
          <fpage>e0144296</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>