#andràtuttobene: Images, Texts, Emojis and Geodata in a Sentiment Analysis Pipeline

                        Pierluigi Vitale, Serena Pelosi, Mariacristina Falco
                        Department of Political and Communication Sciences
                                        University of Salerno
                           [pvitale,spelosi,mfalco]@unisa.it


Abstract

   This research investigates the sentiment that Instagram users expressed during the lockdown
period in Italy caused by the COVID-19 pandemic. The study is based on the analysis of all the
posts published on Instagram under the hashtag #andràtuttobene on May 4, May 18 and June 3,
2020. The analysis is carried out on a national, regional and provincial scale. We analyzed all
the different languages and forms (i.e. captions, hashtags, emojis and images) that constitute
the posts. The aim of this research is to provide a set of procedures revealing the different
polarity trends for each kind of expression and to propose a single comprehensive measure.

Introduction

   This paper investigates the most used Italian hashtag about the COVID-19 lockdown period on
Instagram: #andràtuttobene¹.
   The research team collected 7,482 posts, the entire amount published on three specific dates:
May 4, May 18 and June 3, corresponding to three different steps of the reopening phase of the
country, led by the government.
   Instagram posts are composed of several kinds of languages: captions (texts), hashtags, emojis
and images. The aim of this work is to design a set of procedures revealing the different
polarity trends for each one and to propose a single measure. This measure can show the
sentiment expressed by the texts, in the broad semiotic meaning of the term.
   The proposed methodology is based on a fully automatic natural language processing pipeline,
including the image analysis phase. Its output is an interactive dashboard (Figure 1) that makes
it possible to explore the sentiment analysis values for every single kind of text, and the
synthesis of all of them. Thanks to a system of interactions and filters, the observation is
guided by the images’ features, such as different kinds of spaces (indoor or outdoor) and
different kinds of photo subjects (human or not human).
   The collected geographical data enabled the analysis of several dimensions, with an overview
based on the regional scale. It also gave us the opportunity to focus on the deeper level of the
Italian provinces. This choice is motivated by the Italian DPCM (Decreto del Presidente del
Consiglio dei Ministri) published on 24 March 2020², which establishes the partial autonomy of
the regions.

1.   State of the Art

   In Natural Language Processing (NLP) studies, the automatic treatment of opinionated
expressions and documents is known as Sentiment Analysis.
   Lexical resources for sentiment analysis created for the Italian language are Sentix (Basile,
Nissim 2013); SentIta (Pelosi 2015a); the lexicon of the FICLIT+CS@UniBO System (Di Gennaro
2014); the CELI Sentiment Lexicon (Bolioli 2013); and the Distributional Polarity Lexicons
(Castellucci 2016).
   For the Italian language, significant contributions on sentiment analysis of social media come
from Bosco et al. (2013, 2014), Castellucci (2014, 2016) and Stranisci (2016), among others.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).

¹ The choice of Instagram is due to its success. It is in fact one of the most popular social
  networks, with 1 billion monthly active users, according to a study by Hootsuite and
  WeAreSocial.
² https://www.gazzettaufficiale.it/eli/id/2020/06/17/20G00071/sg


   Hashtag processing in Sentiment Analysis is particularly challenging in terms of word
segmentation: the absence of white spaces between words poses several ambiguity problems. Among
the most relevant contributions in this area, we cite Zangerle (2018), Reuter (2016), Simeon
(2016), Bansal (2015), Srinivasan (2012) and Celebi (2018). The solutions proposed in the
literature mostly rely on n-grams, syntactic complexity, pattern length, or POS tagging.
   In recent years, online communication has come to involve many kinds of languages, connected
to verbal and non-verbal features. This complexity makes classical textual analysis less
adequate to give a real and representative perspective on people’s interests and opinions. In
particular, the conventional approach does not seem suitable for visual social media, such as
Instagram, where all the languages are involved and the images seem to be dominant.
   The analysis of these social media tends to underline the issues of textocentricity (Singhal &
Rattine-Flaherty, 2006) and textocentrism (Balomenou & Garrod, 2019), making it necessary to
approach participant generated images (PGI), or user generated contents (UGC) in general, in a
different way.
   Opinions, emotions, and contents are expressed in a mixed way, that is, through the
combination of several languages, visual and textual, and the related metadata, such as the
geographical position and the hashtags with which they are labeled.
   The automatic treatment of emojis is addressed through two main approaches: the processing of
the textual descriptions of emojis³ (Fernández-Gavilanes 2018, Singh 2019) and the analysis of
emoji (co-)occurrences (Guibon 2018, Rakhmetullina 2018, Barbieri 2016, Novak et al., 2015)⁴.
   The function of emojis is not limited to a predictable labeling of the emotional content
(Felbo 2017, Tian 2017), but it is possible to improve the score of sentiment analysis tools by
knowing the meaning of emojis (LeCompte 2017, Felbo 2017, Guibon 2016, Novak 2015).
   The content analysis of images has been addressed from several perspectives and with several
techniques.
   Several studies are moving from a fully qualitative and manual approach (Tifentale &
Manovich, 2015, Vitale et al., 2019, Palazzo et al., 2020, Esposito et al., 2020) to mixed
methods involving algorithms and computer vision techniques combined with qualitative
observations (Hochman 2015, Indaco & Manovich, 2016).
   This work, starting from an innovative approach already tested in previous studies (Vitale et
al., 2020, Giordano et al., 2020), chooses to analyze the images in their textual translation,
with a fully automatic analytical pipeline designed from a semiotic point of view. Moreover, the
semiotic interest in digital media dates back to the early 2000s and continues to the present
day, treating digital media as a specific semiotic field (Cosenza 2014, Bianchi & Cosenza 2020).
   Lastly, considering design, the visual representation of social media data is increasingly
widespread as a vehicle for knowledge in several fields (Ciuccarelli et al., 2014).
   The research team does not provide an algorithm to analyze the images, but adopts the
automatic translation produced by the social media platform’s own algorithm, designed for
visually impaired users, by parsing the HTML code of the Instagram web interface. The metadata
involved is the “accessibility_caption”.
   These are lists of words, hierarchically distributed, that let us define and observe the
subject and attributes of the images, in addition to allowing the analysis of the entities.

³ Among the emoji resources, we mention: Emojipedia (emojipedia.org) and iEmoji
  (www.iemoji.com); the annotated resources, such as The Emoji Dictionary
  (emojidictionary.emojifoundation.com) and EmojiNet (emojinet.knoesis.org); the ones which are
  specific for Sentiment Analysis and Emotion Detection purposes, such as Emoji Sentiment Ranking
  (kt.ijs.si/data/Emoji_sentiment_ranking), EmoTag (abushoeb.github.io/emotag) and The
  SentiStrength emoticon sentiment lexicon (sentistrength.wlv.ac.uk); and the corpora, such as
  ITAMoji (Ronzano 2018) and the Emojiworldbot corpus (Monti 2016).
⁴ Actually, the original meaning of emojis, specified in their descriptions, could be very
  different from the ones attributed by people in specific text occurrences (Fernández-Gavilanes
  2018, Wood & Ruder 2016). Therefore, the manual annotation of emoji dictionaries could ignore
  important details that concern usage dynamics over time (Ahanin 2020, Felbo 2017).
  Nevertheless, the representation of an emoji can vary widely across different communication
  platforms (Wagner, 2020) and their semantics can present culture or language specific usage
  patterns (Barbieri 2016). Thus, the results produced by the analyses of emojis in large corpora
  could present some drawbacks as well.
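
   As a rough illustration of how the “accessibility_caption” metadata could be pulled out of a
page’s JSON source, the sketch below walks a decoded JSON document and collects every string
stored under that key. The nesting of the sample payload, the sample caption wording and the
helper name find_accessibility_captions are assumptions made for this example; the actual
structure of the Instagram page source is not described in this paper and may differ.

```python
import json
from typing import Any, Iterator


def find_accessibility_captions(node: Any) -> Iterator[str]:
    """Yield every string stored under an "accessibility_caption" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "accessibility_caption" and isinstance(value, str):
                yield value
            else:
                # Keep descending: captions may sit at any depth.
                yield from find_accessibility_captions(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_accessibility_captions(item)


# Hypothetical payload: the nesting and wording are invented for illustration.
sample_source = json.loads("""
{"post": {"media": [
    {"accessibility_caption": "Image may contain: 2 people, sky and outdoor"},
    {"accessibility_caption": null},
    {"accessibility_caption": "Image may contain: food and indoor"}
]}}
""")

captions = list(find_accessibility_captions(sample_source))
```

   Searching by key rather than by a fixed path keeps the sketch independent of the exact page
layout, at the cost of also picking up captions from any embedded media.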


2.   Methodology

   In this work, we propose the automatic treatment of the sentiment expressed in 7,482
Instagram posts.
   All the information composing the dataset (i.e. captions, hashtags, emojis and images) is
automatically put into relation with one another and visualized in an interactive dashboard. The
phenomenon can be observed through a system of filters, zooms and interdependent interactions.
The result captures the topography of feelings, moods and needs expressed on the Instagram
platform during the lockdown.
   The NLP activities are performed in this research through the software NooJ⁵, which allows
both the formalization of linguistic resources and the parsing of corpora. The dictionaries and
grammars, which have been built ad hoc for this work, complement the open-source resources of
the basic Italian and English modules of NooJ (Vietri 2014).
   All the pictures published on May 4, May 18 and June 3, 2020 with the hashtag #andràtuttobene
have been collected with a custom Python script that simulates human navigation. For each
picture, we collected the entire source code of the web page in JSON (JavaScript Object
Notation) format.
   This source has been parsed into a tabular format suitable for the adopted tools. The files
have been refined by selecting the fields useful for the analysis: captions (including
hashtags); image hyperlinks; accessibility captions; geographical coordinates; and timestamps.
   Some data required a refinement phase. For the captions, a cleaning phase was necessary, in
which all the texts that were not written in Italian have been detected automatically through
the Google Translate API (Application Programming Interface) and removed. Moreover, all the
hashtags have been extracted from this field, to allow their standalone analysis.
   Accessibility captions have been clustered on two dimensions, “human or not human” and
“indoor or outdoor”, each defined beforehand by a list of coherent words and then assigned in a
pattern matching phase⁶. Geographical coordinates place the images on a specific point on the
map, so a reverse geocoding procedure was necessary to find out the region and province levels⁷.
   Furthermore, timestamps have been converted into a conventional date and time format.
   After these steps, images and texts were ready to be analyzed through NLP procedures and
mapped with geographical visualization techniques, observing them over the desired timeframe.
   For the analysis of verbal features, we used SentIta, a semi-automatically built lexicon
(Pelosi 2015a) containing more than 15,000 lemmas, simple words and multiword units. Each entry
is annotated with polarity and intensity scores, on a scale that ranges from -3 to +3. It must
be applied to texts in conjunction with a network of almost 130 embedded local grammars,
formalized in the shape of Finite State Automata (Pelosi 2015b), which systematically modulate
the prior polarity of words according to their local syntactic context⁸. These resources can be
directly applied to the Instagram captions, while hashtags first need to be segmented. In this
phase, they are analytically decomposed into their constituents through 10 morpho-syntactic
grammars applied simultaneously, but with different priorities. In this way, the selection of
the most probable sequences is decided upstream⁹.
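
   To make the segmentation preference concrete (whole-string lookup first, then the
decomposition with the fewest and therefore longest constituents), the following sketch uses a
small dynamic programming search over a toy lexicon. It is not the NooJ grammar cascade used in
this work: the function name segment_hashtag and the toy lexicon are hypothetical, and
part-of-speech constraints are ignored.

```python
from typing import List, Optional, Set


def segment_hashtag(tag: str, lexicon: Set[str]) -> Optional[List[str]]:
    """Split a hashtag into lexicon words, preferring the fewest constituents."""
    text = tag.lstrip("#").lower()
    if text in lexicon:
        # Whole-string lookup has the highest priority.
        return [text]
    # best[i] holds the shortest segmentation found for text[:i], or None.
    best: List[Optional[List[str]]] = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            prefix = best[start]
            piece = text[start:end]
            if prefix is None or piece not in lexicon:
                continue
            candidate = prefix + [piece]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[len(text)]


# Toy lexicon, invented for illustration (SentIta itself is not reproduced here).
toy_lexicon = {"andrà", "tutto", "bene", "ben", "e", "tu"}

segments = segment_hashtag("#andràtuttobene", toy_lexicon)
```

   With this lexicon the final constituent is read as “bene” rather than “ben” + “e”, because
the search keeps the decomposition with the smallest number of pieces; a string with no
full decomposition yields None.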

⁵ http://www.nooj-association.org/
⁶ For instance, in the “human” cluster we have grouped all the accessibility captions
  containing words such as “people, man, woman, person” etc. At the same time, in the “outdoor”
  cluster we have grouped all the pictures with words such as “sea, skyline, lawn, beach” and so
  on.
⁷ This phase has been carried out automatically with the Python library reverse-geocoder
  (https://github.com/thampiman/reverse-geocoder).
⁸ Our method produced satisfactory results in the sentence-level analysis of the textual part of
  the corpus: 0.85 Recall, 0.96 Precision and 0.9 F-score.
⁹ Basically, if the system produces more than one interpretation, the preferred one is the one
  in which the constituents are longer and fewer in number. In other words, the system first
  compares the whole normalized string with the word forms from SentIta, then continues the
  comparison with English and Italian word forms from the basic module. Hence, the dictionaries
  receive the higher priority and are applied before morphological grammars. If the system does
  not match any word in the lexicon, it starts the structural analysis of the string, which
  consists of a systematic comparison of substrings with all the words contained in the
  dictionaries, according to part of speech specific syntactic structures. Such structures,
  ordered here by priority assignments, can be


   For the analysis of the non-verbal features, emojis are treated by using an electronic
dictionary, which has been semi-automatically annotated with the same information used to
analyze verbal features. We created this database with recognizable decimal codes in UTF-8
encoding from Emojipedia, then we carried out the automatic analysis of the textual descriptions
of each emoji.
   This dictionary has been used to locate and interpret the emojis occurring in the posts¹⁰.
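
   A minimal sketch of how a dictionary keyed by decimal codes can be used to locate and score
the emojis in a post is given below. The three entries and their polarity values are invented
for illustration and are not taken from the dictionary built for this work; the names
EMOJI_POLARITY and locate_emojis are likewise hypothetical. Emojis made of several code points
(for instance sequences joined by a zero-width joiner) are not handled.

```python
from typing import Dict, List, Tuple

# Hypothetical entries: decimal code point -> polarity on the -3..+3 scale.
EMOJI_POLARITY: Dict[int, int] = {
    128512: 2,   # U+1F600 grinning face
    128546: -2,  # U+1F622 crying face
    127752: 2,   # U+1F308 rainbow
}


def locate_emojis(text: str) -> List[Tuple[int, str, int]]:
    """Return (position, character, polarity) for each known emoji in text."""
    return [
        (index, char, EMOJI_POLARITY[ord(char)])
        for index, char in enumerate(text)
        if ord(char) in EMOJI_POLARITY
    ]


post = "Andrà tutto bene \U0001F308\U0001F600 ma mi manchi \U0001F622"
hits = locate_emojis(post)
scores = [polarity for _, _, polarity in hits]
```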
   After the clustering phase (human and not human; indoor and outdoor), all the sentiment
findings on all the languages can be associated with the pictures’ features, taken singly or in
combination.¹¹
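
   The clustering step itself can be sketched as whole-word matching against short word lists.
The lists below contain only the example words quoted in this paper for the “human” and
“outdoor” clusters; the function name cluster_caption, the sample captions, and the choice to
treat “not human” and “indoor” simply as the absence of a match are assumptions of this sketch.

```python
import re
from typing import Dict, Set

# Example words quoted in the paper; the full lists used in the study are longer.
HUMAN_WORDS: Set[str] = {"people", "man", "woman", "person"}
OUTDOOR_WORDS: Set[str] = {"sea", "skyline", "lawn", "beach"}


def cluster_caption(caption: str) -> Dict[str, bool]:
    """Flag a caption as human and/or outdoor when it contains a listed word."""
    # Tokenize into whole words so that "mango" does not match "man".
    tokens = set(re.findall(r"[a-z]+", caption.lower()))
    return {
        "human": bool(tokens & HUMAN_WORDS),
        "outdoor": bool(tokens & OUTDOOR_WORDS),
    }


# Hypothetical accessibility captions.
first = cluster_caption("Image may contain: 1 person, beach and sky")
second = cluster_caption("Image may contain: mango, table")
```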

3.       Visualization and Results

   For a complete observation of the analysis process and of its results, we developed a data
visualization dashboard. In this dashboard it is possible to observe the sentiment analysis on
each language processed, and to investigate the different trends across the days and across the
single hours of each day.
   By adopting the clusters detected in the images, a system of filters makes it possible to
focus the results on the subjects depicted.
   On the left side of the dashboard, a map shows the geographical situation, merging the 4
sentiment values into a single one (weighted average) and coloring the regional shape on a
chromatic scale from the minimum value (-3) in orange to the maximum value (+3) in blue. The
same scale is applied to the line chart on the right, in which each line is related to the
vertical axes and colored as described before.

\[
\frac{Sent_{Emojis} \cdot P_{Emojis} + Sent_{Hashtags} \cdot P_{Hashtag} + Sent_{Texts} \cdot P_{Texts}}
     {P_{Emojis} + P_{Hashtag} + P_{Texts}}
\]

   The scores reached by the three languages taken into account, namely texts, hashtags and
emojis, are weighted according to the assumption that the euphoric level of the emojis’
sentiment is higher than that of the hashtags, and both are higher than that of written texts in
general. According to these results, we propose this weighted average formula, in which emojis,
hashtags and texts have different weights (P), respectively 33, 50, and 100.

Figure 1 The sentiment analysis values

   As a matter of fact, Novak (2015) underlined that the use of positive emojis is more common
than the use of negative ones. Moreover, Boia et al. (2013) observed a poor correlation between
the perceived emotional polarity of emojis and that of the accompanying linguistic text alone.
Although it is actually challenging to predict the interaction between emojis and texts, there
are cases in which the emojis express or reinforce the sentiment of the text with which they
occur and cases in which they modify it or even express an opposite emotional state (Guibon
2018, Shoeb 2019).
   Hashtags are conventionally used in two ways: on the one hand, to describe the contents in a
list of words, and on the other hand for strategic purposes, in order to place the images in
useful thematic spaces. This is also the reason why we have removed from the analysis all the
Instagram-specific hashtags, such as #likeforlike, #followforfollow etc., which are not suitable
for our investigation or could even be misleading or biased. At the same time, hashtags are also
used as part of the messages, in substitution of words, so they deserve to be included in the
final measure, but not with the same relevance as the captions.

  multiword expressions; free nominal, prepositional, adjectival and adverbial phrases;
  elementary sentences; and verbless sentences.
¹⁰ While the oriented words located in captions and hashtags respectively cover 6% and 9% of the
  full words contained in the posts, the sentiment labelled emojis cover 19% of the total number
  of emojis in the corpus.
   The performance of emojis, hashtags and texts as indicators for sentiment analysis purposes,
alone and combined with one another, has been tested on our corpus. We verified a significant
improvement in terms of document-level precision when the indicators are considered together
(0.98), compared with the precision of texts (0.91), hashtags (0.81) and emojis (0.65)
considered alone. The different precisions reached by the three languages considered alone
empirically confirm the diversification of weights we proposed in our formula. This weighted
measure was then compared by three different judges¹² with the arithmetic mean on a sample of
100 Instagram posts from our corpus, and performed better in 92% of the cases.
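
   As a worked illustration of the weighted measure, the sketch below applies the weights
reported in this section (33 for emojis, 50 for hashtags, 100 for texts) to three per-language
scores on the -3 to +3 scale. The function name combined_sentiment and the sample scores are
hypothetical, and the sketch assumes that all three scores are available for a post; how posts
lacking one of the languages are treated is not specified here.

```python
from typing import Dict

# Weights P reported in the paper for each language.
WEIGHTS: Dict[str, float] = {"emojis": 33, "hashtags": 50, "texts": 100}


def combined_sentiment(emojis: float, hashtags: float, texts: float) -> float:
    """Weighted average of the three per-language sentiment scores."""
    scores = {"emojis": emojis, "hashtags": hashtags, "texts": texts}
    weighted_sum = sum(scores[name] * WEIGHTS[name] for name in WEIGHTS)
    return weighted_sum / sum(WEIGHTS.values())


# Hypothetical post: very positive emojis, mildly positive hashtags, negative text.
weighted = combined_sentiment(emojis=3, hashtags=1, texts=-1)  # 49 / 183
plain_mean = (3 + 1 - 1) / 3                                   # 1.0
```

   For this invented post the arithmetic mean is 1.0, while the weighted measure is 49/183
(about 0.27): the negative caption pulls the result down more strongly because texts carry the
largest weight.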
   Nevertheless, the geographical dimension is very important to observe the different kinds of
languages in the online community (Arnaboldi et al., 2017). Through an overlay function, moving
the cursor on the map (Figure 2), we show the geographical data at the deeper level of the
single province, focusing on each region. The result shows the possible differences in polarity
value between provinces. For instance, on May 4 in the provinces of Oristano (Sardegna), Genova
(Liguria) and Viterbo (Lazio), the sentiment value is negative, despite the positive average
value of the region. However, the average sentiment value over the three days analyzed is
always positive, with differences on the regional and provincial scale. Lastly, users can
explore the results focusing on one or more regions through a filter function (by clicking or
selecting). All the filters are interdependent, so it is possible to combine all the available
functions, investigating the phenomenon from all the possible perspectives.

Figure 2 Overlay function: provincial scale

¹² For the evaluation of the three judges, we have calculated the intercoder reliability
  adopting Krippendorff’s Alpha formula (kalpha). The three coders selected have a kalpha of 0.9.

Conclusion

   Through the quantitative and qualitative analysis of the different expressive forms used on
Instagram, this work proposes a general view of COVID-19 in Italy. The research brought together
linguistic analysis and design in a more general semiotic framework. The aim was, in fact, to
give shape to the pandemic phenomenon through a selection of linguistic relevance.
   The virus caused a series of unpredictable changes narrated on Instagram through the hashtag
#andràtuttobene: a mantra for the Igers and an isotopy for the analysts (Greimas & Courtés
1979). Working on multiple levels, the research has offered a general and a local view of the
emotions told during the lockdown period. Starting from a lexical base, made up of a list of
words, and using electronic dictionaries also for the images, the analysis organized a large
amount of data, developing a real map of the emotions and needs expressed during the first wave
of the pandemic. The map can be visualized through a dashboard that lets users observe general
and local reactions, down to the single province. The emotional effects of sense have been
evaluated thanks to a single polarity measure.


   In the end, did everything really go well for Instagram's Italy? In general, it seems so. The
average sentiment value over the three days analyzed is always positive, with variations on the
regional and provincial scale. Going down to the single province, we can find differences, as in
the Sardinia, Lazio and Liguria cases.

References

Agerri, R. and García-Serrano, A. (2010). Q-WordNet: Extracting polarity from WordNet senses. In
  Proceedings of the International Conference on Language Resources and Evaluation, pages
  2300–2305.
Ahanin, Z., & Ismail, M. A. (2020). Feature extraction based on fuzzy clustering and emoji
  embeddings for emotion classification. International Journal of Technology Management and
  Information System, 2(1), 102-112.
Arnaboldi, M., Brambilla, M., Cassottana, B., Ciuccarelli, P., & Vantini, S. (2017). Urbanscope:
  A lens to observe language mix in cities. American Behavioral Scientist, 61(7), 774-793.
Balomenou, N., & Garrod, B. (2019). Photographs in tourism research: Prejudice, power,
  performance and participant-generated images. Tourism Management, 70, 201-217.
Bansal, P., Bansal, R., & Varma, V. (2015). Towards deep semantic analysis of hashtags. In
  European conference on information retrieval (pp. 453-464). Springer, Cham.
Barbieri, F., Ronzano, F., & Saggion, H. (2016). What does this emoji mean? a vector space
  skip-gram model for twitter emojis. In Proceedings of the Tenth International Conference on
  Language Resources and Evaluation (LREC 2016); 2016 May 23-28; Portorož, Slovenia.
Basile, V., & Nissim, M. (2013). Sentiment analysis on Italian tweets. In Proceedings of the 4th
  Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (pp.
  100-107).
Bianchi, C., & Cosenza, G. (2020). Lexia. Rivista di semiotica, 33-34. Semiotica e Digital
  marketing, Roma, Aracne.
Bolioli, A., Salamino, F., & Porzionato, V. (2013). Social Media Monitoring in Real Life with
  Blogmeter Platform. ESSEM@ AI* IA, 1096, 156-163.
Bosco, C., Allisio, L., Mussa, V., Patti, V., Ruffo, G. F., Sanguinetti, M., & Sulis, E. (2014).
  Detecting happiness in Italian tweets: Towards an evaluation dataset for sentiment analysis in
  Felicitta. In 5th International Workshop on Emotion, Social Signals, Sentiment & Linked Open
  Data, Es³Lod 2014 (pp. 56-63). European Language Resources Association.
Bosco, C., Patti, V., & Bolioli, A. (2013). Developing corpora for sentiment analysis: The case
  of irony and senti-tut. IEEE intelligent systems, 28(2), 55-63.
Boia M., Faltings B., Musat C. C., & Pu P. (2013). A :) is worth a thousand words: How people
  attach sentiment to emoticons and words in tweets. In “2013 International Conference on Social
  Computing”, pp. 345-350. IEEE.
Castellucci, G., Croce, D., & Basili, R. (2016). A language independent method for generating
  large scale polarity lexicons. In Proceedings of the Tenth International Conference on
  Language Resources and Evaluation (LREC'16) (pp. 38-45).
Celebi, A., & Özgür, A. (2016). Segmenting hashtags using automatically created training data.
  In Proceedings of the Tenth International Conference on Language Resources and Evaluation
  (LREC'16) (pp. 2981-2985).
Celebi, A., & Özgür, A. (2018). Segmenting hashtags and analyzing their grammatical structure.
  Journal of the Association for Information Science and Technology, 69(5), 675-686.
Cosenza, G. (2014). Introduzione alla semiotica dei nuovi media, Laterza, Milano.
Di Gennaro, P., Rossi, A., & Tamburini, F. (2014). The FICLIT+ CS@ UniBO System at the EVALITA
  2014 Sentiment Polarity Classification Task. In Proceedings of the Fourth International
  Workshop EVALITA 2014.
Eisner, B., Rocktäschel, T., Augenstein, I., Bošnjak, M., & Riedel, S. (2016). emoji2vec:
  Learning emoji representations from their description. arXiv preprint arXiv:1609.08359.
Esposito, F., Falco, M., & Vitale, P. (2020). Experiencing Museums A Qualitative and
  Quantitative Description About Igers’ Narration of an Exhibit Space. In Workshops of the
  International Conference on Advanced Information Networking and Applications (pp. 1011-1018).
  Springer, Cham.
Felbo, B., Mislove, A., Søgaard, A., Rahwan, I., & Lehmann, S. (2017). Using millions of emoji
  occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm.
  arXiv preprint arXiv:1708.00524.
Fernández-Gavilanes, M., Juncal-Martínez, J., García-Méndez, S., Costa-Montenegro, E., &
  González-Castaño, F. J. (2018). Creating emoji lexica from unsupervised sentiment analysis of
  their descriptions. Expert Systems with Applications, 103, 74-91.
Giordano, G., Primerano, I., & Vitale, P. (2020). A Network-Based Indicator of Travelers
  Performativity on Instagram. Social Indicators Research, 1-19.
Greimas, A. J., & Courtés, J. (1979). Sémiotique: dictionnaire raisonné de la théorie du
  langage, Paris, Hachette.
Guibon, G., Ochs, M., & Bellot, P. (2018). From emoji usage to categorical emoji
  prediction. In 19th International Conference on Computational Linguistics and
  Intelligent Text Processing (CICLING 2018). Springer Lecture Notes in Computer Science,
  Switzerland.
Hochman, N. (2015). The social media image: Modes of visual ordering on social media
  (Doctoral dissertation, University of Pittsburgh).
Indaco, A., & Manovich, L. (2016). Urban social media inequality: definition,
  measurements, and application. arXiv preprint arXiv:1607.01845.
LeCompte, T., & Chen, J. (2017). Sentiment analysis of tweets including emoji data. In
  2017 International Conference on Computational Science and Computational Intelligence
  (CSCI) (pp. 793-798). IEEE.
Monti, J., Sangati, F., Chiusaroli, F., Benjamin, M., & Mansour, S. (2016).
  Emojitalianobot and emojiworldbot - new online tools and digital environments for
  translation into emoji. In Proc. CLiC-it 2016, volume 1749 of CEUR Workshop
  Proceedings.
Novak, P. K., Smailović, J., Sluban, B., & Mozetič, I. (2015). Sentiment of emojis. PloS
  one, 10(12).
Palazzo, M., Vollero, A., Vitale, P., & Siano, A. Urban and rural destinations on
  Instagram: Exploring the influencers’ role in #sustainabletourism. Land Use Policy,
  100, 104915.
Pelosi, S. (2015a). SentIta and Doxa: Italian Databases and Tools for Sentiment Analysis
  Purposes. The second Italian Conference on Computational Linguistics (CLiC-it 2015).
  Trento, December 3-4 2015. Book of Proceedings. Accademia University Press srl, Torino.
Pelosi, S. (2015b). A Lexicon-based Approach to Sentiment Analysis: The Italian Module
  for Nooj. Formalising Natural Languages with Nooj 2014, 37.
Rathnayake, C., & Ntalla, I. (2020). “Visual Affluence” in Social Photography:
  Applicability of Image Segmentation as a Visually Oriented Approach to Study Instagram
  Hashtags. Social Media + Society, 6(2), 2056305120924758.
Reuter, J., Pereira-Martins, J., & Kalita, J. (2016). Segmenting twitter hashtags. Intl.
  J. on Natural Lang. Computing, 5(4).
Ronzano, F., Barbieri, F., Wahyu Pamungkas, E., Patti, V., & Chiusaroli, F. (2018).
  Overview of the EVALITA 2018 Italian emoji prediction (ITAmoji) task. In 6th Evaluation
  Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop,
  EVALITA 2018 (Vol. 2263, pp. 1-9). CEUR-WS.
Simeon, C., Hamilton, H. J., & Hilderman, R. J. (2016). Word segmentation algorithms with
  lexical resources for hashtag classification. In 2016 IEEE International Conference on
  Data Science and Advanced Analytics (DSAA) (pp. 743-751). IEEE.
Singh, A., Blanco, E., & Jin, W. (2019). Incorporating emoji descriptions improves tweet
  classification. In Proceedings of the 2019 Conference of the North American Chapter of
  the Association for Computational Linguistics: Human Language Technologies, Volume 1
  (Long and Short Papers) (pp. 2096-2101).
Singhal, A., & Rattine-Flaherty, E. (2006). Pencils and photos as tools of communicative
  research and praxis: Analyzing Minga Perú’s quest for social justice in the Amazon.
  International Communication Gazette, 68(4), 313-330.
Srinivasan, S., Bhattacharya, S., & Chakraborty, R. (2012). Segmenting web-domains and
  hashtags using length specific models. In Proceedings of the 21st ACM international
  conference on Information and knowledge management (pp. 1113-1122).
Stranisci, M., Bosco, C., Patti, V., & Hernandez Farias, D. I. (2015). Analyzing and
  annotating for sentiment analysis the socio-political debate on #labuonascuola. In
  second Italian Conference on Computational Linguistics (pp. 274-279). Accademia
  University Press.
Tian, Y., Galery, T., Dulcinati, G., Molimpakis, E., & Sun, C. (2017). Facebook
  sentiment: Reactions and emojis. In Proceedings of the Fifth International
Tifentale, A., & Manovich, L. (2015). Selfiecity: Exploring photography and
  self-fashioning in social media. In Postdigital aesthetics (pp. 109-122). Palgrave
  Macmillan, London.
Vitale, P., Mancuso, A., & Falco, M. (2019). Museums’ tales: visualizing instagram users’
  experience. In International Conference on P2P, Parallel, Grid, Cloud and Internet
  Computing (pp. 234-245). Springer, Cham.
Vitale, P., Palazzo, M., Vollero, A., Siano, A., & Foroudi, P. (2020). The Role of Igers
  in the Territorial Dynamics of Sustainable Tourism-Oriented Destinations. In
  International Symposium: New Metropolitan Perspectives (pp. 759-767). Springer, Cham.