ItaGLAM: A corpus of Cultural Communication on Twitter during the Pandemic

Gennaro Nolano (gnolano@unior.it), Carola Carlino (ccarlino@unior.it), Maria Pia di Buono (mpdibuono@unior.it), Johanna Monti (jmonti@unior.it)

UniOr NLP Research Group, "L'Orientale" University of Naples, Italy


This paper describes the compilation and annotation of ItaGLAM, a corpus of tweets written by Italian Galleries, Libraries, Archives and Museums (GLAMs) during the lockdown period in Italy due to the COVID-19 pandemic. ItaGLAM has been annotated with a set of labels which may be useful to identify different types of communication. Furthermore, the collected data have been used to train a set of classifiers. The results are analyzed to evaluate the information flow between GLAMs and users and to analyze cultural communication on the Web.

Introduction

In recent years, Social Networks have become one of the most popular platforms for sharing experiences and opinions through simple strings of text (Zhao and Rosson, 2009). Indeed, this way of communicating has become an essential interaction tool, not only among private users, but also for companies that want to engage with their audience and promote their brands (Alturas and Oliveira, 2016).

Social networks have also been adopted by museums, which, over time, have changed their way of communicating with their audience[2]. In particular, with regard to the GLAM sector, a new trend has been observed in recent years: the use of the web as a way to create and foster an online community (Langa, 2014; Allen-Greil and MacArthur, 2010).

While during the first decade of this century museum professionals considered the exhibition of collections on social networks (Laws, 2015) as 'excessive', nowadays the use of these platforms has become the norm. As Amanatidis et al. (2020) pointed out in their study about the use of social networks (and in particular Instagram) by museums in the Greek culture scene: 'social media has become a key factor in the way that cultural organizations communicate with their public in supporting the marketing of performing art organizations'.

Such centrality makes Social Networks a potentially effective means for GLAMs to reach a wide and heterogeneous audience and to adapt to it. Therefore, we believe that analyzing cultural communication implies analyzing how cultural corporations interact with their audience through social networks. After considering the most used social networks in the cultural sector (namely Facebook, Instagram and Twitter), we have decided to focus our research on the use of Twitter, which has already been proven to be a solid basis for analyzing institutional communication, as Preoţiuc-Pietro et al. (2015) have highlighted.

Therefore, the main aim of our research is to investigate how Italian GLAMs have extraordinarily (Giraud, 2020) interacted with their audience during the lockdown in Italy due to the COVID-19 pandemic (NEMO - Network of European Museum Organisations, 2020), i.e. in the period from the 8th of March to the 5th of May 2020 (as per DPCM March 11 2020).

During this time, many cultural initiatives were launched with the aim of strengthening the dialogue with the audience and making sure that, despite the impossibility of any kind of physical access, the connection between GLAMs and their visitors would not be interrupted[3].

It has been observed that during the aforementioned period, while GLAM institutions drastically increased their use of Facebook, Instagram and Twitter, the last was the only Social Network for which an increase in audience-institution interaction was registered (Politecnico di Milano, 2020).

The study of communicative intents by GLAMs through Social Networks in the Italian language is still novel and, as such, best practices and tools to use still need to be tested and honed. In particular, there is still the need for an annotated corpus and a classifier that can be used on large amounts of data.

Although the time frame taken into account is relatively short (covering 58 days in total), we think that investigating how Italian GLAMs used the web when it was the only form of communication at their disposal represents a good training ground to test our practices and to train and evaluate different kinds of classifiers that will also be useful in future work.

The paper is organized as follows: Section 2 describes the related works in the analysis of communication on Twitter by cultural institutions. Section 3 introduces the methodology used in this analysis: namely, it describes the creation of the corpus, the creation and use of the annotation set, and the training and evaluation of different classifiers. Finally, in Section 4 we explain the results of the research.

Related Work

The large amount of data available on Twitter makes this platform ideal for several kinds of study. Over the years, tweets have been used in research projects on disaster response (Zahra et al., 2019), content classification (Dann, 2010; Stvilia and Gibradze, 2014) and, in particular, sentiment analysis (O'Connor et al., 2010; Gamallo and Garcia, 2014; Talbot et al., 2015). Despite these efforts, only a few studies have focused on the classification of communicative intents of organizations and institutions on Twitter, like Lovejoy and Saxton (2012) and Foucault and Courtin (2016), who focused on French tweets written during the MuseumWeek event. Similar kinds of study can be found in research dealing with Italian tweets, with several contributions on sentiment analysis (Basile and Nissim, 2013; Cimino et al., 2014) and automatic misogyny identification (Anzovino et al., 2018). To the best of our knowledge, no work has been done so far on communicative intent classification for Italian tweets.

Methodology

The task of tweet classification has turned out to be rather challenging for various reasons, many inherent to the platform itself. First and foremost, tweets are very short texts (with a maximum length of 280 characters); in our corpus the average token count is 16.80. Secondly, it is not unusual to find tweets composed only of hashtags or URLs. While URLs by themselves are rarely if ever useful in a classification task, hashtags can be a source of information, but only if they are used according to their original communicative intent or to the initiative to which they are related. In the following subsections we describe how:

• the corpus was created;

• the annotation set has been chosen and then applied;

• the classifiers have been trained and tested.

Dataset

Because of the COVID-19 outbreak, the Italian Government (like many others around the world) imposed a lockdown policy, which lasted from the 8th of March to the 5th of May 2020 (58 days in total, as per DPCM March 11 2020). During this period, museums and art galleries adopted several strategies to keep engaging with their audience, to keep communication alive, and to grant access to digital cultural heritage media. As already mentioned in Section 1, they increased the scope of their communication on the main social platforms, i.e. Facebook, Twitter and Instagram. In this context, the focus of our analysis is the use of Twitter.

The communication on Twitter is characterised by the use of certain hashtags, which GLAMs have used to propose several types of initiatives to their audience. Initially, our set was made up of 33 hashtags promoted and used by Italian GLAMs and by Italy's Ministry of Cultural Heritage and Activities (Italian: Ministero per i Beni e le Attività Culturali e per il Turismo - MiBACT), selected on the basis of their popularity according to Twitter's trending topics (TT)[4]. Among these hashtags, #museitaliani (and its graphic variant #museiitaliani) is the only one that already existed before the pandemic and was subsequently adapted by museums for the initiatives proposed during the pandemic, while others, such as #artyouready and #emptymuseum, were created ad hoc during the lockdown period to describe specific initiatives.

By using these hashtags as a query to the public Twitter API[5], we created a corpus with a total of 23,716 tweets. To better focus on the tweets and their intents concerning cultural communication, we filtered out of the corpus any hashtag with fewer than 1,000 occurrences. We thus obtained a set of six hashtags (#artyouready, #emptymuseum, #museitaliani, #museichiusimuseiaperti, #laculturanonsiferma, #laculturaincasa) and a corpus of 15,988 tweets. This corpus was filtered once again so that only unique tweets (i.e. no retweets) written in Italian were kept. Using a list of GLAMs manually extracted from the corpus, we then extracted from the remaining 8,038 tweets those written by a GLAM institution, thus ending up with our final corpus of 3,429 tweets published by 213 Italian cultural institutions. Table 1 shows the occurrences of the hashtags in the final corpus.
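The filtering steps described in this section can be sketched as follows. This is only an illustration on toy data: the record layout (`user`, `lang`, `is_retweet`, `hashtags`), the account names and the threshold of 2 are assumptions made for the example, not the schema or the values used to build ItaGLAM (where the threshold was 1,000 occurrences).

```python
from collections import Counter

# Toy records with a hypothetical schema (the paper does not describe one).
tweets = [
    {"id": 1, "user": "museo_a", "lang": "it", "is_retweet": False,
     "hashtags": ["#museitaliani", "#artyouready"]},
    {"id": 2, "user": "fan_1", "lang": "it", "is_retweet": True,
     "hashtags": ["#museitaliani"]},
    {"id": 3, "user": "museo_b", "lang": "en", "is_retweet": False,
     "hashtags": ["#emptymuseum"]},
    {"id": 4, "user": "fan_2", "lang": "it", "is_retweet": False,
     "hashtags": ["#museitaliani", "#raretag"]},
    {"id": 5, "user": "museo_b", "lang": "it", "is_retweet": False,
     "hashtags": ["#emptymuseum"]},
    {"id": 6, "user": "museo_a", "lang": "it", "is_retweet": False,
     "hashtags": ["#unpopular"]},
]
# Stands in for the manually compiled list of GLAM accounts.
GLAM_ACCOUNTS = {"museo_a", "museo_b"}


def frequent_hashtags(records, min_count):
    """Return the hashtags that occur at least `min_count` times."""
    counts = Counter(tag.lower() for rec in records for tag in rec["hashtags"])
    return {tag for tag, n in counts.items() if n >= min_count}


def filter_corpus(records, min_count, glam_accounts):
    """Apply the three filtering steps and return each intermediate corpus."""
    keep = frequent_hashtags(records, min_count)
    # Step 1: keep tweets carrying at least one sufficiently frequent hashtag.
    by_hashtag = [r for r in records
                  if keep & {t.lower() for t in r["hashtags"]}]
    # Step 2: keep only original (non-retweet) tweets written in Italian.
    italian_originals = [r for r in by_hashtag
                         if not r["is_retweet"] and r["lang"] == "it"]
    # Step 3: keep only tweets published by a known GLAM account.
    by_glam = [r for r in italian_originals if r["user"] in glam_accounts]
    return by_hashtag, italian_originals, by_glam


by_hashtag, italian_originals, final_corpus = filter_corpus(
    tweets, min_count=2, glam_accounts=GLAM_ACCOUNTS)
print(len(by_hashtag), len(italian_originals), len(final_corpus))
```

Returning every intermediate corpus makes it easy to report the corpus size after each step, as done above for the 15,988, 8,038 and 3,429 tweets.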

Annotation Process

In order to define the intents of GLAMs towards the users, the corpus has been annotated with four communication categories first presented by Courtin et al. (2015), and then used by Foucault and Courtin (2016) and Juanals and Minel (2018) to annotate the information flow on a social network during a cultural event. The annotation has been done at tweet level, using the following set of labels: Sharing Experience (SE), Promoting Participation (PP), Interacting with the Community (ItC) and Promoting-Informing (PI).

A fifth category, N/A, has been included in order to classify tweets that do not fit in any of the aforementioned categories, like the ones composed of only hashtags. Following this set of categories and our guidelines, the tweets have been annotated using the open source platform INCEpTION[6]. A first round of annotation has been carried out on 400 tweets, double annotated by a domain expert and a non-expert in order to calculate the Inter-Annotator Agreement (IAA). The use of a non-expert was necessary so that the annotation would not be influenced by any external knowledge (for example, the original meaning behind the various hashtags). The resulting Fleiss' Kappa proved to be moderately good at 0.629, which is considered sufficient for the task at hand. As can be seen from the confusion matrix in Figure 1, the agreement is very strong on PI and ItC, moderately strong on SE, and very weak on PP.

Figure 1: Confusion matrix for the agreement on every label.

Furthermore, 89 tweets have been deemed unusable as they have been tagged with the label N/A; therefore, they have been removed from the corpus. Table 2 presents the number of occurrences for each label for the remaining 3,340 tweets. These results show an issue regarding the label PP, which is severely underrepresented in the corpus. The effects of this underrepresentation on our classifiers will be explained in detail in Section 4, and the analysis of possible solutions will be the focus of future work.
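As an illustration of how an agreement score of this kind is obtained, the following sketch computes Fleiss' Kappa and the annotator-by-annotator confusion counts for two annotators. The labels in `expert` and `non_expert` are invented for the example and are not taken from the ItaGLAM annotation; the function is a plain implementation of the standard formula, not the tool used in the paper.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' Kappa for a list of items, each rated by the same number of raters.

    `ratings` is a list of sequences; ratings[i] holds the labels that the
    raters assigned to item i.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1))

    observed = agreement_sum / n_items
    # Chance agreement from the overall label proportions.
    expected = sum((total / (n_items * n_raters)) ** 2
                   for total in category_totals.values())
    return (observed - expected) / (1 - expected)


# Invented annotations for eight tweets (illustration only).
expert = ["PI", "PI", "ItC", "SE", "PP", "PI", "SE", "ItC"]
non_expert = ["PI", "PI", "ItC", "SE", "PI", "PI", "ItC", "ItC"]

kappa = fleiss_kappa(list(zip(expert, non_expert)))
# Pairs (expert label, non-expert label): the cells of a confusion matrix.
confusion = Counter(zip(expert, non_expert))
print(round(kappa, 3))
print(confusion[("PP", "PI")])
```

The off-diagonal cells of `confusion` show which labels the annotators confuse with each other, which is the kind of information summarized in Figure 1.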

Intent classification

In order to train the classifiers, the corpus has been preprocessed so that all tweets are lowercase and all punctuation marks, URLs and numbers are removed.

The results show that the methodology adopted in this work can be useful for better understanding how cultural institutions communicate on the Web. The tools used in this specific task are adequate for annotating and automatically classifying the way cultural institutions communicate on the Twitter platform. That being said, the results shown in Section 3 demonstrate that our experiments can still be improved.

Firstly, increasing the size of the dataset would surely enhance the performance of the classifiers. In particular, this should be done focusing on the label PP, which, as can be observed in Table 4, is the least frequent among the four. Furthermore, while the precision for the label PP is usually higher than the average (note how it reaches 1.00 in our baseline), its recall is very low, even for our SVM classifier, which shows the best results overall. The intuition here is that, while the classifiers are usually right when they assign the PP label, they are also very "picky", and cannot really learn all the features needed to distinguish this label from the others. Other possible solutions to this issue include techniques such as resampling and cost-based methods.

Secondly, by focusing on the textual features of the tweets, we can further investigate where improvements can be made. In particular, looking at the top 5 tf-idf scores for each label (Table 5), we notice that the selected hashtags may occur in all types of tweets with little difference among their scores. Such a low deviation does not contribute enough to the classification process. These data could give us some insight into how museums communicate through the Twitter platform. Indeed, GLAMs usually tend to use the same hashtags regardless of their communicative intents (even when the hashtag used was initially linked to certain initiatives), which was already expected for some general hashtags, like #iorestoacasa.

The effects of possible removal or reweighting of these hashtags need to be further explored.
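The resampling and cost-based methods mentioned above as possible remedies for the underrepresented PP label could look like the sketch below. It is not part of the experiments reported here: the label counts are invented, and the weighting formula is the common "balanced" heuristic (number of samples divided by number of classes times the class count), chosen here as an assumption.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Cost-based option: weight each label inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Resampling option: duplicate random minority items until all labels
    have as many items as the most frequent label."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Invented, deliberately imbalanced label distribution (illustration only).
labels = ["PI"] * 5 + ["ItC"] * 3 + ["SE"] * 3 + ["PP"] * 1
texts = [f"tweet {i}" for i in range(len(labels))]

weights = balanced_class_weights(labels)
resampled_texts, resampled_labels = random_oversample(texts, labels)
print(weights["PP"], Counter(resampled_labels)["PP"])
```

Either option should be applied to the training split only, so that the test set keeps the original label distribution.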

Conclusion and Future Work

In this work, we have described our project for classifying communicative intents in tweets written by Italian GLAMs during the COVID-19 lockdown. Through the experiments and the following analysis we have shown how challenging this task can be. As future work we will focus on: increasing the size of the corpus, integrating statistical techniques to help deal with imbalanced labels, and finally improving the selection and reweighting of the features (in particular concerning the hashtags). Another topic which needs further investigation concerns the use of different kinds of textual embeddings, which might improve the results. Once honed, the methodology and the tools we have used in this research could become an important asset in better understanding and analyzing cultural communication on the Web.

Table 1: Number of occurrences for each hashtag.

Hashtag                    # Occ.
#artyouready                  367
#emptymuseum                  373
#museitaliani                 906
#museichiusimuseiaperti     1,560
#laculturanonsiferma          668
#laculturaincasa              283
Total                       4,157

• Sharing Experience - SE: tweets that share an experience, an opinion or one's feeling.
Example: Eccoci qui oggi a ricordare e a raccontare come i musei chiusi non siano chiusi e i musei vuoti non siano vuoti. Forza! (Here we are today, reminding and telling how closed museums are not actually closed and empty museums are not actually empty. Come on!);

• Promoting Participation - PP: tweets that require some kind of activity from the users, either in real life or on-line.
Example: Art you ready? Domani partecipa anche tu al contest di @museitaliani condividendo con noi le tue foto dei musei privi di persone. Cerca fra i ricordi, seleziona la foto, e condividi con #artyouready #MuseumFromHome #iorestoacasa. Ti aspettiamo! (Art you ready? Take part in tomorrow's @museitaliani contest by sharing with us your photos of empty museums. Search through your memories, choose the photo and share it with #artyouready #MuseumFromHome #iorestoacasa. We are waiting for you!);

• Interacting with the Community - ItC: tweets through which Institutions create and foster their communities by directly interacting with the users.
Example: Siete stati davvero tanti ad accogliere l'invito a partecipare al flashmob #artyouready e tutti avete postato foto meravigliose! Ecco i tre scatti selezionati tra i più belli (So many of you accepted to take part in the #artyouready flashmob, and you all posted great photos! Here are the three shots selected among the most beautiful ones);

• Promoting-Informing - PI: tweets that promote or inform other users about activities, exhibitions, or about any sort of information on the museum.
Example: Il castello di Fénis si trova in Valle d'Aosta circondato da una doppia cinta di mura merlate è caratterizzato da torri quadrate e cilindriche con feritoie e caditoie. (Fénis Castle is located in Aosta Valley, with its double crenellated surrounding walls, it is characterized by square and cylindrical towers with loopholes and storm drains).

Table 2: Number of occurrences and percentage in the corpus for each label.

Stopwords[7] have also been removed. The cleaning process has been done via the NLTK package for Python[8], which has also been used for tokenization.

The experiments have been conducted on six classifiers: five more traditional classifiers trained on a TF-IDF vectorized text (created using the machine learning library for Python Scikit-learn[9]), and a Feed Forward Neural Network[10] created with Keras[11] and trained on a 100-dimension GloVe[12] embedded text.

The set of classifiers is thus the following: a Naive Bayes (NB, also used as baseline); a Support Vector Classifier (SVC); a K-Nearest Neighbors classifier (KNN); a Decision Tree (DT); a Multilayer Perceptron (MLP); and a Neural Network classifier (NN).

The dataset was split using the train_test_split tool found in the sklearn library for Python, which splits the data into random train and test subsets given a test set size. With test_size set at 0.3, the training set is composed of 2,338 tweets, and the testing set is composed of the remaining 1,002 tweets.

In order to evaluate the classification task, the values of precision, recall and F1 have all been weighted by the number of samples of each label. The final results are shown in Table 3.

Table 3: System results.

Classifier  P     R     F1
NB          0.69  0.66  0.64
SVC         0.70  0.68  0.67
KNN         0.70  0.39  0.35
DT          0.56  0.55  0.55
MLP         0.66  0.66  0.66
NN          0.64  0.63  0.63
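The preprocessing, the 70/30 split and a Naive Bayes baseline of the kind described above can be sketched without external dependencies as follows. This is an illustration, not the code used in the experiments: the stopword list is a tiny stand-in for NLTK's Italian list, keeping the `#` character is an assumption (suggested by the hashtags listed as tokens in Table 5), the split is a plain shuffle rather than scikit-learn's train_test_split, the classifier is a from-scratch multinomial Naive Bayes on raw counts instead of TF-IDF features, and the four training tweets are invented.

```python
import math
import random
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (assumption); the paper uses NLTK's
# default Italian list with the term 'twitter' added.
STOPWORDS = {"il", "la", "i", "le", "di", "a", "e", "con", "per", "un",
             "una", "twitter"}


def preprocess(text):
    """Lowercase, strip URLs, numbers and punctuation, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"\d+", " ", text)           # numbers
    text = re.sub(r"[^\w\s#]", " ", text)      # punctuation, keeping '#'
    return [tok for tok in text.split() if tok not in STOPWORDS]


def split_train_test(items, test_size=0.3, seed=42):
    """Shuffle and split into (train, test) given a test set proportion."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


def train_naive_bayes(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    label_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, token_counts, vocab


def predict_naive_bayes(model, tokens):
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, token_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # unseen tokens are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training tweets (illustration only).
texts = [
    "Partecipa al contest e condividi la tua foto #artyouready",
    "Condividi con noi la tua foto del museo",
    "La mostra apre online: scopri la collezione",
    "Scopri la storia del castello e la collezione",
]
labels = ["PP", "PP", "PI", "PI"]

model = train_naive_bayes([preprocess(t) for t in texts], labels)
print(predict_naive_bayes(model, preprocess("Condividi la tua foto!")))

# With the 3,340 annotated tweets, a 0.3 test proportion gives the
# 2,338 / 1,002 split reported above.
train_ids, test_ids = split_train_test(range(3340))
print(len(train_ids), len(test_ids))
```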

Table 4: Precision (P) and Recall (R) for each label.

Classifier      SE    PP    ItC   PI
NB          P   0.66  1.00  0.58  0.61
            R   0.37  0.02  0.74  0.76
SVC         P   0.66  0.88  0.72  0.60
            R   0.48  0.52  0.65  0.81
KNN         P   0.36  0.69  0.74  0.68
            R   0.79  0.32  0.43  0.39
DT          P   0.47  0.43  0.51  0.50
            R   0.51  0.50  0.53  0.48
MLP         P   0.59  0.71  0.67  0.68
            R   0.62  0.54  0.68  0.67
NN          P   0.61  0.77  0.60  0.71
            R   0.64  0.47  0.71  0.61
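For readers who want to reproduce figures of this kind, the sketch below computes per-label precision and recall and their support-weighted averages. The gold and predicted labels are invented so that they reproduce the pattern discussed for PP (high precision, low recall); they are not the system outputs behind Table 4.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Precision, recall, F1 and support for every label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[gold] += 1  # gold label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        support = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / support if support else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": support}
    return scores


def weighted_average(scores, metric):
    """Average a metric over labels, weighted by the number of gold samples."""
    total = sum(s["support"] for s in scores.values())
    return sum(s[metric] * s["support"] for s in scores.values()) / total


# Invented gold and predicted labels (illustration only).
y_true = ["PP"] * 4 + ["PI"] * 6
y_pred = ["PP", "PI", "PI", "PI"] + ["PI"] * 6

scores = per_label_scores(y_true, y_pred)
print(scores["PP"]["precision"], scores["PP"]["recall"])
print(round(weighted_average(scores, "precision"), 2))
```

In this toy case the classifier is always right when it predicts PP but finds only one of the four PP tweets, which is the "picky" behaviour described in the discussion of the results.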

Table 5: Top 5 words by tf-idf score for each label.

Token                     SE    PP    ItC   PI
#museichiusimuseiaperti   1.55  2.47  2.12  1.99
#iorestoacasa             1.66  1.92  1.51  1.71
#museitaliani             2.43  2.35  2.05  2.33
#laculturanonsiferma      2.88  2.47  2.83  2.35
#emptymuseum              3.02  1.97  3.29  3.4
#artyouready              2.9   1.71  3.04  3.23
#laculturaincasa          4.1   -     3.47  2.96
flashmob                  -     2.1   -     -
mibact                    3.34  2.4   2.7   3.09
oggi                      3.03  -     -     2.78
youtube                   -     -     3.16  -
cultura                   4.18  -     3.12  4.18

As the values for #museichiusimuseiaperti show, this hashtag is seemingly strong enough as a feature to differentiate PP from the other labels, but it does not do a good job of differentiating the other labels from each other.
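One possible way to rank tokens per label is sketched below. The paper does not specify how the per-label scores in Table 5 were derived, so everything here is an assumption made for illustration: the score is the mean tf-idf weight of a token over the tweets of a label, the idf uses the smoothed form ln((1 + n) / (1 + df)) + 1, and the tokenized tweets are invented.

```python
import math
from collections import Counter, defaultdict


def idf_table(docs):
    """Smoothed inverse document frequency for every token."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1
            for tok, df in doc_freq.items()}


def mean_tfidf_by_label(docs, labels):
    """Mean tf-idf weight of each token over the documents of each label."""
    idf = idf_table(docs)
    sums = defaultdict(Counter)
    docs_per_label = Counter(labels)
    for doc, label in zip(docs, labels):
        for tok, tf in Counter(doc).items():
            sums[label][tok] += tf * idf[tok]
    return {label: {tok: total / docs_per_label[label]
                    for tok, total in toks.items()}
            for label, toks in sums.items()}


def top_tokens(scores, k=5):
    """The k highest-scoring tokens of each label (ties broken alphabetically)."""
    return {label: sorted(toks, key=lambda t: (-toks[t], t))[:k]
            for label, toks in scores.items()}


# Invented, already tokenized tweets (illustration only).
docs = [
    ["#museitaliani", "partecipa", "contest"],
    ["#museitaliani", "condividi", "foto"],
    ["#museitaliani", "mostra", "online"],
    ["#museitaliani", "collezione", "online"],
]
labels = ["PP", "PP", "PI", "PI"]

scores = mean_tfidf_by_label(docs, labels)
print(top_tokens(scores, k=2))
# A hashtag used in every tweet gets the same score under every label,
# so it cannot help to tell the labels apart.
print(scores["PP"]["#museitaliani"], scores["PI"]["#museitaliani"])
```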

[3] https://icom.museum/en/news/how-to-reach-and-engage-your-public-remotely/

[4] This process is described in detail in Carlino et al. (2020).

[5] https://developer.twitter.com/en

[6] https://inception-project.github.io/

[7] The list of stopwords used is the default one for Italian found in the NLTK package. Furthermore, the term 'Twitter' has been added to this list after the first experiments.

[8] https://www.nltk.org/

[9] https://scikit-learn.org/stable/

[10] Parameters: 4 layers, dropout=0.7, Adam Optimizer.

[11] https://keras.io/

[12] https://nlp.stanford.edu/projects/glove/

Acknowledgments

This work has been partially supported by Programma Operativo Nazionale Ricerca e Innovazione 2014-2020 -Fondo Sociale Europeo, Azione I.2 "Attrazione e Mobilità Internazionale dei Ricercatori" Avviso D.D. n 407 del 27/02/2018 and by PON Ricerca e Innovazione 2014-2020 "Dottorati innovativi con caratterizzazione industriale". Authorship Attribution is as follows: Gennaro Nolano is author of Section 3.3 and 4, Carola Carlino is author of Section 3, 3.1 and 5, Maria Pia di Buono is author of Section 2 and 3.2, and Johanna Monti is author of Section 1.

[2] https://www.osservatori.net/it/ricerche/comunicati-stampa/laumento-del-livello-di-interesse-per-le-attivita-online-dei-musei-incentivato-dal-covid-19-e-gli-investimenti-per-migliorare-i-servizi-offerti

References

Dana Allen-Greil and Matthew MacArthur. 2010. Small towns and big cities: How museums foster community on-line. In Museums and the Web 2010: Proceedings. Toronto: Archives & Museum Informatics. March 31, 2010.

Bráulio Alturas and Liliana Oliveira. 2016. Consumers using social media: impact on companies' reputation. In Proceedings of the Academy of Marketing Conference 2016: Radical Marketing. Academy of Marketing.

Dimitrios Amanatidis, Ifigeneia Mylona, Spyridon Mamalis, and Irene (Eirini) Kamenidou. 2020. Social media for cultural communication: A critical investigation of museums' instagram practices. Journal of Tourism, Heritage & Services Marketing, 6(2).

Maria Anzovino, Elisabetta Fersini, and Paolo Rosso. 2018. Automatic identification and classification of misogynistic language on twitter. In Natural Language Processing and Information Systems.

Valerio Basile and Malvina Nissim. 2013. Sentiment analysis on italian tweets. In Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.

Carola Carlino, Gennaro Nolano, Maria Pia di Buono, and Johanna Monti. 2020. Laculturanonsiferma: report su uso e la diffusione degli hashtag delle istituzioni culturali italiane durante il periodo di lockdown. arXiv preprint arXiv:2005.10527.

Andrea Cimino, Stefano Cresci, Felice Dell'Orletta, and Maurizio Tesconi. 2014. Linguistically-motivated and lexicon features for sentiment analysis of italian tweets. In The 4th Conference for Evaluation of NLP and Speech Tools for Italian (EVALITA).

Antoine Courtin, Brigitte Juanals, Jean-Luc Minel, and Mathilde De Saint Leger. 2015. A tool-based methodology to analyze social network interactions in cultural fields: The use case "museumweek". In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16).

Stephen Dann. 2010. Twitter content classification. First Monday, 15.

Nicolas Foucault and Antoine Courtin. 2016. Automatic classification of tweets for analyzing communication behavior of museums. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), Portorož, Slovenia, May. ELRA.

Pablo Gamallo and Marcos Garcia. 2014. Citius: A naive-Bayes strategy for sentiment analysis on English tweets. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland. Association for Computational Linguistics.

Claudia Giraud. 2020. Il Ministero della Cultura presenta un report sulla reputazione online dei musei durante il Covid. 10/09/2020.

Brigitte Juanals and Jean-Luc Minel. 2018. Analysing cultural events on twitter. In Irene Garrigós and Manuel Wimmer, editors, Current Trends in Web Engineering. Cham: Springer International Publishing.

Lesley A. Langa. 2014. Does Twitter help museums engage with visitors? In iConference 2014 Proceedings.

Ana Sánchez Laws. 2015. Museum websites and social media: issues of participation, sustainability, trust and diversity, volume 8. Berghahn Books.

K. Lovejoy and Gregory D. Saxton. 2012. Information, community, and action: How nonprofit organizations use social media. Economic & Social Impacts of Innovation eJournal.

NEMO - Network of European Museum Organisations. 2020. Survey on the impact of the COVID-19 situation on museums in Europe: Final Report. Last accessed 10/09/2020.

Brendan O'Connor, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith. 2010. From tweets to polls: Linking text sentiment to public opinion time series. In ICWSM, 11.

Politecnico di Milano. 2020. La reputazione online di musei, parchi archeologici, istituti e luoghi della cultura italiani - Report relativo al mese di maggio 2020.

Daniel Preoţiuc-Pietro, Vasileios Lampos, and Nikolaos Aletras. 2015. An analysis of the user occupational class through Twitter content. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China. Association for Computational Linguistics.

Besiki Stvilia and Leila Gibradze. 2014. What do academic libraries tweet about, and what makes a library tweet useful? Library & Information Science Research.

Ruth Talbot, Chloe Acheampong, and Richard H. Wicentowski. 2015. Swash: A naive bayes classifier for tweet sentiment identification. In SemEval@NAACL-HLT.

Kiran Zahra, Muhammad Imran, and Frank Ostermann. 2019. Automatic identification of eyewitness messages on twitter during disasters. Information Processing and Management, 57.

Dejin Zhao and Mary Beth Rosson. 2009. How and why people twitter: the role that micro-blogging plays in informal communication at work. In Proceedings of the ACM 2009 international conference on Supporting group work.