MONICA: Monitoring Coverage and Attitudes of Italian Measures in Response to COVID-19

Fabio Pernisi

Giuseppe Attanasio

Debora Nozza

Department of Computing Sciences, Bocconi University, Milan, Italy
Instituto de Telecomunicações, Lisbon, Portugal

Modern social media have long been observed as a mirror for public discourse and opinions. Especially in the face of exceptional events, computational language tools are valuable for understanding public sentiment and reacting quickly. During the coronavirus pandemic, the Italian government issued a series of financial measures, each unique in target, requirements, and benefits. Despite the widespread dissemination of these measures, it is currently unclear how they were perceived and whether they ultimately achieved their goal. In this paper, we document the collection and release of MoniCA, a new social media dataset for MONItoring Coverage and Attitudes to such measures. Data include approximately ten thousand posts discussing a variety of measures in ten months. We collected annotations for sentiment, emotion, irony, and topics for each post. We conducted an extensive analysis using computational models to learn these aspects from text. We release a compliant version of the dataset to foster future research on computational approaches for understanding public opinion about government measures. We release data and code at https://github.com/MilaNLProc/MONICA.

Keywords: Sentiment Analysis, Social Media, Computational Social Science, Italian

1. Introduction

Understanding public opinion on governmental decisions has always been crucial for assessing policies’ effectiveness, especially when facing exceptional events requiring prompt decisions. Computational linguists and social scientists have long observed modern social media platforms, as they are a perfect stage for spreading opinions swiftly and transparently. Natural Language Processing (NLP) techniques have been widely used for analyzing public discussion [e.g., 1, 2, 3].

The COVID-19 pandemic, arguably the most prominent of such exceptional events, prompted the Italian government (and other European governments) to release multiple financial measures to cushion the impact on the population. These so-called “bonuses,” issued pro bono, i.e., with no interest payments from recipients, aimed at increasing liquidity and reducing tax burdens. However, despite reaching varied recipients, comprehending the measures’ reception and evaluating their effectiveness still needs to be explored.

To address this gap, we collect and release MoniCA, a new social media dataset for MONItoring Coverage and Attitudes of Italian measures to COVID-19. MoniCA comprises approximately 10,000 posts spanning ten months collected on X.com. These posts pertain to the Italian public’s discussions on diverse financial measures introduced during the pandemic. Building on an extensive body of literature that examines public sentiment during the pandemic [e.g., 4, 5, 6, 7, 8], this work offers new insights into the limited research specifically addressing Italy.1

This paper details the dataset’s collection and release. It introduces the annotations we compiled for each post, including sentiment, emotion, irony, and discussion topics. Then, we conducted an analysis using traditional models and transformer-based language models to predict these aspects from textual data, demonstrating the dataset’s potential usability. Moreover, using state-of-the-art interpretability tools, we explained the models’ decision processes. We found that explanations are faithful and plausible to human judgments.

MoniCA will allow a retrospective examination of the efficacy – and inefficacy – of governmental measures implemented in Italy during the COVID-19 pandemic, as perceived by the population. By doing so, we seek to provide insights that can inform policymakers about the strengths and weaknesses of such financial measures, ensuring better preparedness and response strategies for any future crises.

Contributions. We release MoniCA, a GDPR-compliant dataset of social media posts to monitor the coverage and people’s attitude towards Italy’s government’s financial aid to combat the COVID-19 crisis. We collect annotations of several aspects to allow for a finer-grained analysis. We used state-of-the-art NLP and interpretability tools and reported key insights on public sentiment.

1 See De Rosis et al. [9] for one of the early (and few) works on modelling sentiment from Twitter during the COVID-19 outbreak.

CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
debora.nozza@unibocconi.it (D. Nozza)
https://gattanasio.cc/ (G. Attanasio); https://deboranozza.com/ (D. Nozza)
ORCID: 0000-0001-6945-3698 (G. Attanasio); 0000-0002-7998-2267 (D. Nozza)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2. MoniCA

To build a comprehensive resource, reflecting multiple facets of the phenomenon and usable for future policymakers, we prioritized 1) topic and time coverage in our collection process (§2.1), and 2) relevance refinement and data annotation to enrich the initial pool with additional metadata (§2.2).

2.1. Data Collection

We collected approximately 200,000 posts from X in late 2022. We then filtered each post to obtain data that was in Italian (per the platform-retrieved metadata), not a repost, dated between March 1, 2021, and December 31, 2021, and selected via hard keyword matching.

We chose search keywords and phrases that match the informal name of any of the measures – e.g., “bonus bicicletta” (eng: bike bonus) or “bonus babysitting” – and downloaded all matching posts. The keywords we used to identify relevant discussions in the posts were selected based on insights from an author who is native to Italy and was residing there during the pandemic period (2019-2022). Additional keyword refinement was supported by details from the National Social Security Institute (INPS) about COVID-19 measures.2
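The filtering criteria described above can be illustrated with a minimal sketch. It is not the authors' released code: the `Post` record, its field names, and the two-item keyword list (taken from the examples in the text) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative assumptions."""
    text: str
    lang: str        # language code as reported by the platform metadata
    is_repost: bool
    created: date


# Two example keywords quoted in the text; the full list is in the Appendix.
KEYWORDS = ["bonus bicicletta", "bonus babysitting"]
START, END = date(2021, 3, 1), date(2021, 12, 31)


def keep(post: Post, keywords=KEYWORDS) -> bool:
    """Return True if the post passes all four filters described above."""
    text = post.text.lower()
    return (
        post.lang == "it"                       # Italian per platform metadata
        and not post.is_repost                  # original posts only
        and START <= post.created <= END        # March 1 to December 31, 2021
        and any(k in text for k in keywords)    # hard keyword matching
    )


posts = [
    Post("Ho chiesto il bonus bicicletta ieri", "it", False, date(2021, 5, 2)),
    Post("Ho chiesto il bonus bicicletta ieri", "it", True, date(2021, 5, 2)),
    Post("Bonus babysitting? Never heard of it", "en", False, date(2021, 5, 2)),
    Post("Il bonus bicicletta arriva", "it", False, date(2022, 1, 5)),
]
kept = [p for p in posts if keep(p)]
```

Only the first example post survives: the others fail the repost, language, and date filters respectively.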

Below is the complete list of financial measures on which we focused (see Appendix for corresponding keywords):

• Reddito di emergenza (Emergency income): a temporary income support measure established by the "Decreto Rilancio" for households facing financial difficulties.
• Bonus terme (Spa bonus): it is an incentive (of up to 200 euros) aimed at supporting citizens’ purchases of spa services at accredited facilities.
• Bonus babysitter: it is a measure providing parents of children under 14 in remote learning or quarantine with a bonus (up to 1,200 or 2,000 euros) for purchasing babysitting or child care services. It is available to certain workers, including those in public security and healthcare sectors involved in the COVID-19 response.
• Bonus asilo nido (Daycare/nursery bonus): it is an income support subsidy aimed at families with children under three years old attending public or authorized private nurseries or those suffering from severe chronic illnesses. The bonus amount varies based on the family’s ISEE income level, with maximum yearly benefits ranging from 1,500 to 3,000 euros.
• Bonus figli (Child Bonus): it is a universal financial aid for families with dependent children up to 21 years old, or indefinitely for disabled children. The amount varies based on family income (ISEE), the number and age of children, and any disabilities.
• Bonus partite IVA (VAT Bonus): it is a one-time 200 euro aid for self-employed and professional workers who earned less than 35,000 euros in 2021, have an active VAT, and made at least one contributory payment by May 18, 2022.
• Bonus sportivi (Sport bonus): it is a one-time 200 euro incentive to sports collaborators.
• "Bonus Covid": it provides a 1,600 euro payment for certain categories of workers heavily impacted by the COVID-19 crisis. This bonus is available to occasional self-employed workers who do not have a VAT number and are not enrolled in other mandatory pension schemes.
• Bonus mobilità (Mobility bonus): contribution of 750 euros that could be used to purchase electric scooters, electric or traditional bicycles, or public transport subscriptions.
• Bonus 600 euro: a 600 euro income support allowance provided under Italy’s "Cura Italia" decree to self-employed professionals with an active VAT number as of February 23, 2020.
• Bonus vacanza (Holiday bonus): part of "Decreto Rilancio", it offers up to 500 euros to be used for payment of tourism services and packages provided by national tourist accommodations, travel agencies, tour operators, farm stays, and bed & breakfasts.

To improve the initial pool quality, we removed duplicates (n=6543). Moreover, after manually inspecting the pool, we discarded posts related to the keywords “decreti” (eng: decree) and “credito d’imposta” (eng: tax credit) as they mainly pulled unrelated or too generic posts. The resulting collection counts approximately 100,000 posts relative to 12 different queries.

2.2. Data Annotation

2 https://www.inps.it/it/it/inps-comunica/notizie/dettaglio-news-page.news.2020.10.misure-covid-19-i-dati-al-10-ottobre-2020.html

To balance annotation quantity and quality, we decided to collect extensive annotations for 10% of the initial pool.

A critical issue with our initial pool was the presence of news posts, most frequently by media agencies and newspaper accounts. However, these posts are irrelevant to our goal of monitoring public perception of bonuses. Following previous work [7], we conducted a first round of annotation for relevance. We held round-table meetings to settle on a shared definition of relevance; then, we assigned 200 posts to each annotator and asked them to mark whether each was relevant. We considered a post irrelevant if it mentions a bonus but focuses on another topic.3 Next, we trained a supervised classifier to detect relevance and used it to select 10,400 additional posts from 7238 unique users.4

3 E.g., “@user Ma allora sei grillina ?! Il bonus vacanze l’ha dato lo Stato no De Luca.” En: “@user are you grillina then? The state provided the bonus vacanze, not De Luca.” Here, grillina is an idiomatic expression indicating someone who votes for the Movimento Cinque Stelle political party.
4 We selected posts with a relevance score above 0.95, stratifying on the publication month, user ID, and matching search query to preserve variety in the data.

Annotation Fields. To conduct the annotation, we provided annotators with i) the post’s main text, ii) publication date, iii) at most two antecedent posts in the conversation tree, and iv) any multimedia content if present. When available, the preceding posts and media are the conversational context and can help disambiguate the post’s meaning.

Each post was annotated for (1) subjectivity, (2) sentiment, (3) topic, (4) emotion, and (5) irony. Subjectivity was assessed as binary (subjective or not subjective); sentiment classification included negative, neutral, and positive categories; the topics were carefully pre-determined together with annotators, taking into account the aspects we aimed to extract from the data (see Table 4 for the list of topics); emotions included anger, sadness, joy, disgust, and fear categories; irony was assessed as binary (ironic or not ironic). Annotators were given the possibility to select more than one emotion and topic per post. Moreover, we asked annotators to highlight the (6) span(s) of text that motivated their sentiment annotation. (1), (2), (3), (4), and (5) will serve to map the public opinion on the studied measures, and (6) will allow us to verify whether NLP models detect sentiment like a human would (§5).

The annotation was conducted in three iterations. In the first two, we tasked annotators to annotate a shared set of 100 posts to compute agreement and tune annotation guidelines. Then, we assigned each annotator 3,333 posts, non-overlapping among them. In the next step, we aggregated the labels. For subjectivity, sentiment, and irony, we selected the annotations through majority voting, while for emotions and topics we used all the labels identified by any of the annotators. During this process, we identified some missing values in annotations that we addressed by removing them. The final set comprises 9,763 posts with one annotation each.

See Appendix B for full details on the annotation process, including pay rates, annotation platform and guidelines, inter-annotator agreement, intra-annotator consistency over time, and classifier performance.

[Tables 1-3: distribution of sentiment, emotion, and irony labels. Table 2 caption: "Sentiment in MoniCA." The table layout is not recoverable; surviving cells, in extraction order: Irony 66.7%, 16.8%, 5.8%, 3.2%, 2.2%, 13.1%, 81%, 14%, 5%.]

General Statistics. Tables 1, 2, and 3 report the distribution of sentiment and emotions over the possible options. Similar to related work [6, 7, 8], both sentiment and emotion are heavily skewed toward negative attitudes. The vast majority of posts (96.8%) are subjective; among them, 78% of the posts are negative, whereas 62% show anger. Irony notably appears in 5.4% of the posts. Table 4 shows the discussion topics and their proportion. Half of the posts are directed toward politicians, with an even higher spike in negative sentiment (93.4%).

These findings, taken together, convey a critical message: the majority of social media comments about financial aid in Italy in 2021 are from unhappy people. Such users posted on X with a negative sentiment, showing anger, sadness, disgust, or fear eight times out of ten. Some of our fine-grained annotations disclose potential reasons: 8.5% of posts mention struggling to obtain a bonus, 1.4% not having the requisites, and 1.3% not benefiting from or not getting the bonus.
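The label aggregation described in this section (majority voting for subjectivity, sentiment, and irony; the union of all selected labels for emotions and topics) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and returning `None` on a tie is an assumption, since the text does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None when empty or tied.

    The tie-handling rule is an assumption made for this sketch.
    """
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no strict majority among the top labels
    return counts[0][0]


def union_labels(label_sets):
    """Merge multi-label annotations by keeping every label any annotator chose."""
    merged = set()
    for labels in label_sets:
        merged |= set(labels)
    return sorted(merged)


# Single-label field (e.g., sentiment) from three annotators.
sentiment = majority_vote(["negative", "negative", "neutral"])

# Multi-label field (e.g., emotion) from three annotators.
emotions = union_labels([["anger"], ["anger", "sadness"], []])
```

Here `sentiment` is `"negative"` and `emotions` is `["anger", "sadness"]`. Taking the union for multi-label fields favours recall: an emotion counts if at least one annotator saw it.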

[Table 4: Discussion topics and their proportion. Topics: Requesting a bonus; Asking for information; Obtained a bonus; Not obtained a bonus; Struggling to obtain a bonus; Struggling to benefit from a bonus; Is interested in a bonus; Does not have the requisites to access a bonus; Addressing the political class. The proportion values are not recoverable.]

3. Experiments

We are particularly interested in verifying whether state-of-the-art NLP tools can help us automatically model and detect the users’ opinions. If models succeed at this task, they will serve as a digital barometer for monitoring issues and pitfalls of state-enacted financial aids.

We designed five text classification tasks to train a model for automatic (1) Subjectivity, (2) Sentiment, (3) Emotion, (4) Irony, and (5) Topic detection. (1) and (4) are binary classification tasks; (2), (3), and (5) are three-, six-, and nine-way multi-class classification tasks.

We used Logistic Regression (LR), fine-tuned a pretrained Italian BERT model named UmBERTo [10], and tested an existing BERT model for emotion and sentiment detection in Italian named FEEL-IT [11].5

LR has been trained on preprocessed texts: we converted all posts to lowercase, removed special characters and stopwords, replaced URLs and user handles with special tags, and performed stemming.

Given the significant class imbalance in our annotated data, we report both macro and weighted F1 scores. Macro F1 averages the performance across all classes, highlighting the model’s effectiveness on minority classes. Weighted F1 adjusts for class distribution, reflecting overall performance in line with class prevalence. This dual reporting provides a balanced view of the model’s performance.

5 FEEL-IT does not predict the neutral class in the sentiment classification task.

4. Results

Table 5 reports classification performance for every model-task pair in our setup. Our experiments revealed disparate performance across tasks.

We observed higher scores on the subjectivity detection task, probably due to the easier binary setup and the strong class imbalance. Emotion detection proved most challenging due to the subtle distinctions between classes. Interestingly, UmBERTo classified instances as either anger or joy, while LR defaulted to anger for all cases. FEEL-IT stood out by successfully identifying sadness and fear, highlighting the need for more data to capture the full spectrum of emotional nuances. None of the classifiers ever detected disgust.

Topic detection was another difficult task. In addition to a higher number of unique topics, text content among topics might overlap (e.g., users who complain about struggling to get a bonus might use similar language to those who cannot see benefits from it).

UmBERTo demonstrated strong performance, excelling in three out of five tasks (avg. Macro F1: 43.18, Weighted F1: 74.8). Interestingly, simpler methods like logistic regression also performed reliably (avg. Macro F1: 35.68, Weighted F1: 71.88). These results are promising, showing that both straightforward models and advanced large-scale models (pretrained in the target language, Italian) can effectively serve as tools for automatic detection of subjectivity, sentiment, emotion, irony, and public attitudes. However, the natural imbalance in the data plays a significant role in these experiments, suggesting that further work is needed to address this issue more effectively.

5. Explainability Experiments

Interpretability research in NLP has developed methods and tools to help explain the rationale behind a model prediction. These tools are beneficial to assess and debug models, e.g., by checking whether a model “is right for the right reason” or the cause of the error [12].

We conducted an additional interpretability analysis on UmBERTo, the best-performing model across our detection tasks (see §4). This study aims to verify whether the model’s decision process aligns with those highlighted by humans. Transparency on model internals and human alignment promotes accountability and trust.6

Setup. Following [13, 14], we use four common post-hoc token-level attribution methods [15], i.e., LIME [16], SHAP [17], Integrated Gradient [18], and Gradient [19] across different configurations. Given a model and a model prediction (e.g., Sentiment: “Negative”), each method assigns an importance score to each input token for that prediction. Table 6 reports an explanation example in the first row and the human rationale annotated in the second row.

[Table 6: example of a LIME explanation (first row) and the corresponding human rationale (second row). The table content is not recoverable.]

We use faithfulness and plausibility [20] to evaluate explanations. Faithfulness evaluates how accurately the explanation reflects the inner workings of the model. Plausibility, on the other hand, assesses how well the explanations align with human reasoning. We use the human rationales provided by the three annotators during the annotation phase, and the UmBERTo model trained on the sentiment classification task, explaining the most likely class label for each test instance. We use three faithfulness (Comprehensiveness, Sufficiency, and Correlation with leave-one-out) and three plausibility (Token IOU, Token F1, AUPRC) metrics as described in DeYoung et al. [21, ERASER] and leverage ferret [14] for explanation generation and evaluation.

Table 7 shows that LIME is, on average, the best method to explain predictions, indicating that LIME provides explanations that are both comprehensive and sufficient.

6 EU guidelines: https://bit.ly/eu-ai-guide.

6. Conclusion

We documented the collection and release of MoniCA, the first large-scale dataset for monitoring the coverage and attitudes of financial aid enacted by the Italian government during the COVID-19 pandemic. It counts around 10,000 annotated posts for subjectivity, sentiment, emotion, irony, and topic. We conducted a first analysis and discovered that (1) most posts have a negative tone and (2) NLP and machine learning models can help detect it. Finally, we conducted a preliminary explainability study to understand how models predict sentiment from text. We found that explanation quality varies across methods and recommended LIME as a sensible starting choice.

Our dataset and study fill a critical research gap by examining Italian public sentiment towards COVID-19 measures. Future research will build on this groundwork to build more effective opinion monitoring and mining tools and ultimately inform prompt and targeted policy decisions. Additionally, to better understand the severity of negative attitude, future research may concentrate on examining hate speech in relation to public policies during the pandemic in Italy [22, 23].

Acknowledgments

This project has in part received funding from Fondazione Cariplo (grant No. 2020-4288, MONICA) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 101116095, PERSONAE). Debora Nozza and Fabio Pernisi are members of the MilaNLP group and the Data and Marketing Insights Unit of the Bocconi Institute for Data Science and Analysis. Giuseppe Attanasio conducted part of the work as a member of the MilaNLP group. Additionally, he was partially supported by the Portuguese Recovery and Resilience Plan through project C645008882-00000055 (Center for Responsible AI) and by Fundação para a Ciência e Tecnologia through contract UIDB/50008/2020.

Limitations

Our collection might not represent the opinions of the entire population. All posts included in our dataset were taken from X, whose user base might be skewed towards specific demographics.

Additionally, a potential limitation might arise from the dependency of our data on keyword matching. This form of sampling might prevent some topics from being included in the dataset. However, we carried out keyword selection very carefully, including words and phrases that captured discussions around pro-bono government aid (see Section 2.1).

Another limitation is that our data covers a specific but quite broad temporal window from March 1 to December 31, 2021. This window corresponds to a phase of the pandemic, and changes in public opinion following this period are not captured.

References

[1] Medhat, Hassan, Korashy, Sentiment anal[...] Shams engineering journal 5 (2014) 1093–1113.
[2] Giachanou, Crestani, Like it or not: A sur[...] Computing Surveys (CSUR) 49 (2016) 1–41.
[3] Qian, Mathur, N. H. Zakaria, R. Arora, [...] Management 59 (2022) 103098.
[4] Müller, Salathé, P. E. Kummervold, Covid-[...] to analyse covid-19 content on twitter, Frontiers in Artificial Intelligence 6 (2023) 1023281.
[5] Chen, Lerman, E. Ferrara, Tracking social media discourse about the covid-19 pandemic: De[...] set, JMIR Public Health Surveill 6 (2020) e19273. URL: http://publichealth.jmir.org/2020/2/e19273/. doi:10.2196/19273.
[6] Kaur, Kaul, P. M. Zadeh, Monitoring the dynamics of emotions during covid-19 using twitter data, Procedia Computer Science 177 (2020) 423–430.
[7] Scott, Delobelle, Berendt, Measuring shifts in attitudes towards covid-19 measures in [...]lands Journal 11 (2021) 161–171. URL: https://www.clinjournal.org/clinj/article/view/133.
[8] Wang, Lu, K. P. Chow, Zhu, Covid-19 sensing: negative sentiment analysis on social media in china via bert model, IEEE Access 8 (2020) 138162–138169.
[9] De Rosis, Lopreite, Puliga, M. Vainieri, The early weeks of the italian covid-19 outbreak: [...] Policy 125 (2021) 987–994.
[10] Breiman, Random forests, Machine learning 45 (2001) 5–32.
[11] Bianchi, Nozza, Hovy, FEEL-IT: Emotion [...] Computational Linguistics, Online, 2021, pp. 76–83.
[12] Danilevsky, Qian, Aharonov, Katsis, [...] Proceedings of the 1st Conference of the Asia-Pacific [...] Linguistics and the 10th International Joint Conference [...] Computational Linguistics, Suzhou, China, 2020, pp. 447–459.
[13] Attanasio, Nozza, Pastor, Hovy, Bench[...] Computational Linguistics, Dublin, Ireland, 2022, pp. 100–112.
[14] Attanasio, Pastor, Di Bonaventura, Nozza, [...]ers on transformers, in: Proceedings of the 17th [...] Linguistics, Dubrovnik, Croatia, 2023, pp. 256–266. URL: https://aclanthology.org/2023.eacl-demo.29. doi:10.18653/v1/2023.eacl-demo.29.
[15] Madsen, Reddy, Chandar, Post-hoc interpretability for neural nlp: A survey, ACM Computing Surveys 55 (2022) 1–42.
[16] M. T. Ribeiro, Singh, Guestrin, "Why should i trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 1135–1144.
[17] S. M. Lundberg, S.-I. Lee, A unified approach to [...] the 31st International Conference on Neural Information Processing Systems, NIPS'17, Curran Associates Inc., Red Hook, NY, USA, 2017, p. 4768–4777.
[18] Sundararajan, Taly, Yan, Axiomatic attribution for deep networks, in: Proceedings of the 34th [...] Volume 70, ICML’17, JMLR.org, 2017, p. 3319–3328.
[19] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising image classification models and saliency maps, CoRR abs/1312.6034 (2013).
[20] A. Jacovi, Y. Goldberg, Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness?, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 4198–4205.
[21] J. DeYoung, S. Jain, N. F. Rajani, E. Lehman, C. Xiong, R. Socher, B. C. Wallace, ERASER: A benchmark to evaluate rationalized NLP models, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 4443–4458. URL: https://aclanthology.org/2020.acl-main.408. doi:10.18653/v1/2020.acl-main.408.
[22] D. Nozza, F. Bianchi, G. Attanasio, HATE-ITA: Hate speech detection in Italian social media text, in: Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Seattle, Washington (Hybrid), 2022, pp. 252–260.
[23] F. M. Plaza-del arco, D. Nozza, D. Hovy, Respectful or toxic? using zero-shot learning with language models to detect hate speech, in: The 7th Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Toronto, Canada, 2023, pp. 60–68.
[24] G. Abercrombie, D. Hovy, V. Prabhakaran, Temporal and second language influence on intra-annotator agreement and stability in hate speech labelling, in: Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII), Association for Computational Linguistics, Toronto, Canada, 2023.

A. Data Collection

Data for the MoniCA dataset was gathered using X’s proprietary historical API, via an academic subscription. Below is the complete list of keywords used for data collection in the form of a tweepy7 query:

• Bonus mobilità (Mobility bonus): "bonus mobilita" OR "bonus bici" OR "bonus monopattino" OR #bonusmobilita OR #bonusbici OR #bonusmonopattino
• Bonus 600 euro: "bonus 600 euro" OR "bonus 600euro" OR "bonus 600" OR #bonus600euro OR #bonus600
• Bonus vacanza (Holiday bonus): "bonus vacanza" OR "bonus vacanze" OR "bonus vacanze" OR #bonusvacanza OR #bonusvacanze
• Reddito di emergenza (Emergency income): "reddito d’emergenza" OR "reddito di emergenza" OR #redditodemergenza OR #redditodiemergenza OR #REM
• Bonus terme (Spa bonus): "bonus terme" OR #bonusterme
• Bonus babysitter: "bonus babysitter" OR "bonus baby-sitter" OR "bonus babysitting" OR "bonus baby-sitting" OR #bonusbabysitter OR #bonusbabysitting
• Bonus asilo nido (Daycare/nursery bonus): "bonus asilo nido" OR #bonusasilonido
• Bonus figli (Child Bonus): "bonus figli" OR #bonusfigli
• Bonus partite IVA (VAT Bonus): "bonus partite iva" OR #bonuspartiteiva
• Bonus sportivi (Sport bonus): "bonus lavoratori sportivi" OR "bonus sportivi" OR (bonus lavoratori sportivi) OR (bonus collaboratori sportivi) OR "bonus collaboratori sportivi" OR #bonussportivi
• "Bonus Covid": "bonus covid" OR #bonuscovid

7 https://www.tweepy.org/

B. Data Annotation

Profile and pay rate. For annotating the MoniCA dataset, three student research assistants with backgrounds in Machine Learning and Natural Language Processing were hired full-time. They were each compensated for 32 hours of work at a rate of about 18 euros per hour. We provided each annotator with an initial set of annotation guidelines, and we organized initial meetings to familiarize them with the task and refine the guidelines.

Platform. We used Label Studio8 using a custom labeling schema. We report the annotation schema and guidelines in the repository associated with the project. A screenshot of an annotated example is shown in Figure 1 for reference.

8 https://labelstud.io/

Agreement and consistency. The three annotators shared a pool of 100 posts. On these, we computed Krippendorff’s alpha of 0.57 on subjectivity (i.e., is the post subjective or not), 0.60 on the post sentiment, and 0.51 on whether the contextual information was used. The agreement on sentiment increases to 0.61 when considering only posts that were considered subjective by everyone.

Moreover, we provided each annotator with a copy of 100 samples randomly shuffled later in the pool of posts to validate their consistency over time [24]. Annotators were highly consistent. On average, they annotated subjectivity consistently 95% of the time and sentiment 87% of the time.
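The two quantities reported above, inter-annotator agreement and intra-annotator consistency over time, can be computed along the following lines. This is a sketch under assumptions: the paper does not state which distance function was used for Krippendorff's alpha, so the nominal variant is assumed here, and both function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels (assumed variant).

    `units` is a list of lists; each inner list holds the labels that
    different annotators assigned to the same post.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a post with a single label cannot be paired
        # Every ordered pair of labels from distinct annotators counts 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # undefined when only one label value occurs
    return 1.0 - (n - 1) * observed / expected


def consistency(first_pass, second_pass):
    """Fraction of repeated posts that one annotator labelled identically twice."""
    if not first_pass or len(first_pass) != len(second_pass):
        raise ValueError("passes must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(first_pass, second_pass))
    return same / len(first_pass)


alpha = krippendorff_alpha_nominal(
    [["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"]]
)
rate = consistency(["subj", "subj", "obj", "subj"],
                   ["subj", "obj", "obj", "subj"])
```

In this toy example `alpha` equals 8/15 (about 0.53) and `rate` equals 0.75. Alpha corrects for the agreement expected by chance, whereas the consistency rate is a raw proportion.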