<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MONICA: Monitoring Coverage and Attitudes of Italian Measures in Response to COVID-19</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fabio Pernisi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Attanasio</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Debora Nozza</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computing Sciences, Bocconi University</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Instituto de Telecomunicações</institution>
          ,
          <addr-line>Lisbon</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Modern social media have long been observed as a mirror for public discourse and opinions. Especially in the face of exceptional events, computational language tools are valuable for understanding public sentiment and reacting quickly. During the coronavirus pandemic, the Italian government issued a series of financial measures, each unique in target, requirements, and benefits. Despite the widespread dissemination of these measures, it is currently unclear how they were perceived and whether they ultimately achieved their goal. In this paper, we document the collection and release of MoniCA, a new social media dataset for MONItoring Coverage and Attitudes to such measures. Data include approximately ten thousand posts discussing a variety of measures in ten months. We collected annotations for sentiment, emotion, irony, and topics for each post. We conducted an extensive analysis using computational models to learn these aspects from text. We release a compliant version of the dataset to foster future research on computational approaches for understanding public opinion about government measures. We release data and code at https://github.com/MilaNLProc/MONICA.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Social Media</kwd>
        <kwd>Computational Social Science</kwd>
        <kwd>Italian</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Understanding public opinion on governmental decisions has always been crucial for assessing policies’ effectiveness, especially when facing exceptional events requiring prompt decisions. Computational linguists and social scientists have long observed modern social media platforms, as they are a perfect stage for spreading opinions swiftly and transparently. Natural Language Processing (NLP) techniques have been widely used for analyzing public discussion [e.g., 1, 2, 3].</p>
      <p>The COVID-19 pandemic, arguably the most prominent of such exceptional events, prompted the Italian government, and other European governments, to release multiple financial measures to cushion the impact on the population. These so-called “bonuses,” issued pro bono, i.e., with no interest payments from recipients, aimed at increasing liquidity and reducing tax burdens. However, despite reaching varied recipients, comprehending the measures’ reception and evaluating their effectiveness still needs to be explored.</p>
      <p>To address this gap, we collect and release MoniCA, a new social media dataset for MONItoring Coverage and Attitudes of Italian measures to COVID-19. MoniCA comprises approximately 10,000 posts spanning ten months collected on X.com. These posts pertain to the Italian public’s discussions on diverse financial measures introduced during the pandemic. Building on an extensive body of literature that examines public sentiment during the pandemic [e.g., 4, 5, 6, 7, 8], this work offers new insights into the limited research specifically addressing Italy.<sup>1</sup></p>
      <p>This paper details the dataset’s collection and release. It introduces the annotations we compiled for each post, including sentiment, emotion, irony, and discussion topics. Then, we conducted an analysis using traditional models and transformer-based language models to predict these aspects from textual data, demonstrating the dataset’s potential usability. Moreover, using state-of-the-art interpretability tools, we explained the models’ decision processes. We found that explanations are faithful and plausible to human judgments.</p>
      <p>MoniCA will allow a retrospective examination of the efficacy – and inefficacy – of governmental measures implemented in Italy during the COVID-19 pandemic, as perceived by the population. By doing so, we seek to provide insights that can inform policymakers about the strengths and weaknesses of such financial measures, ensuring better preparedness and response strategies for any future crises.</p>
      <p>Contributions. We release MoniCA, a GDPR-compliant dataset of social media posts to monitor the coverage and people’s attitude towards Italy’s government’s financial aid to combat the COVID-19 crisis. We collect annotations of several aspects to allow for a finer-grained analysis. We used state-of-the-art NLP and interpretability tools and reported key insights on public sentiment.</p>
      <p><sup>1</sup> See De Rosis et al. [9] for one of the early (and few) works on modelling sentiment from Twitter during the COVID-19 outbreak.</p>
      <p>CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04–06, 2024, Pisa, Italy. Contact: debora.nozza@unibocconi.it (D. Nozza). Web: https://gattanasio.cc/ (G. Attanasio); https://deboranozza.com/ (D. Nozza). ORCID: 0000-0001-6945-3698 (G. Attanasio); 0000-0002-7998-2267 (D. Nozza). © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>2. MoniCA</title>
      <p>To build a comprehensive resource, reflecting multiple
facets of the phenomenon and usable for future
policymakers, we prioritized 1) topic and time coverage in our
collection process (§2.1), and 2) relevance refinement and
data annotation to enrich the initial pool with additional
metadata (§2.2).</p>
      <sec id="sec-2-1">
        <title>2.1. Data Collection</title>
        <sec id="sec-2-1-1">
          <p>We collected approximately 200,000 posts from X in late 2022. We then filtered each post to obtain data that was in Italian (per the platform-retrieved metadata), not a repost, dated between March 1, 2021, and December 31, 2021, and selected via hard keyword matching.</p>
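          <p>The four filters above (language, repost status, date window, keyword match) can be illustrated with a minimal sketch. The record fields (text, lang, is_repost, created) and the two-keyword list are hypothetical and chosen for illustration only; they do not reflect the X API schema or the authors' actual pipeline.</p>
          <preformat>
```python
from datetime import date

# Hypothetical keyword list and post schema, for illustration only.
KEYWORDS = ["bonus bicicletta", "bonus babysitting"]
START, END = date(2021, 3, 1), date(2021, 12, 31)


def keep_post(post: dict) -> bool:
    """Return True if a post passes all four filters described in the text."""
    text = post["text"].lower()
    # Date window is inclusive on both ends.
    in_window = post["created"] >= START and END >= post["created"]
    return (
        post["lang"] == "it"            # Italian, per platform metadata
        and not post["is_repost"]       # reposts are dropped
        and in_window
        and any(keyword in text for keyword in KEYWORDS)  # hard keyword match
    )


posts = [
    {"text": "Finalmente il bonus bicicletta!", "lang": "it",
     "is_repost": False, "created": date(2021, 5, 2)},
    {"text": "Bonus babysitting news", "lang": "en",
     "is_repost": False, "created": date(2021, 5, 2)},
    {"text": "bonus bicicletta esaurito", "lang": "it",
     "is_repost": True, "created": date(2021, 6, 1)},
    {"text": "bonus bicicletta nel 2022", "lang": "it",
     "is_repost": False, "created": date(2022, 1, 5)},
]
kept = [p for p in posts if keep_post(p)]
```
          </preformat>
          <p>In this toy input only the first post survives; the others fail on language, repost status, and date, respectively.</p>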
          <p>We chose search keywords and phrases that match the informal name of any of the measures, e.g., “bonus bicicletta” (eng: bike bonus) or “bonus babysitting”, and downloaded all matching posts. The keywords we used to identify relevant discussions in the posts were selected based on insights from an author who is native to Italy and was residing there during the pandemic period (2019–2022). Additional keyword refinement was supported by details from the National Social Security Institute (INPS) about COVID-19 measures.<sup>2</sup></p>
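          <p>As a rough illustration of how informal measure names can be turned into a search query, the sketch below joins exact phrases and hashtags with OR, which is the shape of the queries listed in the Appendix. The helper name build_query is hypothetical and is not part of tweepy or of the authors' code.</p>
          <preformat>
```python
def build_query(phrases: list[str], hashtags: list[str]) -> str:
    """Join exact phrases and hashtags into a single OR query string."""
    quoted = [f'"{phrase}"' for phrase in phrases]   # exact-phrase terms
    tagged = [f"#{tag}" for tag in hashtags]         # hashtag terms
    return " OR ".join(quoted + tagged)


# Example with the spa bonus keywords reported in the Appendix.
query = build_query(["bonus terme"], ["bonusterme"])
```
          </preformat>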
          <p>Below is the complete list of financial measures on which we focused (see Appendix for corresponding keywords):</p>
          <list list-type="bullet">
            <list-item><p>Reddito di emergenza (Emergency income): a temporary income support measure established by the "Decreto Rilancio" for households facing financial difficulties.</p></list-item>
            <list-item><p>Bonus terme (Spa bonus): it is an incentive (of up to 200 euros) aimed at supporting citizens’ purchases of spa services at accredited facilities.</p></list-item>
            <list-item><p>Bonus babysitter: it is a measure providing parents of children under 14 in remote learning or quarantine with a bonus (up to 1,200 or 2,000 euros) for purchasing babysitting or child care services. It is available to certain workers including those in public security and healthcare sectors involved in the Covid-19 response.</p></list-item>
            <list-item><p>Bonus asilo nido (Daycare/nursery bonus): it is an income support subsidy aimed at families with children under three years old attending public or authorized private nurseries or those suffering from severe chronic illnesses. The bonus amount varies based on the family’s ISEE income level, with maximum yearly benefits ranging from 1,500 to 3,000 euros.</p></list-item>
            <list-item><p>Bonus figli (Child Bonus): it is a universal financial aid for families with dependent children up to 21 years old, or indefinitely for disabled children. The amount varies based on family income (ISEE), the number and age of children, and any disabilities.</p></list-item>
            <list-item><p>Bonus partite IVA (VAT Bonus): it is a one-time 200 euro aid for self-employed and professional workers who earned less than 35,000 euros in 2021, have an active VAT, and made at least one contributory payment by May 18, 2022.</p></list-item>
            <list-item><p>Bonus sportivi (Sport bonus): it is a one-time 200 euro incentive to sports collaborators.</p></list-item>
            <list-item><p>"Bonus Covid": it provides a 1,600 euro payment for certain categories of workers heavily impacted by the COVID-19 crisis. This bonus is available to occasional self-employed workers who do not have a VAT number and are not enrolled in other mandatory pension schemes.</p></list-item>
            <list-item><p>Bonus mobilità (Mobility bonus): contribution of 750 euros that could be used to purchase electric scooters, electric or traditional bicycles, or for public transport subscriptions.</p></list-item>
            <list-item><p>Bonus 600 euro: a 600 euro income support allowance provided under Italy’s "Cura Italia" decree to self-employed professionals with an active VAT number as of February 23, 2020.</p></list-item>
            <list-item><p>Bonus vacanza (Holiday bonus): part of "Decreto Rilancio", it offers up to 500 euros to be used for payment of tourism services and packages provided by national tourist accommodations, travel agencies, tour operators, farm stays, and bed &amp; breakfasts.</p></list-item>
          </list>
          <p>To improve the initial pool quality, we removed duplicates (n=6543). Moreover, after manually inspecting the pool, we discarded posts related to the keywords “decreti” (eng: decree) and “credito d’imposta” (eng: tax credit) as they mainly pulled unrelated or too generic posts. The resulting collection counts approximately 100,000 posts relative to 12 different queries.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Data Annotation</title>
        <p><sup>2</sup> https://www.inps.it/it/it/inps-comunica/notizie/dettaglio-news-page.news.2020.10.misure-covid-19-i-dati-al-10-ottobre-2020.html</p>
        <p>To balance annotation quantity and quality, we decided to collect extensive annotations for 10% of the initial pool.</p>
        <p>A critical issue with our initial pool was the presence of news posts, most frequently by media agencies and newspaper accounts. However, these posts are irrelevant to our goal of monitoring public perception of bonuses. Following previous work [7], we conducted a first round of annotation for relevance. We held round-table meetings to settle on a shared definition of relevance; then, we assigned 200 posts to each annotator and asked them to judge whether each was relevant. We considered a post irrelevant if it mentions a bonus but focuses on another topic.<sup>3</sup> Next, we trained a supervised classifier to detect relevance and used it to select 10,400 additional posts from 7238 unique users.<sup>4</sup></p>
        <p><sup>3</sup> E.g., “@user Ma allora sei grillina ?! Il bonus vacanze l’ha dato lo Stato no De Luca.” En: “@user are you grillina then? The State provided the bonus vacanze, not De Luca.” Grillina is an idiomatic expression indicating someone who votes for the Movimento Cinque Stelle political party.</p>
        <p><sup>4</sup> We selected posts with a relevance score above 0.95, stratifying on the publication month, user ID, and matching search query to preserve variety in the data.</p>
        <p>Annotation Fields. To conduct the annotation, we provided annotators with i) the post’s main text, ii) publication date, iii) at most two antecedent posts in the conversation tree, and iv) any multimedia content if present. When available, the preceding posts and media are the conversational context and can help disambiguate the post’s meaning.</p>
        <p>Each post was annotated for (1) subjectivity, (2) sentiment, (3) topic, (4) emotion, and (5) irony. Subjectivity was assessed as binary (subjective or not subjective); sentiment classification included negative, neutral, and positive categories; the topics were carefully pre-determined together with annotators, taking into account the aspects we aimed to extract from the data (see Table 4 for the list of topics); emotions included anger, sadness, joy, disgust, and fear categories; irony was assessed as binary (ironic or not ironic). Annotators were given the possibility to select more than one emotion and topic per post. Moreover, we asked annotators to highlight the (6) span(s) of text that motivated their sentiment annotation. (1), (2), (3), (4) and (5) will serve to map the public opinion on the studied measures, and (6) will allow us to verify whether NLP models detect sentiment like a human would (§5).</p>
        <p>The annotation was conducted in three iterations. In the first two, we tasked annotators to annotate a shared set of 100 posts to compute agreement and tune annotation guidelines. Then, we assigned each annotator 3,333 posts, non-overlapping among them. In the next step we aggregated the labels. For subjectivity, sentiment, and irony we selected the annotations through majority voting, while for emotions and topics we used all the labels identified by any of the annotators. During this process, we identified some missing values in annotations that we addressed by removing them. The final set comprises 9,763 posts with one annotation each.</p>
        <p>See Appendix B for full details on the annotation process, including pay rates, annotation platform and guidelines, inter-annotator agreement, intra-annotator consistency over time, and classifier performance.</p>
        <p>Tables 1 to 3 (label distributions; only fragments are recoverable). Table 2: Sentiment in MoniCA. Surviving entries: Irony; 66.7%, 16.8%, 5.8%, 3.2%, 2.2%, 13.1%; 81%, 14%, 5%.</p>
        <p>General Statistics. Tables 1, 2, and 3 report the distribution of sentiment and emotions over the possible options.</p>
        <p>Similar to related work [6, 7, 8], both sentiment and emotion are heavily skewed toward negative attitudes. The vast majority of posts (96.8%) are subjective; among them, 78% of the posts are negative, whereas 62% show anger. Irony notably appears in 5.4% of the posts. Table 4 shows the discussion topics and their proportion. Half of the posts are directed toward politicians, with an even higher spike in negative sentiment (93.4%).</p>
        <p>These findings, taken together, convey a critical message: the majority of social media comments about financial aid in Italy in 2021 are from unhappy people. Such users posted on X with a negative sentiment, showing anger, sadness, disgust, or fear eight times out of ten. Some of our fine-grained annotations disclose potential reasons: 8.5% of posts mention struggling to obtain a bonus, 1.4% not having the requisites, and 1.3% not benefiting from or not getting the bonus.</p>
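        <p>The label aggregation step described in this section (majority voting for single-label fields, and keeping every label chosen by any annotator for emotions and topics) can be sketched as follows. This is a minimal illustration, not the authors' code; the function names are hypothetical, and breaking ties in favor of the first label seen is an assumption, since the text does not say how ties were resolved.</p>
        <preformat>
```python
from collections import Counter


def majority_vote(labels: list[str]) -> str:
    """Return the most frequent label.

    Assumption: ties go to the label encountered first, which is how
    Counter.most_common orders equal counts.
    """
    return Counter(labels).most_common(1)[0][0]


def union_labels(label_sets: list[set[str]]) -> set[str]:
    """Keep every label selected by at least one annotator."""
    return set().union(*label_sets)


# Toy annotations from three annotators for one post.
sentiment = majority_vote(["negative", "negative", "neutral"])
emotions = union_labels([{"anger"}, {"anger", "sadness"}, set()])
```
        </preformat>
        <p>Here sentiment resolves to the label chosen by two of three annotators, while emotions keeps both anger and sadness.</p>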
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>We are particularly interested in verifying whether state-of-the-art NLP tools can help us automatically model and detect the users’ opinions. If models succeed at this task, they will serve as a digital barometer for monitoring issues and pitfalls of state-enacted financial aids.</p>
      <p>We designed five text classification tasks to train a model for automatic (1) Subjectivity, (2) Sentiment, (3) Emotion, (4) Irony, and (5) Topic detection. (1) and (4) are binary classification tasks; (2), (3), and (5) are three-, six-, and nine-way multi-class classification tasks, respectively.</p>
      <p>We used Logistic Regression (LR), fine-tuned a pre-trained Italian BERT model named UmBERTo [10], and tested an existing BERT model for emotion and sentiment detection in Italian named FEEL-IT [11].<sup>5</sup></p>
      <p>LR was trained on preprocessed texts: we converted all posts to lowercase, removed special characters and stopwords, replaced URLs and user handles with special tags, and performed stemming.</p>
      <p>Given the significant class imbalance in our annotated data, we report both macro and weighted F1 scores. Macro F1 averages the performance across all classes, highlighting the model’s effectiveness on minority classes. Weighted F1 adjusts for class distribution, reflecting overall performance in line with class prevalence. This dual reporting provides a balanced view of the model’s performance.</p>
      <p><sup>5</sup> FEEL-IT does not predict the neutral class in the sentiment classification task.</p>
      <p>Table 4 (topics; proportion values not recoverable): Requesting a bonus; Asking for information; Obtained a bonus; Not obtained a bonus; Struggling to obtain a bonus; Struggling to benefit from a bonus; Is interested in a bonus; Does not have the requisites to access a bonus; Addressing the political class.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>Table 5 reports classification performance for every model-task pair in our setup. Our experiments revealed disparate performance across tasks.</p>
      <p>We observed higher scores on the subjectivity detection task, probably due to the easier binary setup and the strong class imbalance. Emotion detection proved most challenging due to the subtle distinctions between classes. Interestingly, UmBERTo classified instances as either anger or joy, while LR defaulted to anger for all cases. FEEL-IT stood out by successfully identifying sadness and fear, highlighting the need for more data to capture the full spectrum of emotional nuances. None of the classifiers ever detected disgust.</p>
      <p>Topic detection was another difficult task. In addition to a higher number of unique topics, text content among topics might overlap (e.g., users who complain about struggling to get a bonus might use similar language to those who cannot see benefits from it).</p>
      <p>UmBERTo demonstrated strong performance, excelling in three out of five tasks (avg. Macro F1: 43.18, Weighted F1: 74.8). Interestingly, simpler methods like logistic regression also performed reliably (avg. Macro F1: 35.68, Weighted F1: 71.88). These results are promising, showing that both straightforward models and advanced large-scale models (pretrained in the target language, Italian) can effectively serve as tools for automatic detection of subjectivity, sentiment, emotion, irony, and public attitudes. However, the natural imbalance in the data plays a significant role in these experiments, suggesting that further work is needed to address this issue more effectively.</p>
    </sec>
    <sec>
      <title>5. Explainability Experiments</title>
      <p>Interpretability research in NLP has developed methods and tools to help explain the rationale behind a model prediction. These tools are beneficial to assess and debug models, e.g., by checking whether a model “is right for the right reason” or the cause of the error [12].</p>
      <p>We conducted an additional interpretability analysis on UmBERTo, the best-performing model across our detection tasks (see §4). This study aims to verify whether the model’s decision process aligns with those highlighted by humans. Transparency on model internals and human alignment promotes accountability and trust.<sup>6</sup></p>
      <p><sup>6</sup> EU guidelines: https://bit.ly/eu-ai-guide.</p>
      <p>Setup. Following [13, 14], we use four common post-hoc token-level attribution methods [15], i.e., LIME [16], SHAP [17], Integrated Gradient [18], and Gradient [19] across different configurations. Given a model and a model prediction (e.g., Sentiment: “Negative”), each method assigns an importance score to each input token for that prediction. Table 6 reports an explanation example in the first row and the human rationale annotated in the second row.</p>
      <p>Table 6 (explanation example; rows: LIME, Human; token-level content not recoverable).</p>
      <p>We use faithfulness and plausibility [20] to evaluate explanations. Faithfulness evaluates how accurately the explanation reflects the inner workings of the model. Plausibility, on the other hand, assesses how well the explanations align with human reasoning. We use the human rationales provided by the three annotators during the annotation phase, and the UmBERTo model trained on the sentiment classification task, explaining the most likely class label for each test instance. We use three faithfulness (Comprehensiveness, Sufficiency, and Correlation with leave-one-out) and three plausibility (Token IOU, Token F1, AUPRC) metrics as described in DeYoung et al. [21, ERASER] and leverage ferret [14] for explanation generation and evaluation.</p>
      <p>Table 7 shows that LIME is, on average, the best method to explain predictions, indicating that LIME provides explanations that are both comprehensive and sufficient.</p>
    </sec>
    <sec>
      <title>6. Conclusion</title>
      <p>We documented the collection and release of MoniCA, the first large-scale dataset for monitoring the coverage and attitudes of financial aid enacted by the Italian government during the COVID-19 pandemic. It counts around 10,000 annotated posts for subjectivity, sentiment, emotion, irony, and topic. We conducted a first analysis and discovered that (1) most posts have a negative tone and (2) NLP and machine learning models can help detect it. Finally, we conducted a preliminary explainability study to understand how models predict sentiment from text. We found that explanation quality varies across methods and recommended LIME as a sensible starting choice.</p>
      <p>Our dataset and study fill a critical research gap by examining Italian public sentiment towards COVID-19 measures. Future research will build on this groundwork to develop more effective opinion monitoring and mining tools and ultimately inform prompt and targeted policy decisions. Additionally, to better understand the severity of negative attitude, future research may concentrate on examining hate speech in relation to public policies during the pandemic in Italy [22, 23].</p>
    </sec>
    <sec>
      <title>Acknowledgments</title>
      <p>This project has in part received funding from Fondazione Cariplo (grant No. 2020-4288, MONICA) and from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 101116095, PERSONAE). Debora Nozza and Fabio Pernisi are members of the MilaNLP group and the Data and Marketing Insights Unit of the Bocconi Institute for Data Science and Analysis. Giuseppe Attanasio conducted part of the work as a member of the MilaNLP group. Additionally, he was partially supported by the Portuguese Recovery and Resilience Plan through project C645008882-00000055 (Center for Responsible AI) and by Fundação para a Ciência e Tecnologia through contract UIDB/50008/2020.</p>
    </sec>
    <sec id="sec-5">
      <title>Limitations</title>
      <p>Our collection might not represent the opinions of the entire population. All posts included in our dataset were taken from X, whose user base might be skewed towards specific demographics.</p>
      <p>Additionally, a potential limitation might arise from the dependency of our data on keyword matching. This form of sampling might prevent some topics from being included in the dataset. However, we carried out keyword selection very carefully, including words and phrases that captured discussions around pro-bono government aid (see Section 2.1).</p>
      <p>Another limitation is that our data covers a specific but
quite broad temporal window from March 1 to December
31, 2021. This window corresponds to a phase of the
pandemic, and changes in public opinion following this
period are not captured.</p>
      <p>Volume 70, ICML’17, JMLR.org, 2017, p. 3319–3328.</p>
      <p>[19] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising image classification models and saliency maps, CoRR abs/1312.6034 (2013).</p>
      <p>[20] A. Jacovi, Y. Goldberg, Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness?, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 4198–4205.</p>
      <p>[21] J. DeYoung, S. Jain, N. F. Rajani, E. Lehman, C. Xiong, R. Socher, B. C. Wallace, ERASER: A benchmark to evaluate rationalized NLP models, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 4443–4458. URL: https://aclanthology.org/2020.acl-main.408. doi:10.18653/v1/2020.acl-main.408.</p>
      <p>[22] D. Nozza, F. Bianchi, G. Attanasio, HATE-ITA: Hate speech detection in Italian social media text, in: Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Seattle, Washington (Hybrid), 2022, pp. 252–260.</p>
      <p>[23] F. M. Plaza-del arco, D. Nozza, D. Hovy, Respectful or toxic? using zero-shot learning with language models to detect hate speech, in: The 7th Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Toronto, Canada, 2023, pp. 60–68.</p>
      <p>[24] G. Abercrombie, D. Hovy, V. Prabhakaran, Temporal and second language influence on intra-annotator agreement and stability in hate speech labelling, in: Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII), Association for Computational Linguistics, Toronto, Canada, 2023.</p>
      <sec>
        <title>A. Data Collection</title>
        <p>Data for the MoniCA dataset was gathered using X’s proprietary historical API, via an academic subscription.</p>
        <p>Below is the complete list of keywords used for data collection in the form of a tweepy<sup>7</sup> query:</p>
        <list list-type="bullet">
          <list-item><p>Bonus mobilità (Mobility bonus): "bonus mobilita" OR "bonus bici" OR "bonus monopattino" OR #bonusmobilita OR #bonusbici OR #bonusmonopattino</p></list-item>
          <list-item><p>Bonus 600 euro: "bonus 600 euro" OR "bonus 600euro" OR "bonus 600" OR #bonus600euro OR #bonus600</p></list-item>
          <list-item><p>Bonus vacanza (Holiday bonus): "bonus vacanza" OR "bonus vacanze" OR "bonus vacanze" OR #bonusvacanza OR #bonusvacanze</p></list-item>
          <list-item><p>Reddito di emergenza (Emergency income): "reddito d’emergenza" OR "reddito di emergenza" OR #redditodemergenza OR #redditodiemergenza OR #REM</p></list-item>
          <list-item><p>Bonus terme (Spa bonus): "bonus terme" OR #bonusterme</p></list-item>
          <list-item><p>Bonus babysitter: "bonus babysitter" OR "bonus baby-sitter" OR "bonus babysitting" OR "bonus baby-sitting" OR #bonusbabysitter OR #bonusbabysitting</p></list-item>
          <list-item><p>Bonus asilo nido (Daycare/nursery bonus): "bonus asilo nido" OR #bonusasilonido</p></list-item>
          <list-item><p>Bonus figli (Child Bonus): "bonus figli" OR #bonusfigli</p></list-item>
          <list-item><p>Bonus partite IVA (VAT Bonus): "bonus partite iva" OR #bonuspartiteiva</p></list-item>
          <list-item><p>Bonus sportivi (Sport bonus): "bonus lavoratori sportivi" OR "bonus sportivi" OR (bonus lavoratori sportivi) OR (bonus collaboratori sportivi) OR "bonus collaboratori sportivi" OR #bonussportivi</p></list-item>
          <list-item><p>"Bonus Covid": "bonus covid" OR #bonuscovid</p></list-item>
        </list>
        <p><sup>7</sup> https://www.tweepy.org/</p>
      </sec>
      <sec id="sec-5-1">
        <title>B. Data Annotation</title>
        <p>Profile and pay rate. For annotating the MoniCA dataset, three student research assistants with backgrounds in Machine Learning and Natural Language Processing were hired full-time. They were each compensated for 32 hours of work at a rate of about 18 euros per hour. We provided each annotator with an initial set of annotation guidelines, and we organized initial meetings to familiarize them with the task and refine the guidelines.</p>
        <p>Platform. We used Label Studio<sup>8</sup> with a custom labeling schema. We report the annotation schema and guidelines in the repository associated with the project. A screenshot of an annotated example is shown in Figure 1 for reference.</p>
        <p><sup>8</sup> https://labelstud.io/</p>
        <p>Agreement and consistency. The three annotators shared a pool of 100 posts. On these, we computed Krippendorff’s alpha of 0.57 on subjectivity (i.e., is the post subjective or not), 0.60 on the post sentiment, and 0.51 on whether the contextual information was used. The agreement on sentiment increases to 0.61 when considering only posts that were considered subjective by everyone.</p>
        <p>Moreover, we provided each annotator with a copy of 100 samples randomly shuffled later in the pool of posts to validate their consistency over time [24]. Annotators were highly consistent. On average, they annotated subjectivity consistently 95% of the time and sentiment 87% of the time.</p>
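        <p>One simple way to quantify intra-annotator consistency over repeated items is the share of items that received the same label on both passes. The sketch below assumes this raw percent-agreement definition; the text does not state the exact formula used, and the function name is hypothetical.</p>
        <preformat>
```python
def consistency(first_pass: list[str], second_pass: list[str]) -> float:
    """Share of repeated items that received the same label both times."""
    if not first_pass or len(first_pass) != len(second_pass):
        raise ValueError("both passes must cover the same non-empty item list")
    same = sum(a == b for a, b in zip(first_pass, second_pass))
    return same / len(first_pass)


# Toy example: one annotator labels the same four posts twice.
first = ["negative", "negative", "neutral", "positive"]
second = ["negative", "neutral", "neutral", "positive"]
rate = consistency(first, second)
```
        </preformat>
        <p>In this toy example three of the four labels match, so the rate is 0.75.</p>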
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <article-title>china via bert model</article-title>
          ,
          <source>Ieee Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>138162</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          138169. [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>De Rosis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lopreite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Puliga</surname>
          </string-name>
          , M. Vainieri,
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <article-title>The early weeks of the italian covid-19 outbreak:</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <source>Policy</source>
          <volume>125</volume>
          (
          <year>2021</year>
          )
          <fpage>987</fpage>
          -
          <lpage>994</lpage>
          . [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          , Random forests,
          <source>Machine learning 45</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          . [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovy</surname>
          </string-name>
          , FEEL-IT: Emotion
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>tational Linguistics</surname>
          </string-name>
          , Online,
          <year>2021</year>
          , pp.
          <fpage>76</fpage>
          -
          <lpage>83</lpage>
          . [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Danilevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aharonov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Katsis</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>ceedings of the 1st Conference of the Asia-Pacific</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>guistics and the 10th International Joint Conference</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          Computational Linguistics
          , Suzhou, China,
          <year>2020</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          pp.
          <fpage>447</fpage>
          -
          <lpage>459</lpage>
          . [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pastor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovy</surname>
          </string-name>
          , Bench-
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>tional Linguistics</source>
          , Dublin, Ireland,
          <year>2022</year>
          , pp.
          <fpage>100</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pastor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Di Bonaventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <article-title>ers on transformers</article-title>
          ,
          <source>in: Proceedings of the 17th</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          Linguistics
          , Dubrovnik, Croatia,
          <year>2023</year>
          , pp.
          <fpage>256</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          URL: https://aclanthology.org/2023.eacl-demo.29.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          doi:10.18653/v1/2023.eacl-demo.29. [1]
          <string-name>
            <given-names>W.</given-names>
            <surname>Medhat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Korashy</surname>
          </string-name>
          , Sentiment anal-
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <source>Shams engineering journal</source>
          <volume>5</volume>
          (
          <year>2014</year>
          )
          <fpage>1093</fpage>
          -
          <lpage>1113</lpage>
          . [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <article-title>Like it or not: A sur-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>Computing Surveys (CSUR)</source>
          <volume>49</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          . [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mathur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. H.</given-names>
            <surname>Zakaria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <source>Management</source>
          <volume>59</volume>
          (
          <year>2022</year>
          )
          <fpage>103098</fpage>
          . [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salathé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Kummervold</surname>
          </string-name>
          , Covid-
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <article-title>to analyse covid-19 content on twitter</article-title>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <source>Frontiers in Artificial Intelligence</source>
          <volume>6</volume>
          (
          <year>2023</year>
          )
          <fpage>1023281</fpage>
          . [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <article-title>Tracking social media discourse about the covid-19 pandemic: De-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          set
          ,
          <source>JMIR Public Health Surveill</source>
          <volume>6</volume>
          (
          <year>2020</year>
          )
          <fpage>e19273</fpage>
          . URL: http://publichealth.jmir.org/2020/2/e19273/. doi:10.2196/19273.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Madsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chandar</surname>
          </string-name>
          ,
          <article-title>Post-hoc interpretability for neural nlp: A survey</article-title>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <source>ACM Computing Surveys</source>
          <volume>55</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          . [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kaul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Zadeh</surname>
          </string-name>
          ,
          <article-title>Monitoring the dynamics of emotions during covid-19 using twitter data</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>177</volume>
          (
          <year>2020</year>
          )
          <fpage>423</fpage>
          -
          <lpage>430</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"why should i trust you?" explaining the predictions of any classifier</article-title>
          , in:
          <source>Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Delobelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Berendt</surname>
          </string-name>
          ,
          <article-title>Measuring shifts in attitudes towards covid-19 measures in</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <source>lands Journal</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>161</fpage>
          -
          <lpage>171</lpage>
          . URL: https://www.clinjournal.org/clinj/article/view/133.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <source>the 31st International Conference on Neural Information Processing Systems</source>
          , NIPS'17, Curran Associates Inc., Red Hook, NY, USA,
          <year>2017</year>
          , pp.
          <fpage>4768</fpage>
          -
          <lpage>4777</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Chow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Covid-19 sensing: negative sentiment analysis on social media in</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <article-title>Axiomatic attribution for deep networks</article-title>
          ,
          <source>in: Proceedings of the 34th</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>