<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards a Multilingual System for Vaccine Hesitancy using a Data Mixture Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oscar Araque</string-name>
          <email>o.araque@upm.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María Felipa Ledesma-Corniel</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kyriaki Kalimeri</string-name>
          <email>kyriaki.kalimeri@isi.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>ISI Foundation</institution>
          ,
          <addr-line>Turin</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universidad Politécnica de Madrid, ETSI Telecomunicación, Intelligent Systems Group</institution>
          ,
          <addr-line>Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <permissions>
        <copyright-statement>© 2023 Copyright for this paper by its authors.</copyright-statement>
        <license>
          <license-p>Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</license-p>
        </license>
      </permissions>
      <abstract>
        <p>Understanding public narratives on contentious topics like vaccination adherence is vital for promoting cooperative behaviors. During the COVID-19 pandemic, significant polarization arose from concerns about vaccines, with misinformation and conspiracy beliefs proliferating on social media. While many studies have analyzed these narratives, the focus has largely been on English-language content. This linguistic bias limits comprehensive global insights. Our study introduces a novel multilingual approach that addresses this gap. By integrating Italian examples into a primarily English dataset, we detect vaccine-hesitant language and demonstrate the model's adaptability to diverse linguistic data. Our findings highlight the importance of incorporating varied linguistic datasets for a more holistic understanding of global narratives on vaccine hesitancy.</p>
      </abstract>
      <kwd-group>
        <kwd>vaccine hesitancy</kwd>
        <kwd>natural language processing</kwd>
        <kwd>machine learning</kwd>
        <kwd>transformer models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Automatically understanding people's narratives on controversial social issues is fundamental to efficiently addressing real concerns as they occur and to fostering collaborative, prosocial behaviours. Vaccination adherence is an exemplar case in which society witnessed a notable polarisation concerning possible adverse reactions [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        Especially during the COVID-19 pandemic, and despite vaccines being the most efficient and cost-effective intervention, the spread of misinformation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], scepticism around the scientific development of COVID-19 vaccines, and the dissemination of conspiracy beliefs [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] proliferated on social media platforms.
      </p>
      <p>
        Numerous studies have analysed user-generated text [
        <xref ref-type="bibr" rid="ref10 ref6 ref7 ref8 ref9">6, 7, 8, 9, 10</xref>
        ], almost exclusively focusing on the English language due to the availability of models and tools. Although English is widely spoken, restricting the analysis to English limits what can be learned about specific sociodemographic groups. Lenti et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], in a purely network-based approach, showed the existence of a global misinformation network, calling for a multilingual analysis to further understand the drivers of vaccine hesitancy in the various languages.
      </p>
      <p>Here, in light of these issues, we propose a novel approach for multilingual language understanding that can deal with language imbalance. More specifically, we progressively include Italian instances in a predominantly English dataset for the task of vaccine-hesitant language detection and demonstrate the ability of the model to generalise to previously unseen data. Researchers and practitioners often have access to large English datasets, while data in other languages, such as Italian, are lacking. We show that including small datasets in different languages can improve overall performance when analysing texts in several languages.</p>
      <p>CLiC-it 2023: 9th Italian Conference on Computational Linguistics. ORCID: 0000-0003-3224-0001 (O. Araque); 0000-0001-8068-5916 (K. Kalimeri). © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Data and Methods</title>
      <sec id="sec-3-1">
        <title>2.1. Data Collection</title>
        <p>
          Although several Twitter datasets were constructed to monitor the COVID-19 pandemic and are openly available to researchers, they differ in the number, timing, and language of the tweets collected, as well as in the search keywords used for collection [
          <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
          ]. Here, we opted for a large multilingual dataset (MultilingTw [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]) and an Italian dataset [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. We also performed a new data collection based on a time-invariant hashtag list, manually annotated for vaccination stance, which we share with the community.
        </p>
        <p>A. Twitter-AntiVax. This dataset was collected for this specific study and has been generated by capturing English Twitter messages ranging from December 2020 to March 2023. It aims to capture opinions and narratives expressed by anti-vaccination users, balancing between pro and anti stances. We collected the data using a variety of phrases and hashtags related to vaccination (e.g., “kill jab”, “covid jab”, “#vaccineskill”, “VaccinesAreNotTheAnswer”, “vaccineswork”, “vaccinessavelives”), manually inspecting the relevance of a sample of the obtained messages. From these hashtags, we identified a set of users (480 users in total) expressing pro- and anti-vaccine stances. Finally, we extracted the tweets generated by these users in the considered period. The dataset with the respective annotations is freely accessible at https://github.com/gsi-upm/multilingual-vaccine-hesitancy.</p>
        <p>
          B. TwitterVax dataset [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Originally, this dataset contains 9,068,389 Italian tweets on vaccines posted from 1st January 2019 to 1st June 2022. The authors annotate each captured user as anti-vaccine or pro-vaccine through network analysis. For this work, to reduce the computational load and to work with similar dataset sizes, we selected a sub-sample of approximately 14,000 tweets. We categorized each tweet as anti-vaccine or pro-vaccine according to the annotation of the user who posted it.
        </p>
        <p>
          C. Multilingual Twitter dataset (MultilingTw) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. This dataset is composed of Twitter messages in 18 languages from October 2019 to March 2021. While the original size is around 316 million messages, we select a subsample of 1,246 tweets in English and 449 in Italian, manually labelled for their vaccine stance. For consistency with the other datasets, we selected the messages labelled as pro- and anti-vaccine. These sets are used as test sets.
        </p>
        <p>Both the Twitter-AntiVax and TwitterVax datasets have been split into train and test sets, randomly sampling 20% of the instances as the test set. Some statistics for the used datasets are detailed in Table 1.</p>
        <p>Table 1. Datasets used: A. Twitter-AntiVax (ours); B. TwitterVax; C. MultilingTw (EN); C. MultilingTw (IT).</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Methods</title>
        <p>
          This work is based on a multilingual approach to vaccine hesitancy analysis. To this end, we use a DistilBERT [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] model (distilbert-base-multilingual-cased, available at https://huggingface.co/distilbert-base-multilingual-cased). This transformer model was trained on the most common languages in Wikipedia and is thus capable of generating internal representations for a variety of languages, including English and Italian. Nevertheless, it has been shown that models of this kind do not compute language-agnostic representations but rather generate representations partitioned by language [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. In practice, this implies that the instances used for training in English are not directly useful for predicting vaccine hesitancy in other languages, since the internal representations vary with the language.
        </p>
        <p>Here, our goal is to model the effect of including small sets of data in a multilingual approach. To do so, we use the Twitter-AntiVax train set as English training data and the TwitterVax train set as Italian training data. As test sets, we use the test sets of Twitter-AntiVax and TwitterVax, as well as the Multilingual Twitter dataset in both English and Italian. To modulate the number of Italian instances included in the training set, we define the parameter α, which can take values in the range [0, 1]. The instances in the training set are then composed according to the following expression:</p>
        <disp-formula><tex-math><![CDATA[\text{Train instances} = \alpha \cdot \mathrm{IT} + (1 - \alpha) \cdot \mathrm{EN}]]></tex-math></disp-formula>
        <p>where IT and EN represent the Italian and English datasets, respectively. In this way, a training set composed with α = 0 consists entirely of English instances, while a training set composed with α = 1 consists entirely of Italian instances. With α = 0.5, the training set has the same number of instances for English and Italian.</p>
        <p>Since the English and Italian datasets contain a different number of instances, the number of training instances could vary with α, which may affect the results. We control for this by producing the same number of training instances for all values of α.</p>
        <p>
          Evaluation. Finally, all the models are evaluated with the macro-averaged F-score. This allows us to account for the effect of unbalanced data. We opted for an evaluation without label propagation via retweet networks, as proposed in other studies [
          <xref ref-type="bibr" rid="ref11 ref16">16, 11</xref>
          ], since such propagation is likely to introduce uncertainty in the ground truth. Our evaluation is strictly based on manually annotated data regarding the vaccination stance.
        </p>
      </sec>
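      <p>The training-set composition described in Sect. 2.2 can be illustrated with a minimal Python sketch. It is not the code used for the experiments: the function name <monospace>compose_training_set</monospace>, the <monospace>total</monospace> and <monospace>seed</monospace> arguments, and the use of uniform random sampling without replacement are assumptions made for illustration. The sketch draws a share α of Italian instances and a share (1 − α) of English instances while keeping the overall number of training instances fixed.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def compose_training_set(italian, english, alpha, total, seed=0):
    """Return `total` instances, a share `alpha` of which are Italian.

    Hypothetical helper for illustration only: `italian` and `english`
    are lists of training instances, and sampling is uniform without
    replacement (an assumption, not a detail stated in the paper).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Fix the total size so that every value of alpha yields the same
    # number of training instances.
    n_it = round(alpha * total)
    n_en = total - n_it
    if n_it > len(italian) or n_en > len(english):
        raise ValueError("total is too large for the available data")
    rng = random.Random(seed)
    mixture = rng.sample(italian, n_it) + rng.sample(english, n_en)
    rng.shuffle(mixture)  # avoid language-ordered batches
    return mixture
```
]]></preformat>
      <p>Under these assumptions, calling the function with α = 0 returns only English instances, α = 1 returns only Italian instances, and intermediate values return mixtures of identical size, so that differences in performance are not driven by the amount of training data.</p>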
    </sec>
    <sec id="sec-4">
      <title>3. Results</title>
      <p>As described, the proposed experiment aims to study the effect of including Italian instances in an English dataset and training a multilingual learning model with the generated dataset. Figure 3 shows the macro-averaged F-scores obtained for increasing values of α. The horizontal axis shows the variation of the α parameter (see Sect. 2), and the vertical axis the performance of the model on all test sets, both English and Italian. The average curve is weighted by the number of instances of each test set so that the number of correctly classified instances is better reflected.</p>
      <p>It is worth noting a prevalent behaviour in the obtained learning curves: an elbow-shaped evolution with α in three of the four curves. Looking at the evolution of the performance on the Twitter-AntiVax test set, we see that the best performance is obtained when α = 0, that is, when all training instances are in English. As α increases, the performance decreases slowly. Nevertheless, when α changes from 0.9 to 1, a faster reduction is observed. The lowest performance corresponds to the case where there are no English instances in the training set, which negatively affects the performance in the English language. A similar behaviour is shown by the performance on the Multilingual Twitter data in English.</p>
      <p>In contrast, the performance on the Italian data progresses differently. We can see a large improvement in performance between α = 0 (no Italian training data) and α = 0.1 (10% of training instances in Italian) for the TwitterVax dataset. As more Italian data is included in the composition of the training set, the performance on this dataset increases slowly. As for the Multilingual Twitter dataset in Italian, the performance tends to increase with α.</p>
      <p>From the obtained results, we can derive a general trend: the higher the percentage of training instances in a language, the higher the performance in that language. This is to be expected, as it follows common experimental observations when training learning models. Besides, it is interesting to see that there is a large range of cases where performance in both English and Italian is kept high. In practice, this situation is observed when α ∈ [0.1, 0.9] and can be better understood by looking at the averaged curve.</p>
      <p>This behaviour may indicate the robustness of the proposed method to the proportion of the language mixture. That is, it seems that the model successfully generalizes to a different language even when only a small proportion (e.g., 10%) of its training set consists of instances of that language. This observation indicates that the multilingual model may be learning to classify Italian documents while being trained with English instances, and that adding a small proportion of Italian instances facilitates such performance.</p>
      <p>While this work is an initial attempt at describing a multilingual system trained with a mixture of data, further work should explore whether the observed behaviour is maintained with more languages. How the internal representations of the evaluated model can be used for multilingual applications has yet to be thoroughly studied.</p>
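      <p>The two quantities reported in this section, the macro-averaged F-score per test set and the average weighted by test-set size, can be sketched in plain Python as follows. This is a minimal illustration rather than the evaluation code used for the experiments; the function names <monospace>macro_f1</monospace> and <monospace>weighted_average</monospace> are hypothetical, and any labels or scores passed to them are placeholders supplied by the reader.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F-score: unweighted mean of the per-class F1 values."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def weighted_average(scores, sizes):
    """Average of per-test-set scores, weighted by the number of instances."""
    total = sum(sizes)
    return sum(score * size for score, size in zip(scores, sizes)) / total
```
]]></preformat>
      <p>Because every class contributes equally to the macro average, a model that ignores the minority stance is penalised even when that stance is rare; weighting the per-test-set scores by size then keeps the averaged curve close to the overall share of correctly handled instances.</p>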
    </sec>
    <sec id="sec-5">
      <title>4. Conclusions</title>
      <p>Here we design and evaluate a method that achieves multilingual vaccine hesitancy detection. The experimental design considers training a multilingual classification model on a mixture of English and Italian text excerpts. By progressively varying the combination of languages in the training data, we obtain a better understanding of the classification problem in the two languages. Additionally, we undertook a novel data collection effort on Twitter, manually annotating content based on vaccination stance. This curated dataset is now freely accessible to the scientific community, providing a valuable resource for further research.</p>
      <p>Notably, our findings suggest that the model can effectively generalize to a different language even when its training set contains a minimal proportion (e.g., 10%) of instances from that language. This indicates the model's robustness and adaptability in handling linguistic variations with limited data.</p>
      <p>This approach is a useful tool for researchers and practitioners who often have access to large datasets in English but limited resources in other widely spoken languages such as Italian or Spanish. The evaluation shows that composing a mixture dataset can be effective in generating a model that classifies instances in two languages. In fact, the experiments show that this mixture is flexible, maintaining consistent performance across different ratios of language presence. This consistency suggests that the mixture approach is promising.</p>
      <p>Given its language-neutral nature, our technique holds promise for broader applications across multiple languages and diverse domains. As a next step, we aim to explore various multilingual models and languages to further ascertain the scalability and adaptability of our approach.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work has been funded by the Spanish Ministry of Science and Innovation through the COGNOS project (PID2019-105484RB-I00) and by the European Union with NextGeneration EU funds. KK gratefully acknowledges the support from the Lagrange Project of the Institute for Scientific Interchange Foundation (ISI Foundation) funded by Fondazione Cassa di Risparmio di Torino (Fondazione CRT).</p>
      <p>The authors would like to acknowledge the support of Yelena Mejova, from ISI Foundation in Italy, for sharing the Multilingual Twitter dataset.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Dror</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Eisenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Taiber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. G.</given-names>
            <surname>Morozov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mizrachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zigron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srouji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sela</surname>
          </string-name>
          ,
          <article-title>Vaccine hesitancy: the next challenge in the fight against COVID-19</article-title>
          ,
          <source>European Journal of Epidemiology</source>
          <volume>35</volume>
          (
          <year>2020</year>
          )
          <fpage>775</fpage>
          -
          <lpage>779</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Betti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Francisci Morales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gauvin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mejova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Paolotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Starnini</surname>
          </string-name>
          ,
          <article-title>Detecting adherence to the recommended childhood vaccination schedule from user-generated content in a US parenting forum</article-title>
          ,
          <source>PLoS Computational Biology</source>
          <volume>17</volume>
          (
          <year>2021</year>
          )
          <elocation-id>e1008919</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mejova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <article-title>COVID-19 on Facebook ads: competing agendas around a public health crisis</article-title>
          ,
          <source>in: Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Beiró</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Urbinati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bonanomi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rosina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cattuto</surname>
          </string-name>
          ,
          <article-title>Human values and attitudes towards vaccination in social media</article-title>
          ,
          <source>in: Companion Proceedings of The 2019 World Wide Web Conference</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>248</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Beiró</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>D'Ignazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Perez Bustos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Prado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <article-title>Moral narratives around the vaccination debate on Facebook</article-title>
          ,
          <source>in: Proceedings of the ACM Web Conference 2023</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>4134</fpage>
          -
          <lpage>4141</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hayawi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shahriar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Serhani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Taleb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <article-title>Anti-vax: a novel twitter dataset for covid19 vaccine misinformation detection</article-title>
          ,
          <source>Public health 203</source>
          (
          <year>2022</year>
          )
          <fpage>23</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Nyawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tchuente</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fosso-Wamba</surname>
          </string-name>
          ,
          <article-title>Covid19 vaccine hesitancy: a social media analysis using deep learning</article-title>
          ,
          <source>Annals of Operations Research</source>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fasce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schmid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Holford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lewandowsky</surname>
          </string-name>
          ,
          <article-title>A taxonomy of anti-vaccination arguments from a systematic literature review and text modelling</article-title>
          ,
          <source>Nature Human Behaviour</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mejova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Crupi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lenti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tizzani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Paolotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panisson</surname>
          </string-name>
          ,
          <article-title>Echo chambers of vaccination hesitancy discussion on social media during covid-19 pandemic</article-title>
          ,
          <source>in: XX ISA World Congress of Sociology (June 25-July 1, 2023)</source>
          , ISA,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N. E.</given-names>
            <surname>MacDonald</surname>
          </string-name>
          ,
          <article-title>Vaccine hesitancy: Definition, scope and determinants</article-title>
          ,
          <source>Vaccine</source>
          <volume>33</volume>
          (
          <year>2015</year>
          )
          <fpage>4161</fpage>
          -
          <lpage>4164</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0264410X15005009. doi:https://doi.org/10.1016/j.vaccine.2015.04.036, WHO Recommendations Regarding Vaccine Hesitancy.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lenti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mejova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalimeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panisson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Paolotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tizzani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Starnini</surname>
          </string-name>
          ,
          <article-title>Global misinformation spillovers in the vaccination debate before and during the covid-19 pandemic: Multilingual twitter study</article-title>
          ,
          <source>JMIR Infodemiology</source>
          <volume>3</volume>
          (
          <year>2023</year>
          )
          <elocation-id>e44714</elocation-id>
          . doi:10.2196/44714.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gallemore</surname>
          </string-name>
          ,
          <article-title>An augmented multilingual twitter dataset for studying the covid-19 infodemic</article-title>
          ,
          <source>Social Network Analysis and Mining</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>102</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>V.</given-names>
            <surname>Lachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Dimitri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Di Stefano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bianchini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mocenni</surname>
          </string-name>
          ,
          <article-title>Impact of the covid 19 outbreaks on the italian twitter vaccination debat: a network based analysis</article-title>
          ,
          <source>arXiv preprint arXiv:2306.02838</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <article-title>DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter</article-title>
          ,
          <source>arXiv preprint arXiv:1910.01108</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>McCann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <article-title>BERT is not an interlingua and the bias of tokenization</article-title>
          ,
          in:
          <source>Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          ,
          <publisher-loc>Hong Kong, China</publisher-loc>
          ,
          <year>2019</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>55</lpage>
          . URL: https://aclanthology.org/D19-6106. doi:
          <pub-id pub-id-type="doi">10.18653/v1/D19-6106</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gargiulo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cafiero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Guille-Escuret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Seror</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Ward</surname>
          </string-name>
          ,
          <article-title>Asymmetric participation of defenders and critics of vaccines to debates on French-speaking Twitter</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>6599</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>