Enhancing Multilabel Classification of Anti-Vaccine Tweets with COVID-Twitter-BERT

Aritra Mandal¹

¹ A.K. Chowdhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India

Abstract
Social media platforms have revolutionized global communication, enabling billions of individuals to share their perspectives and opinions. With the worldwide rollout of COVID-19 vaccination campaigns, the classification of anti-vaccine tweets assumes significance as it offers valuable insights into people's concerns about the new vaccines. These tweets not only provide feedback but also offer a glimpse into the specific apprehensions people hold, ranging from potential side effects to concerns related to vaccine effectiveness and political influences. In this research, we employ a specialized BERT model tailored to the domain, achieving notable performance with a macro-F1 score of 0.71 and a Jaccard score of 0.72.

Keywords
BERT, anti-vaccine tweets, classification

FIRE'23: Forum for Information Retrieval Evaluation, December 15-18, 2023, India
aritramandal37@gmail.com (A. Mandal); ORCID: 0009-0008-1841-5120 (A. Mandal)
© Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

In the face of the COVID-19 pandemic, the world finds itself engaged in one of the most formidable battles in recent history. Historically, vaccines have emerged as a reliable and effective weapon against infectious diseases, conferring immunity to individuals and contributing to the global efforts to combat and eradicate deadly viruses. The rapid development and distribution of COVID-19 vaccines have exemplified the power of science and international collaboration, offering hope for a return to normalcy.

Amidst the vaccine rollout, an unprecedented dialogue has unfolded across social media platforms, with Twitter emerging as a prominent arena for discussions surrounding COVID-19 vaccines. These discussions encompass a wide spectrum of topics, ranging from the progress of vaccination campaigns, accessibility issues, and vaccine efficacy to possible side effects. Within this digital discourse, a diverse array of opinions prevails, spanning from enthusiastic support to pronounced skepticism.

Recognizing the significance of these online conversations, government entities, health organizations such as the World Health Organization (WHO), and public health experts have a vested interest in understanding public sentiment and concerns regarding the new COVID-19 vaccines. The insights drawn from these micro-blogs offer invaluable guidance for shaping future strategies to promote widespread vaccination. As such, a crucial aspect of this endeavor involves the thorough analysis and interpretation of the reasons underlying vaccine hesitancy and resistance.

2. Task

The goal is to build an effective multi-label classifier that labels a social media post (specifically, a tweet) according to the specific concern(s) towards vaccines expressed by the author of the post [1]. Note that a tweet can have more than one label (concern); e.g., a tweet expressing 3 different concerns towards vaccines will have 3 labels. We consider the following concerns towards vaccines as the labels for the classification task:
1. Unnecessary: The tweet indicates vaccines are unnecessary, or that alternative cures are better.
2. Mandatory: Against mandatory vaccination. The tweet suggests that vaccines should not be made mandatory.
3. Pharma: Against Big Pharma. The tweet indicates that the big pharmaceutical companies are just trying to earn money, or the tweet is against such companies in general because of their history.
4. Conspiracy: Deeper conspiracy. The tweet suggests some deeper conspiracy, not just that Big Pharma wants to make money (e.g., vaccines are being used to track people, COVID is a hoax).
5. Political: Political side of vaccines. The tweet expresses concerns that governments/politicians are pushing their own agenda through the vaccines.
6. Country: Country of origin. The tweet is against some vaccine because of the country where it was developed/manufactured.
7. Rushed: Untested/rushed process. The tweet expresses concerns that the vaccines have not been tested properly or that the published data are not accurate.
8. Ingredients: Vaccine ingredients/technology. The tweet expresses concerns about the ingredients present in the vaccines (e.g., fetal cells, chemicals) or the technology used (e.g., mRNA vaccines can change your DNA).
9. Side-effect: Side effects/deaths. The tweet expresses concerns about the side effects of the vaccines, including deaths caused.
10. Ineffective: Vaccine is ineffective. The tweet expresses concerns that the vaccines are not effective enough and are useless.
11. Religious: Religious reasons. The tweet is against vaccines for religious reasons.
12. None: No specific reason is stated in the tweet, or some reason other than the given ones.

3. Related Work

Owing to the exponential growth of social media platforms, content sharing on these platforms has expanded tremendously, further increasing the amount of malicious content they host [2][3]. The detection of such malicious content has therefore gained significant attention in the research community. Traditional machine learning methods such as the Naive Bayes classifier, linear classifiers, and Support Vector Machines, as well as deep neural methods such as Long Short-Term Memory networks (LSTMs) and bidirectional RNNs, have been very successful for text classification. More recent language models for natural language processing include BERT (Bidirectional Encoder Representations from Transformers) [4] and its domain-specific version CT-BERT (COVID-Twitter-BERT) [5].

3.1. BERT

BERT (Bidirectional Encoder Representations from Transformers) revolutionized natural language processing by capturing contextual relationships in both directions of a sequence. Introduced by Google in 2018, BERT employs a transformer architecture that considers the entire input context to enhance understanding and generate more accurate language representations. Its pre-training on massive datasets enables superior performance on various downstream tasks, such as question answering and sentiment analysis. BERT's bidirectional approach significantly advanced contextual embeddings, marking a pivotal milestone in the evolution of language models.

4. Datasets

The training dataset provided during the track contains 9,921 tweets extracted from [6], which contains anti-vaccine tweets from Twitter. It contains the tweets along with the tweet IDs and the classes. The dataset is the first large-scale collection of approximately 10,000 COVID-19 anti-vaccine tweets categorized into distinct anti-vaccine concerns in a multi-label setting, and the first multi-label classification dataset of its kind to offer detailed explanations for each label. Notably, it also includes class-wise summaries for all the tweets, providing a comprehensive resource for understanding and analyzing diverse anti-vaccine sentiments.

5. Preprocessing

To enhance the quality of the word embeddings generated by BERT, we pre-processed the tweets. Tweets inherently feature distinctive lexical items, such as hashtags, user mentions (@USER), and URLs (HTTP-URL). Without proper pre-processing, these elements can adversely impact the model's performance. We therefore implemented a careful data cleaning pipeline as an integral part of tweet pre-processing (a sketch is given after the list):

1. Remove URLs: URLs do not help in multilabel classification; thus, we removed them from the text with regular expressions.
2. Remove non-alphanumeric characters: We removed all non-letter characters such as brackets, colons, semi-colons, @, etc.
3. Remove mentions: We removed all mentions, which occur frequently in tweets, as they might hinder multilabel classification.
4. Remove HTML tags: We used BeautifulSoup to parse HTML and extract the text content, effectively removing any HTML tags.
5. Convert words to lower case: Tweets are written casually; by lower-casing every word we keep only a single form of each word, which aids the text analysis.
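A minimal sketch of such a cleaning pipeline in Python, mirroring the five steps above; the function name and the exact regular expressions are illustrative assumptions, not the original implementation:

```python
import re
from bs4 import BeautifulSoup

def clean_tweet(text: str) -> str:
    """Apply the cleaning steps described in Section 5 to one tweet."""
    text = BeautifulSoup(text, "html.parser").get_text()  # remove HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)                     # remove @mentions
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)           # drop non-alphanumerics
    text = re.sub(r"\s+", " ", text).strip()              # collapse extra whitespace
    return text.lower()                                   # lower-case

# Example: clean_tweet("Check <b>this</b> out @user https://t.co/xyz!")
# returns "check this out".
```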
6. Methodology

6.1. COVID-Twitter-BERT (CT-BERT)

CT-BERT [5] is a specialized transformer-based model tailored to the domain of COVID-19 discourse. Initialized from the weights of BERT-Large, it is further pre-trained on an extensive dataset of 160 million tweets centered on the coronavirus topic, posted between January 12 and April 16, 2020. To ensure privacy, Twitter usernames were replaced with a standardized text token, and emoticons were substituted with English words. The selection of CT-BERT is strategic: unlike BERT-Large, which is trained on generic content such as Wikipedia, CT-BERT aligns with the specificity of our training data, enhancing the model's relevance and performance for COVID-19-related tweet classification.

6.2. Model Summary

The model is designed for multilabel classification and utilizes the pre-trained CT-BERT [5] model. Its architecture consists of a BERT layer, followed by a dropout layer (probability 0.3), and concludes with a linear layer with 12 outputs, one per label. The model extracts contextual embeddings from the BERT layer, applies dropout for regularization, and produces multilabel predictions through the linear layer. This design effectively adapts pre-trained knowledge to the specific multilabel classification task; a sketch of the architecture and training setup is given below.

6.3. Tuning Parameters

The model was trained for 7 epochs with the Adam optimizer [7] and an initial learning rate of 1e-5. As no validation dataset was provided, we split the training data 80/20 and used the 20 percent portion as a validation set. We produced test-set predictions with the model that achieved the best validation performance.
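A minimal sketch of this architecture and training setup, assuming PyTorch and the Hugging Face transformers library; the checkpoint name, the multi-hot label encoding, and the single illustrative training step are assumptions for illustration rather than the exact original code:

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "digitalepidemiologylab/covid-twitter-bert-v2"  # assumed CT-BERT checkpoint
NUM_LABELS = 12  # one output per concern listed in Section 2

class CTBertClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.bert = AutoModel.from_pretrained(MODEL_NAME)  # CT-BERT encoder
        self.dropout = nn.Dropout(0.3)                     # dropout layer (0.3)
        self.classifier = nn.Linear(self.bert.config.hidden_size, NUM_LABELS)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]               # [CLS] token embedding
        return self.classifier(self.dropout(pooled))       # 12 logits, one per label

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = CTBertClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
loss_fn = nn.BCEWithLogitsLoss()  # independent sigmoid per label (multilabel)

# One illustrative step. Targets are multi-hot vectors: a tweet labelled both
# "Rushed" and "Side-effect" has 1s at those positions (indices 6 and 8 in the
# order of Section 2).
batch = tokenizer(["vaccines were rushed and the side effects are hidden"],
                  padding=True, truncation=True, return_tensors="pt")
targets = torch.zeros(1, NUM_LABELS)
targets[0, 6] = targets[0, 8] = 1.0
logits = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, targets)
loss.backward()
optimizer.step()
```

At inference time, each logit is passed through a sigmoid and thresholded (0.5 is the common default) to decide whether the corresponding label applies to the tweet.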
7. Evaluation

Track results are assessed using the macro-F1 score, with the Jaccard index as a tie-breaker. The outcome of our automated run is presented below. Our model achieved the top position among the submissions, with a macro-F1 score of 0.71 and a Jaccard index of 0.72, underscoring its effectiveness on the task.

Table 1
Results

Run File     | Summary                        | Macro-F1 | Jaccard
final_df.csv | Fine-tuning COVID-Twitter-BERT | 0.71     | 0.72
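For reference, both metrics can be computed with scikit-learn. The arrays below are made-up multi-hot examples, not our actual outputs, and "samples" averaging for the Jaccard index is one common multilabel convention rather than the track's documented choice:

```python
import numpy as np
from sklearn.metrics import f1_score, jaccard_score

# Hypothetical multi-hot matrices: rows are tweets, columns the 12 labels.
y_true = np.array([[0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
                   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
y_pred = np.array([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
                   [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
jaccard = jaccard_score(y_true, y_pred, average="samples", zero_division=0)
print(f"macro-F1: {macro_f1:.2f}, Jaccard: {jaccard:.2f}")
```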
8. Conclusion and Future Work

This study leverages the capabilities of COVID-Twitter-BERT, a transformer-based model pre-trained on an extensive corpus of COVID-19-related tweets. The primary objective is multilabel classification of anti-vaccine tweets, categorizing them into the distinct concerns of conspiracy, country, ineffective, ingredients, mandatory, none, pharma, political, religious, rushed, side-effect, and unnecessary.

Our empirical findings underscore the superiority of transformer-based models [8], particularly COVID-Twitter-BERT, over conventional natural language processing classifiers such as Naive Bayes, Logistic Regression, and Support Vector Machines. The enhanced performance of transformer-based models stems from their capacity to derive more expressive word embeddings, thereby yielding superior results on the designated classification task.

For further refinement, we advocate exploring data augmentation strategies to bolster the model's performance, which is particularly pertinent given the data-hungry nature of transformer-based models. Another avenue for improvement lies in adversarial training techniques, aimed at fortifying the model's robustness against diverse inputs and potential adversarial attacks. These proposed augmentations reflect a commitment to continuous enhancement of the model's classification performance.

References

[1] S. Poddar, M. Basu, K. Ghosh, S. Ghosh, Overview of the FIRE 2023 track: Artificial Intelligence on Social Media (AISoMe), in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 2023.
[2] L.-A. Cotfas, C. Delcea, I. Roxin, C. Ioanăş, D. S. Gherai, F. Tajariol, The longest month: Analyzing COVID-19 vaccination opinions dynamics from tweets in the month following the first vaccine announcement, IEEE Access 9 (2021) 33203–33223. doi:10.1109/ACCESS.2021.3059821.
[3] Z. Waseem, T. Davidson, D. Warmsley, I. Weber, Understanding abuse: A typology of abusive language detection subtasks, in: Proceedings of the First Workshop on Abusive Language Online, Association for Computational Linguistics, Vancouver, BC, Canada, 2017, pp. 78–84. URL: https://aclanthology.org/W17-3012. doi:10.18653/v1/W17-3012.
[4] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[5] M. Müller, M. Salathé, P. E. Kummervold, COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter, Frontiers in Artificial Intelligence 6 (2023) 1023281.
[6] S. Poddar, A. M. Samad, R. Mukherjee, N. Ganguly, S. Ghosh, CAVES: A dataset to facilitate explainable classification and summarization of concerns towards COVID vaccines, in: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2022, pp. 3154–3164.
[7] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, 2019. arXiv:1711.05101.
[8] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, 2023. arXiv:1706.03762.