<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Multi-label Classification of Covid-19 Vaccine Tweet</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Palvika Bansal</string-name>
          <email>palvika.bansal@thomsonreuters.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sumit Das</string-name>
          <email>sumit.das@thomsonreuters.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vikas Rai</string-name>
          <email>vikas.rai@thomsonreuters.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shalini Kumari</string-name>
          <email>shalini.kumari@thomsonreuters.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>COVID-19 Vaccine Tweets, Sentiment Analysis, Multi label Classification, BERT, Prefix-Tuning</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Thomson Reuters Lab</institution>
          ,
          <addr-line>Bangalore</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents an approach to multi-label classification of tweets expressing concerns about Covid-19 vaccines. It introduces a fine-tuned BERT-based model, customized for this task, which performs well at categorizing the specific concerns expressed in tweets. Through extensive data preprocessing, the model accommodates a wide range of concerns. Our findings have implications for public health communication, as they enable precise monitoring of public sentiment and vaccine-related concerns. This research contributes to natural language processing and demonstrates the practical application of advanced machine learning techniques to real-world challenges. It underscores the potential of AI-driven solutions in public health communication.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Vaccination plays a crucial role in mitigating the risk and transmission of a wide range of
diseases. Over the past few years, vaccination has emerged as a critical tool in combating
the COVID-19 pandemic. Moreover, large-scale vaccination efforts are essential to reduce the
prevalence of various diseases. Nonetheless, skepticism towards vaccines persists among many
individuals for a variety of reasons, including political factors and concerns about
potential vaccine side effects.</p>
      <p>
        It is imperative to acknowledge and address these diverse concerns surrounding vaccines. Social
media platforms have proven to be invaluable sources of data for gauging public sentiment and
opinions regarding vaccination. Leveraging platforms like these allows us to rapidly gather
insights from conversations and discussions about vaccines [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To facilitate this understanding,
our work used training data from a prior project, “CAVES: A dataset
designed to facilitate the transparent classification and summarization of concerns related to
COVID vaccines” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Our methodology entailed systematic experimentation with a wide spectrum of
techniques in deep learning and machine learning. We experimented with these
approaches to categorize tweets that revolve around
vaccine-related concerns. Within our experimental framework, we started with foundational models
including TF-IDF and LSTM and advanced towards more contextual models which involved
BERT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] based models. One noteworthy experiment involved the implementation of prefix
tuning [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], a lightweight fine-tuning technique applied to state-of-the-art transformer models. This
combination enabled us to extract nuanced insights from the tweets under examination,
enhancing the accuracy and depth of our classification efforts. To further capture the contextual
meaning of each tweet, we experimented with various data processing approaches, such as identifying
named entities in the tweets, expanding the tweets, and analyzing their sentiment and
keywords. We also experimented with the state-of-the-art GPT-4 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] model to identify
concerns related to the tweet by providing it with few-shot examples.
      </p>
      <p>Furthermore, our investigation was not confined to this broad spectrum of
techniques. We also fine-tuned models to accommodate
the idiosyncrasies inherent in tweet data. This approach allowed us to exploit the
characteristics of Twitter’s concise and informal language style, so that our models were
finely attuned to the subtleties of vaccine-related discourse.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Task</title>
      <p>Our primary aim is to develop an efficient multi-label classification model that can
accurately assign labels to a social media post, specifically a tweet. These labels correspond
to the specific concerns and sentiments expressed by the post’s author regarding vaccines. The
task involves not only identifying the presence of various concerns but also understanding the
nuances and context in which they are discussed, enabling a comprehensive analysis of public
sentiment and discourse surrounding vaccines on social media platforms.</p>
      <p>In this study, the classification task is centered on a set of predefined concerns
about vaccines. These concerns serve as the labels for categorizing tweets,
providing a structured framework for analyzing public discourse on
vaccine-related topics. The labels are defined as follows:</p>
      <list list-type="bullet">
        <list-item><p>Unnecessary: The tweet indicates vaccines are unnecessary, or that alternate cures are better.</p></list-item>
        <list-item><p>Mandatory (against mandatory vaccination): The tweet suggests that vaccines should not be made mandatory.</p></list-item>
        <list-item><p>Pharma (against Big Pharma): The tweet indicates that the big pharmaceutical companies are just trying to earn money, or the tweet is against such companies in general because of their history.</p></list-item>
        <list-item><p>Conspiracy (deeper conspiracy): The tweet suggests some deeper conspiracy, and not just that Big Pharma wants to make money (e.g., vaccines are being used to track people, COVID is a hoax).</p></list-item>
        <list-item><p>Political (political side of vaccines): The tweet expresses concerns that governments/politicians are pushing their own agenda through the vaccines.</p></list-item>
        <list-item><p>Country (country of origin): The tweet is against some vaccine because of the country where it was developed/manufactured.</p></list-item>
        <list-item><p>Rushed (untested/rushed process): The tweet expresses concerns that the vaccines have not been tested properly or that the published data is not accurate.</p></list-item>
        <list-item><p>Ingredients (vaccine ingredients/technology): The tweet expresses concerns about the ingredients present in the vaccines (e.g., fetal cells, chemicals) or the technology used (e.g., mRNA vaccines can change your DNA).</p></list-item>
        <list-item><p>Side-effect (side effects/deaths): The tweet expresses concerns about the side effects of the vaccines, including deaths caused.</p></list-item>
        <list-item><p>Ineffective (vaccine is ineffective): The tweet expresses concerns that the vaccines are not effective enough and are useless.</p></list-item>
        <list-item><p>Religious (religious reasons): The tweet is against vaccines because of religious reasons.</p></list-item>
        <list-item><p>None: No specific reason stated in the tweet, or some reason other than the given ones.</p></list-item>
      </list>
    </sec>
    <sec id="sec-4">
      <title>3. Related Work</title>
      <p>Users frequently turn to micro-blogging platforms such as Twitter, motivated by a diverse
range of objectives. These include expressing their viewpoints on the Coronavirus pandemic,
disseminating personal health updates to their online connections, flagging symptoms, and
sharing alerts regarding their well-being or that of acquaintances. Robust discussions take place
concerning COVID-19 vaccines and vaccination campaigns, often preceding individuals’ receipt
of their vaccine doses. The extraction of valuable insights from these textual tweets represents
a common application within the field of social computing.</p>
      <p>In text classification, traditional machine learning techniques such as the
Naive Bayes classifier, linear classifiers, and the Support Vector Machine (SVM), as well as deep learning
methods including Long Short-Term Memory (LSTM) networks and bidirectional Recurrent
Neural Networks (RNNs), have demonstrated their effectiveness.</p>
      <p>
        Recent advancements in natural language processing have given rise to notable language
models, with BERT (Bidirectional Encoder Representations from Transformers) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and its
domain-specific counterpart CT-BERT (COVID-Twitter-BERT) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] at the forefront. Additionally,
VaccineBERT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a BERT-based model specialized in classifying COVID-19 vaccine-related
tweets, has garnered attention.
      </p>
    </sec>
    <sec id="sec-5">
      <title>4. Dataset</title>
      <p>The dataset consists of 9,921 tweet records. There are
no missing values within this dataset, so the full collection is
available for analysis.</p>
      <sec id="sec-5-1">
        <title>4.1. Data Exploration</title>
        <p>In this classification task, an individual tweet
may be associated with multiple labels. It is therefore important to
examine the distribution of these labels within the dataset. This
understanding helps in categorizing and interpreting the diverse
tweets in our dataset. For the complete
results of this analysis, refer to Table 1.</p>
        <p>It is also important to examine how many labels are assigned to
each individual tweet. Across the entire dataset, 7,936
tweets carry a single label, 1,716 tweets carry two labels, and 269 tweets carry
three distinct labels. Single-label tweets therefore dominate, while a small subset
is linked to several concerns at once, which adds to the difficulty of the classification task.
We also examined the distribution of tweet lengths. For a
more comprehensive view of the length distribution, refer to Appendix Figure 1.1.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Trends in the dataset</title>
        <p>Label-Entity Mapping in Tweet Text: We mapped training data labels to
the most prevalent entity types found within the tweet text. This analysis was carried out both for
individual labels and for combinations of multiple labels. For a
breakdown of this analysis and its results, refer to Appendix Tables 1.1 and 1.2.</p>
        <p>Extraction and Parsing of URL-Embedded HTML Content in Tweet Text: We performed
a two-step analysis: extracting URLs from the tweet text and then parsing
the HTML content of these URLs. The purpose was to examine the HTML content, particularly
the headlines, associated with each URL and compare it with the tweet text. We observed that
the majority of these URLs referred to either other tweet threads or news media reports. Among
the complete list of URLs, approximately 20% of the web pages no longer existed.
In most cases, the tweet text was concise and
often a partial excerpt from the parsed URL contents. There were also instances where
the context of the tweet text contradicted the information present in the HTML content of the
URLs. We therefore concluded that incorporating this HTML content into
the tweet text would not add value and could confuse the
model.</p>
        <p>Analysis of @Mentioned Users in Tweet Text: We also analyzed the
mentions of user profiles (@user) within the tweet text. The intention was to explore whether
the profiles of mentioned users could offer supplementary information related to the type of
tweet. However, our efforts were hindered by the unavailability of
data due to restrictions imposed by the Twitter API, which prevented access to user profiles.</p>
        <p>Exploring Entity Types Within Tweet Text: We used
Hugging Face’s bertweet-tb2_wnut17-ner API to detect entities within
the tweet texts. This API is tailored to social media data: it applies
Named Entity Recognition (NER) fine-tuned for
Twitter contexts to categorize entities amid the informality, hashtags, and mentions
characteristic of tweets. However, given time constraints, our
exploration did not yield significant outcomes, and further investigation is left for future work.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Pre-processing</title>
      <p>To improve the quality of the word embeddings used in modeling, we
pre-processed the tweets. Tweets typically contain distinctive lexical elements such as hashtags,
@username mentions, URLs, RT markers, and special characters. Left unprocessed, these elements tend
to hinder the model’s performance. We therefore applied the following cleaning
steps to the dataset:</p>
      <list list-type="bullet">
        <list-item><p>Removing stop words: Commonly used words such as “the,” “and,” and “in” are removed from the text. We also removed some words specific to tweet data, such as “rt,” which marks retweets. This step reduces noise and keeps the focus on the most meaningful words and phrases in the text.</p></list-item>
        <list-item><p>Removing URLs: We initially explored using external URL content to enrich tweet meaning, but it added little to the core meaning of the tweet and distorted results. We therefore removed these web links using regular expressions.</p></list-item>
        <list-item><p>Removing username mentions: Mentions often refer to specific individuals or accounts, so removing them helps preserve privacy and reduce bias in the analysis.</p></list-item>
        <list-item><p>Converting words to lowercase: Lowercasing standardizes the text so that words with different capitalization patterns are treated as identical, which prevents discrepancies in analysis and simplifies text processing.</p></list-item>
        <list-item><p>Removing non-alphanumeric characters: We removed special symbols and punctuation marks, which often contribute little to the analysis, to focus on the core linguistic content.</p></list-item>
        <list-item><p>Tweet text expansion: For labels with less data, we used GPT-3.5 to augment tweet content for labels such as country, political, conspiracy, religious, and none, in order to provide richer context and make the tweet more relevant to its label. The aim was to assess whether text expansion can improve the model’s performance, particularly for these challenging labels. For additional details on this analysis and its outcomes, see Appendix Table 1.5.</p></list-item>
      </list>
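      <p>The cleaning steps above can be sketched roughly as follows. This is a minimal illustration and not the authors’ code: the stop-word list, the regular expressions, and the order of the steps (URLs, mentions, lowercasing, symbols, stop words) are assumptions, and the function name clean_tweet is hypothetical.</p>
      <preformat><![CDATA[
```python
import re

# Illustrative stop-word list (an assumption; the paper does not name the list used).
STOP_WORDS = {"the", "and", "in", "a", "an", "of", "to", "is", "rt"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Clean one tweet: URLs, mentions, case, symbols, then stop words."""
    text = URL_RE.sub(" ", text)        # drop web links
    text = MENTION_RE.sub(" ", text)    # drop @username mentions
    text = text.lower()                 # treat capitalization variants alike
    text = NON_ALNUM_RE.sub(" ", text)  # drop punctuation and special symbols
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_tweet("RT @user: The vaccine is RUSHED!! https://t.co/abc #covid"))
```
]]></preformat>
      <p>Lowercasing before stop-word removal lets a single lowercase stop-word list also catch tokens such as “RT” and “The”.</p>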
    </sec>
    <sec id="sec-7">
      <title>6. Methodology</title>
      <sec id="sec-7-1">
        <title>6.1. Models</title>
        <p>
          Fine Tuning DeBERTa Large: In one of our experiments we fine-tuned the DeBERTa
(Decoding-enhanced BERT with Disentangled Attention) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] “large” variant. It builds on RoBERTa [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]
with disentangled attention and enhanced mask decoder training, using half of the data used for
RoBERTa. It is a Transformer-based neural language model that aims to improve on the BERT [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and
RoBERTa models with two techniques: a disentangled attention mechanism and an enhanced
mask decoder. In the disentangled attention mechanism, each word is represented
using two vectors that encode its content and position, respectively, and the attention
weights among words are computed using disentangled matrices on their contents and relative
positions. The enhanced mask decoder replaces the output softmax layer to predict the
masked tokens for model pre-training. In addition, a new virtual adversarial training method is
used for fine-tuning to improve the model’s generalization on downstream tasks. We used a maximum
sequence length of 128 with right padding, a learning rate of 2e-5, and a batch size of 10, and
fine-tuned the model for 15 epochs. To prevent overfitting, we used early stopping on the validation
loss with a patience of 5.
        </p>
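        <p>As an illustration of early stopping on the validation loss with a patience of 5, the logic can be sketched as below. This is a framework-independent sketch under assumptions: the class EarlyStopping, the helper epochs_run, and the min_delta parameter are hypothetical names, and the training loop is replaced by a list of validation losses.</p>
        <preformat><![CDATA[
```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if self.best_loss - val_loss > self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=5, max_epochs=15):
    """Return how many epochs would run for a given sequence of validation losses."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)
```
]]></preformat>
        <p>With a patience of 5 and a 15-epoch budget, training halts five epochs after the best validation loss unless the budget runs out first.</p>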
        <p>
          Prefix Tuning of RoBERTa Large: In this experiment, we employed the RoBERTa (A Robustly
Optimized BERT Pretraining Approach) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] “large” variant, which is among the state-of-the-art
transformers in natural language processing. RoBERTa builds on the BERT [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] model
architecture using a more effective training procedure and was trained on a much larger dataset.
This variant is pre-trained on 160GB of text from BookCorpus, OpenWebText, English
Wikipedia, etc., making it adept at grasping linguistic nuances and contextual representations
of text. We chose prefix tuning [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] for RoBERTa Large because it allows us to adapt the
pre-trained model to our specific multi-label classification task without overhauling the underlying
patterns the model had previously learned. By adding a task-specific prefix to the input sequence,
prefix tuning guides the model to tailor its representations to the given task while
leveraging the extensive pre-existing knowledge encoded in the model. We used 128 virtual
tokens as the prefix and, based on the distribution
of tweet lengths, 100 tokens to encode the tweet. We used a learning rate of 1e-2 and a batch size of 8 to fine-tune the model for 15
epochs. We used the BCEWithLogitsLoss loss function to suit the multi-label classification problem.
To prevent overfitting, we used early stopping on the validation loss with a patience of
5. For this experiment, we selected a probability threshold of 0.5 and assigned to each tweet the classes whose
probability exceeded this threshold.
        </p>
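        <p>To make the loss and the thresholding step concrete, the following sketch computes binary cross-entropy from raw logits for one example and converts logits to labels with a 0.5 threshold. It is a plain-Python stand-in written for illustration, not the PyTorch BCEWithLogitsLoss implementation itself; the function names and the example label names are assumptions.</p>
        <preformat><![CDATA[
```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, label_names, threshold=0.5):
    """Keep every label whose predicted probability exceeds the threshold."""
    return [name for name, x in zip(label_names, logits) if sigmoid(x) > threshold]


if __name__ == "__main__":
    names = ["rushed", "political", "side-effect"]
    print(predict_labels([2.0, -1.0, 0.3], names))
    print(bce_with_logits([2.0, -1.0, 0.3], [1.0, 0.0, 1.0]))
```
]]></preformat>
        <p>Because each label gets its own sigmoid, a tweet can receive zero, one, or several labels, which is what distinguishes this setup from softmax-based single-label classification.</p>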
      </sec>
      <sec id="sec-7-2">
        <title>6.2. Experimental Setup</title>
        <p>Our experimental framework was designed to support robust model development and evaluation.
We started by randomly shuffling the dataset and then splitting it into an 80% training set and a
20% validation set. We pre-processed the training and validation sets using the pre-processing
steps described in Section 5. Because tweets can carry multiple labels, we applied a
multi-label binarizer to encode these labels. Additionally, to prevent
overfitting, we employed early stopping with configurable parameters. For each
experiment, we systematically varied model hyperparameters. Detailed information on these
parameters and experiment configurations can be found in Section 6.1.</p>
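        <p>The shuffle, 80/20 split, and label binarization described above can be sketched as follows. This is a dependency-free approximation of what a multi-label binarizer does (scikit-learn’s MultiLabelBinarizer is one common implementation); the function names and the random seed are assumptions, not details from the experiments.</p>
        <preformat><![CDATA[
```python
import random


def split_train_val(records, val_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into (train, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def fit_label_index(label_sets):
    """Map every label seen in the data to a fixed column position."""
    classes = sorted({label for labels in label_sets for label in labels})
    return {label: i for i, label in enumerate(classes)}


def binarize(label_sets, label_index):
    """Encode each tweet's label list as a multi-hot row of 0s and 1s."""
    rows = []
    for labels in label_sets:
        row = [0] * len(label_index)
        for label in labels:
            row[label_index[label]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    labels = [["rushed"], ["side-effect", "rushed"], ["none"]]
    index = fit_label_index(labels)
    print(index)
    print(binarize(labels, index))
```
]]></preformat>
        <p>Fitting the label index on the training labels and reusing it for validation keeps the column order identical across both sets.</p>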
      </sec>
      <sec id="sec-7-3">
        <title>6.3. Predictions</title>
        <p>For the predictions on the final test data, we fine-tuned different language-model
architectures with a multi-label text classification objective, as
described in Section 6.1. We predicted the probability score of each test tweet for all
classes. We also experimented with different probability thresholds for different
models and selected thresholds based on the Macro-F1 metric. Classes with a probability
score greater than the selected threshold were assigned as the predicted classes for that tweet.
We also applied post-processing for cases where the model predicted other class
labels along with the “none” label: in those cases we removed the “none” label and kept
the other predicted class labels as they were. Given our thresholds, there may be a few cases
where the model made no prediction at all, which favors precision. We submitted 3 prediction
files from different models containing the Tweet ID and predicted classes.</p>
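        <p>A minimal sketch of the threshold selection and the “none” post-processing is given below. The candidate threshold grid, the function names, and the convention that a class with no true or predicted positives scores an F1 of 0 are assumptions made for illustration; they are not taken from the submitted runs.</p>
        <preformat><![CDATA[
```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over multi-hot rows."""
    n_classes = len(y_true[0])
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def apply_threshold(probs, threshold):
    """Turn per-class probabilities into multi-hot predictions."""
    return [[1 if p > threshold else 0 for p in row] for row in probs]


def select_threshold(y_true, probs, candidates=(0.2, 0.3, 0.4, 0.5)):
    """Pick the candidate threshold with the best validation macro-F1."""
    return max(candidates, key=lambda t: macro_f1(y_true, apply_threshold(probs, t)))


def drop_none_if_other(labels, none_label="none"):
    """Remove the 'none' label when it co-occurs with any other label."""
    others = [label for label in labels if label != none_label]
    return others if others else list(labels)
```
]]></preformat>
        <p>For example, drop_none_if_other(["none", "rushed"]) keeps only "rushed", while a prediction consisting solely of "none" is left unchanged.</p>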
      </sec>
      <sec id="sec-7-4">
        <title>6.4. Additional Modeling Experiments</title>
        <p>In addition to the submitted models, we conducted a series of experiments with other
feature sets and model architectures. These experiments did not yield superior results
and were consequently not included in the final submission. This section describes
our exploration of these alternative approaches, offering context for our choice of
models.</p>
        <p>
          BERTweet Large: We selected the BERTweet Large model
for its specialization in processing Twitter data. This model is pre-trained on a massive
Twitter corpus, making it adept at capturing the linguistic nuances and context of
tweets. BERTweet [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is the first public large-scale language model pre-trained for English
Tweets. BERTweet is trained following the RoBERTa pre-training procedure. The corpus used to
pre-train BERTweet consists of 850M English Tweets (16B word tokens, 80GB), containing 845M
Tweets streamed from 01/2012 to 08/2019 and 5M Tweets related to the COVID-19 pandemic.
We fine-tuned the BERTweet Large model on our training dataset, optimizing
the model weights for the multi-label classification task. This step
included adjusting model parameters, learning rates, and batch sizes. We used a learning rate of
2e-5 and a batch size of 10 to fine-tune the model for 10 epochs. We used an early stopping threshold
of 0.001 to prevent overfitting. For this experiment, we selected a probability threshold
of 0.2 and assigned to each tweet the classes whose probability exceeded this threshold.
        </p>
        <p>tf-idf vectorizer with Deep Neural Network: After pre-processing the text, we used a tf-idf
vectorizer to create numerical representations of text features. We then trained a deep neural
network on these features, using dense layers together with dropout layers to limit
overfitting.</p>
        <p>LSTM with GloVe Twitter Embedding: In another experiment, we built an LSTM
model. We used the GloVe Twitter (2B tweets, 27B tokens, 1.2M vocab, uncased, 100d) embedding<sup>1</sup>
to create features. We then used a dropout layer to limit overfitting, an LSTM layer, and a
dense layer to build the multi-label classifier. We used sigmoid as the activation function at
the output layer and binary cross-entropy as the loss. This experiment gave a Macro Average F1
score of 0.296 on the validation set using a threshold of 0.2.</p>
        <p>
          Experiment with GPT-4: We experimented with GPT-4 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] to generate labels for tweets in the
validation set by giving it few-shot examples of all the class labels along with a system and a user
prompt, details of which are given in Appendix B. We used a temperature of 0, to make the output more
deterministic, and a top_p of 1.0. Analyzing the results, we found that GPT-4 most often
predicted at least 2 labels per tweet, even though most tweets in our data carry
a single label. This significantly lowered the precision of the results.
        </p>
        <p>Expanded Tweet Experiment: As mentioned in the pre-processing section, we expanded the
tweets for certain classes to improve performance on those classes. We performed prefix tuning
of the RoBERTa Large model using expanded tweets for those classes and the original tweets for the other
classes in the training set. In the validation set, we did not expand tweets when evaluating performance.
We did not see any performance improvement over prefix tuning of the RoBERTa Large model on
the original tweets.</p>
        <p><sup>1</sup> https://nlp.stanford.edu/projects/glove/</p>
        <p>Label Enhancement and Similarity Matching: In this experiment we tried to enhance the
label descriptions using the GPT-3.5 model. We then computed embeddings of the enhanced labels using the
BERTweet model. At inference time, we calculated the cosine similarity between the embeddings of the enhanced
labels and those of the tweets. With a similarity threshold of 0.8, this approach did not perform well.</p>
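        <p>The similarity-matching step can be illustrated with the sketch below. It uses small hand-written vectors in place of BERTweet embeddings, and the function names, the label names, and the choice of “greater than or equal to” for the 0.8 threshold are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_labels(tweet_vec, label_vecs, threshold=0.8):
    """Return the labels whose embedding is similar enough to the tweet embedding."""
    return [
        name
        for name, vec in label_vecs.items()
        if cosine_similarity(tweet_vec, vec) >= threshold
    ]


if __name__ == "__main__":
    # Toy 2-d vectors standing in for real sentence embeddings.
    label_vecs = {"rushed": [1.0, 0.0], "political": [0.0, 1.0]}
    print(match_labels([1.0, 0.1], label_vecs))
```
]]></preformat>
        <p>A fixed threshold of this kind assigns no label when no label embedding is close enough to the tweet.</p>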
      </sec>
    </sec>
    <sec id="sec-8">
      <title>7. Evaluation</title>
      <p>This task was evaluated using the Macro-F1 score over the 12 different classes as the metric. The results
of our submitted automated runs on the test set for this task are shown in Table 2.</p>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>Sr No.</th><th>Team_name</th><th>Model Details</th><th>macro-F1 score</th><th>Jacc score</th><th>Rank</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>Cognitive Coders</td><td>DeBERTa Large Fine-tuning</td><td/><td/><td/></tr>
            <tr><td>2</td><td>Cognitive Coders</td><td>RoBERTa Large Prefix tuning</td><td/><td/><td/></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-9">
      <title>8. Conclusion and Future Work</title>
      <p>In the final evaluation of this study, we conducted fine-tuning experiments using different
language models: DeBERTa Large and prefix tuning of RoBERTa Large. Our objective was to
explore the performance of these models on a complex dataset in which none of the
labels exhibited a direct correlation with entity, sentiment, length, or word characteristics.
Our findings revealed that transformer-based models outperformed traditional classifiers in
handling the intricacies of our dataset. This observation underscores the potential of
transformer-based architectures in addressing multifaceted classification tasks.</p>
      <p>Furthermore, we explored different data augmentation strategies, such as using large language
models (LLMs) to expand tweet text and provide additional context. The objective was to
enhance model performance, particularly for labels with limited data
points, such as religious, country, and ingredients. Increasing the dataset size for these labels may
lead to improved classification accuracy, as transformer-based models are known to benefit from
larger datasets. We also employed Hugging
Face’s bertweet-tb2_wnut17-ner API to detect entities in tweet texts. This API, specialized
for social media data, provided Named Entity Recognition (NER) suited to
Twitter’s informal language, hashtags,
and mentions. This integration could enable comprehensive analyses of label assignments,
sentiment, and tweet length, shedding light on the entity-label relationships within our
dataset. However, due to time constraints, our exploration yielded limited outcomes, and
further investigation is needed in the future.</p>
      <p>In summary, our study highlights the promising performance of transformer-based models in
tackling complex multi-label classification tasks. We recommend that future research
efforts focus on data augmentation and dataset expansion to further enhance model
effectiveness, particularly in scenarios with limited labeled data.</p>
    </sec>
    <sec id="sec-10">
      <title>A. Data Exploration and Observations</title>
      <p>In this section, we delve into an extensive exploration of our data, unveiling key insights across
various dimensions. Specifically, we scrutinize tweet length, dissect word frequency patterns
within each label category, extract entities relevant to each label, investigate the correlations
between label assignments and tweet length, and conduct a thorough analysis of original
versus expanded tweets. The subsequent subsections provide a comprehensive account of these
analyses and observations.
Michael Yeadon, a former employee of
Pfizer, said that the government rollout
of the COVID-19 vaccine is an attempt at
”mass depopulation” with booster
recipients expected to die ...
@MrStache9 Well i believe there won’t be
an election until Trudick get enough covid
vaccine into enough people to claim he did
something right...</p>
      <sec id="sec-10-1">
        <title>Expanded tweet</title>
        <p>It is important to note that the claim made
by Michael Yeadon, a former employee of
Pfizer, that the government rollout of the
COVID-19 vaccine is an attempt at ”mass
depopulation” with booster recipients
expected to die...</p>
      </sec>
      <sec id="sec-10-2">
        <title>Label</title>
        <p>conspiracy
The statement seems to suggest that the
Canadian government, led by Prime
Minister Justin Trudeau, will likely wait until political
a significant portion of the population has
been vaccinated against COVID-19...
System Prompt:
You are a helpful assistant that will help in providing the most relevant labels to a social media
post from a list of labels that express significant concern towards the vaccine.
User Prompt:
Assign most relevant labels to a social media post (particularly, a tweet) according to the specific
concern(s) towards vaccines as expressed by the author of the post.</p>
        <p>Note that a tweet can have more than one label (concern), e.g., a tweet expressing more than 1
diferent concerns towards vaccines will have more labels.</p>
        <p>We consider the following concerns towards vaccines as the labels for this classification task:
{labels with description}
tweet text: {text}
Response: list of labels separated by space
</p>
        <p>Sample of few-shot examples:
{
"role": "system",
"name": "example_user",
"content": '''@kentlivenews Let's hope Boris Johnson isn't one of those new
trainees to stick people with the vaccine. Not a good picture to
use.'''
},
{
"role": "system",
"name": "example_assistant",
"content": 'Political',
}</p>
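        <p>The prompt layout above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names LABELS, build_messages and parse_labels are hypothetical, the label descriptions are placeholders for the elided {labels with description} block, and no model call is made. The sketch only assembles the message list and maps a space-separated response back to known labels.</p>
        <preformat>
```python
from typing import Dict, List, Tuple

# Placeholder label set: the two names appear in the examples above, but the
# descriptions are illustrative assumptions, not the paper's wording.
LABELS: Dict[str, str] = {
    "political": "placeholder description of political concerns",
    "conspiracy": "placeholder description of conspiracy concerns",
}

SYSTEM_PROMPT = (
    "You are a helpful assistant that will help in providing the most relevant "
    "labels to a social media post from a list of labels that express "
    "significant concern towards the vaccine."
)

USER_TEMPLATE = (
    "Assign most relevant labels to a social media post (particularly, a tweet) "
    "according to the specific concern(s) towards vaccines as expressed by the "
    "author of the post.\n"
    "Note that a tweet can have more than one label (concern), e.g., a tweet "
    "expressing more than 1 different concerns towards vaccines will have more "
    "labels.\n"
    "We consider the following concerns towards vaccines as the labels for this "
    "classification task:\n{labels_with_description}\n"
    "tweet text: {text}\n"
    "Response: list of labels separated by space"
)


def build_messages(
    text: str, few_shot: List[Tuple[str, List[str]]]
) -> List[Dict[str, str]]:
    """Assemble system prompt, few-shot pairs, and the final user prompt."""
    labels_block = "\n".join(f"{name}: {desc}" for name, desc in LABELS.items())
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for example_text, example_labels in few_shot:
        # Few-shot pairs use the system role with example_user and
        # example_assistant names, as in the sample above.
        messages.append(
            {"role": "system", "name": "example_user", "content": example_text}
        )
        messages.append(
            {
                "role": "system",
                "name": "example_assistant",
                "content": " ".join(example_labels),
            }
        )
    user_prompt = USER_TEMPLATE.format(
        labels_with_description=labels_block, text=text
    )
    messages.append({"role": "user", "content": user_prompt})
    return messages


def parse_labels(response: str) -> List[str]:
    """Keep known labels from a space-separated response, without duplicates."""
    found: List[str] = []
    for token in response.replace(",", " ").split():
        label = token.strip("'\"[].").lower()
        if label in LABELS and label not in found:
            found.append(label)
    return found
```
        </preformat>
        <p>Lower-casing the response before matching lets a capitalised answer such as 'Political' map onto the lower-case label names, and discarding unknown tokens keeps the multi-label output restricted to the predefined label set.</p>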
      </sec>
    </sec>
    <sec id="sec-11">
      <title>C. Online Resources</title>
      <p>• GitHub</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poddar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Basu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Overview of the FIRE 2023 track: Artificial intelligence on social media (AISoMe)</article-title>
          ,
          <source>in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poddar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Samad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>CAVES: A dataset to facilitate explainable classification and summarization of concerns towards COVID vaccines</article-title>
          ,
          <year>2022</year>
          . arXiv:2204.13746.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <year>2019</year>
          . arXiv:1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>X. L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Prefix-tuning: Optimizing continuous prompts for generation</article-title>
          ,
          <year>2021</year>
          . arXiv:2101.00190.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          ,
          <article-title>GPT-4 technical report</article-title>
          ,
          <year>2023</year>
          . arXiv:2303.08774.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salathé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Kummervold</surname>
          </string-name>
          ,
          <article-title>COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter</article-title>
          ,
          <year>2020</year>
          . arXiv:2005.07503.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bithel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <article-title>VaccineBERT: BERT for COVID-19 vaccine tweet classification</article-title>
          ,
          <source>in: Working Notes of FIRE-13th Forum for Information Retrieval Evaluation</source>
          , FIRE-WN 2021,
          <year>2021</year>
          , pp.
          <fpage>1199</fpage>
          -
          <lpage>1203</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>DeBERTa: Decoding-enhanced BERT with disentangled attention</article-title>
          ,
          <year>2021</year>
          . arXiv:2006.03654.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          ,
          <year>2019</year>
          . arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. Q.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>BERTweet: A pre-trained language model for English Tweets</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>