=Paper=
{{Paper
|id=Vol-3681/T1-4
|storemode=property
|title=Multi-Label Classification of COVID-Tweets Using Large Language Models
|pdfUrl=https://ceur-ws.org/Vol-3681/T1-4.pdf
|volume=Vol-3681
|authors=Aniket Deroy,Subhankar Maity
|dblpUrl=https://dblp.org/rec/conf/fire/DeroyM23
}}
==Multi-Label Classification of COVID-Tweets Using Large Language Models==
Aniket Deroy, Subhankar Maity
IIT Kharagpur, Kharagpur, India
roydanik18@kgpian.iitkgp.ac.in (A. Deroy); subhankar.ai@kgpian.iitkgp.ac.in (S. Maity)
Forum for Information Retrieval Evaluation, December 15-18, 2023, India

Abstract

Vaccination is important to minimize the risk and spread of various diseases. In recent years, vaccination has been a key step in countering the COVID-19 pandemic. However, many people are skeptical about the use of vaccines for various reasons, including the politics involved, the potential side effects of vaccines, and so on. The goal of this task is to build an effective multi-label classifier that labels a social media post (specifically, a tweet) according to the specific concern(s) towards vaccines expressed by the author of the post. We tried three different models: (a) a supervised BERT-large-uncased model, (b) a supervised HateXplain model, and (c) a zero-shot GPT-3.5 Turbo model. The supervised BERT-large-uncased model performed best in our case. We achieved a macro-F1 score of 0.66 and a Jaccard similarity score of 0.66, and ranked sixth among the submissions. Code is available at https://github.com/anonmous1981/AISOME.

Keywords: COVID Vaccines, Multi-label Classification, Large Language Models, Prompt Engineering

1. Introduction

Vaccination, as a cornerstone of public health, plays a key role in reducing the risk and spread of various diseases. Over the years, vaccines have proven to be one of the most effective tools in combating infectious diseases, contributing significantly to global efforts to control and eradicate deadly pathogens. In recent times, their importance has been underscored by the COVID-19 pandemic, during which vaccines emerged as our most potent weapon in curbing the devastating impact of the virus (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/covid-19-vaccines). Beyond pandemic control, widespread vaccination is indispensable for preventing a spectrum of diseases, including those affecting vulnerable populations such as children, as well as annually recurring threats such as influenza.

Despite the undeniable success stories and scientific consensus surrounding vaccination, a growing segment of the population remains skeptical about its use. This phenomenon of vaccine hesitancy refers to the delay in acceptance, or outright refusal, of vaccines despite the availability of vaccination services. It is not limited to a specific demographic or geographic region, but is observed in diverse populations around the world. Vaccine hesitancy can manifest in various forms, from the outright refusal of vaccines to concerns about vaccine safety or efficacy and mistrust in the motives of health authorities and pharmaceutical companies (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9351420/).
This hesitancy can have significant consequences, including reduced vaccination coverage rates, increased vulnerability to outbreaks, and a resurgence of preventable diseases. The reasons behind vaccine hesitancy are complex and multifaceted. They often intersect with broader social issues, including the spread of misinformation, mistrust of institutions, and political polarization. To effectively address vaccine hesitancy, it is imperative to understand the underlying factors that drive it.

This track [1] is based on the CAVES dataset [2]. Here, the goal is to build an effective multi-label classifier that labels a social media post (particularly a tweet) according to the specific concern(s) towards vaccines expressed by the author of the post. We trained various classifiers, including BERT-large-uncased and HateXplain, on the AISOME training dataset, and we used GPT-3.5 Turbo in a zero-shot mode via prompting. The results clearly show that the BERT-large-uncased model trained on the AISOME training data performed best among all the models. Our team ranked 6th in this task.

2. Problem Definition

The goal is to build an effective multi-label classifier that labels a social media post (particularly a tweet) according to the specific concern(s) towards vaccines expressed by the author of the post. We consider the following concerns towards vaccines as the labels for the classification task; a tweet can carry several labels at once, and a small encoding example is sketched after the list:

• Unnecessary: The tweet implies that vaccines may be unnecessary or posits the existence of more effective alternative treatments.
• Mandatory: Opposed to compulsory vaccination. The tweet implies that vaccines should not be required by law.
• Pharma: Opposed to Big Pharma. The tweet conveys the idea that large pharmaceutical companies are primarily motivated by profit, or it expresses a general distrust of such companies due to their historical actions.
• Conspiracy: Darker conspiracy angle. The tweet hints at a more intricate conspiracy beyond profit motives, such as the idea that vaccines might be used for surveillance or that COVID is being portrayed as a hoax.
• Political: The political aspect of vaccination. The tweet raises concerns about the possibility of governments or politicians advancing their own interests through the promotion of vaccines.
• Country: Originating country. The tweet expresses opposition to a vaccine due to the nation in which it was created or produced.
• Rushed: Untested / rushed process. The tweet raises worries about the vaccines undergoing insufficient testing or questions the accuracy of the published data.
• Ingredients: The tweet raises issues regarding the components found in vaccines (e.g., fetal cells, chemicals) or the technology employed (e.g., the claim that mRNA vaccines have the potential to modify your DNA).
• Side-effect: Adverse effects and fatalities. The tweet voices concerns regarding the vaccine's side effects, including reported cases of deaths.
• Ineffective: The tweet conveys concerns that the vaccines are not sufficiently effective and serve no practical purpose.
• Religious: Religious grounds. The tweet opposes vaccines based on religious beliefs.
• None: The tweet does not provide a particular explanation or offers an explanation different from the ones provided.
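To make the multi-label setup concrete, here is a small illustrative sketch of how a single tweet maps onto a binary indicator vector over the twelve labels above. The tweet and its gold labels are invented for illustration:

```python
# Hypothetical example: one tweet, several concern labels, encoded as a
# binary indicator vector for multi-label classification.
LABELS = ["Unnecessary", "Mandatory", "Pharma", "Conspiracy", "Political",
          "Country", "Rushed", "Ingredients", "Side-effect", "Ineffective",
          "Religious", "None"]

tweet = "Big Pharma rushed this vaccine to market just to cash in."  # invented example
gold_labels = {"Pharma", "Rushed"}  # a tweet can carry several concerns at once

indicator = [1 if label in gold_labels else 0 for label in LABELS]
print(indicator)  # [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```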
3. Related Work

BERT (Bidirectional Encoder Representations from Transformers) [3] is a state-of-the-art natural language processing (NLP) model developed by Google in 2018. It has revolutionized the field of NLP with its innovative architecture and pre-training techniques. We used the BERT-large-uncased model for multi-label text classification in this work.

The HateXplain model [4] addresses the complex issue of hate speech on social media platforms. The work focuses on multiple aspects of hate speech, including bias and interpretability, which are critical components in developing effective solutions to combat this problem.

GPT-3.5 Turbo [5] is a highly advanced large language model (LLM) developed by OpenAI. It is part of the GPT-3 family of models and is known for its remarkable natural language understanding and generation capabilities. GPT-3.5 Turbo can comprehend and generate human-like text across a wide range of topics and tasks, making it a versatile tool for various applications, from chatbots and content generation to language translation, owing to the large amount of text data on which it was trained. We used GPT-3.5 Turbo through prompting in a zero-shot mode.

This track is based on the work of [2], a multi-label classification task over COVID-related tweets that aims to improve the COVID vaccination process.

4. Dataset

We received a training set of 9,921 tweets along with the corresponding labels, and a test set of 486 tweets with labels. Both came as CSV files containing the ID, tweet, and label(s) for each tweet.

5. Methodology

5.1. Method 1

We used the GPT-3.5 Turbo model in zero-shot mode. We instruct GPT-3.5 Turbo with the list of label descriptions, the task to be performed, and the list of the most important keywords corresponding to each label. We then pass each query to the model and ask it to return the corresponding labels for the multi-label classification problem. The hyperparameters are as follows: temperature = 0.7, max-tokens = 50, and stop = None. A diagrammatic representation of the model is given in Figure 1, and the prompt we use is shown in Figure 2. A minimal sketch of this zero-shot call is given at the end of this section.

Figure 1: An overview of GPT for zero-shot multi-label classification.

Figure 2: Prompt used for GPT-3.5 Turbo.

5.2. Method 2

We trained the BERT-large-uncased model on the training set of 9,921 tweets and the corresponding labels, and tested it on the 486 tweets in the test set. The embeddings from the BERT-large-uncased model were obtained and passed through a dense layer to obtain the final predictions. The dimensionality of the model's embeddings is 1,024 and the maximum input token length is 512. The model is run for 100 epochs with a batch size of 1, a threshold of 0.5, and a learning rate of 2e-5. We use the sigmoid activation function. A diagrammatic representation of the model is shown in Figure 3, and a sketch of the shared fine-tuning setup is given at the end of this section.

Figure 3: An overview of BERT for multi-label classification.

5.3. Method 3

We trained the HateXplain model on the training set of 9,921 tweets and the corresponding labels, and tested it on the 486 tweets in the test set. The embeddings from the HateXplain model were obtained and passed through a dense layer to obtain the final predictions. The dimensionality of the model's embeddings is 768 and the maximum input token length is 512. The model is run for 100 epochs with a batch size of 1, a threshold of 0.5, and a learning rate of 2e-5.
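The paper does not include the exact API call, and the full prompt is given in Figure 2. The following is a minimal sketch of the Method 1 zero-shot call, assuming the OpenAI Python client (v1 interface); the system prompt here is a placeholder standing in for the real one, and the naive substring parse of the answer is our assumption:

```python
# Minimal sketch of Method 1: zero-shot multi-label classification with
# GPT-3.5 Turbo, using the hyperparameters reported in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["unnecessary", "mandatory", "pharma", "conspiracy", "political",
          "country", "rushed", "ingredients", "side-effect", "ineffective",
          "religious", "none"]

# Placeholder: the real prompt (label descriptions plus per-label keywords)
# is the one shown in Figure 2.
SYSTEM_PROMPT = (
    "You are a multi-label classifier for concerns towards vaccines in tweets. "
    "Answer with a comma-separated subset of these labels: " + ", ".join(LABELS)
)

def classify(tweet: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.7,   # hyperparameters from Section 5.1
        max_tokens=50,
        stop=None,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Tweet: {tweet}"},
        ],
    )
    answer = response.choices[0].message.content.lower()
    # Naive substring parse of the model's comma-separated answer.
    return [label for label in LABELS if label in answer]

print(classify("They rushed this vaccine out without any real testing!"))
```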
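Methods 2 and 3 share the same architecture: encoder embeddings passed through a dense layer, with a sigmoid over the 12 labels and a 0.5 decision threshold. Below is a minimal sketch of this setup, assuming Hugging Face Transformers and PyTorch; the [CLS]-pooled head, AdamW optimizer, and binary cross-entropy loss are our assumptions, since the paper specifies only the hyperparameters:

```python
# Minimal sketch of the fine-tuning setup shared by Methods 2 and 3.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Method 3 presumably swaps in a HateXplain checkpoint, e.g. the
# "Hate-speech-CNERG" release on the Hugging Face Hub (768-d embeddings).
MODEL_NAME = "bert-large-uncased"  # 1,024-d embeddings, as in Section 5.2
NUM_LABELS = 12                    # the 12 concern labels from Section 2

class MultiLabelClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        # Dense layer on top of the [CLS] embedding, one logit per label.
        self.head = nn.Linear(self.encoder.config.hidden_size, NUM_LABELS)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = MultiLabelClassifier()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # lr from the paper
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy per label

def train_step(tweet: str, gold: list[int]):
    # Batch size 1 and max length 512, matching the reported setup.
    model.train()
    enc = tokenizer(tweet, truncation=True, max_length=512, return_tensors="pt")
    logits = model(enc["input_ids"], enc["attention_mask"])
    loss = loss_fn(logits, torch.tensor([gold], dtype=torch.float))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

def predict(tweet: str) -> list[int]:
    model.eval()
    enc = tokenizer(tweet, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        probs = torch.sigmoid(model(enc["input_ids"], enc["attention_mask"]))
    return (probs[0] > 0.5).int().tolist()  # 0.5 threshold as in the paper
```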
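The track evaluates runs by macro-F1 and Jaccard similarity (Section 6). Both can be computed from binary indicator matrices with scikit-learn, as in the sketch below; whether the track used samples-averaged Jaccard is an assumption on our part:

```python
# Sketch of the two reported metrics, assuming y_true / y_pred are binary
# indicator matrices of shape (num_tweets, 12), one column per concern label.
import numpy as np
from sklearn.metrics import f1_score, jaccard_score

y_true = np.array([[0, 1, 0], [1, 0, 1]])  # toy ground truth (3 labels shown)
y_pred = np.array([[0, 1, 1], [1, 0, 1]])  # toy predictions

macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
jaccard = jaccard_score(y_true, y_pred, average="samples", zero_division=0)
print(f"Macro-F1: {macro_f1:.2f}, Jaccard similarity: {jaccard:.2f}")
# -> Macro-F1: 0.89, Jaccard similarity: 0.75
```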
6. Results

We observe that BERT-large-uncased gives the best results when trained on the AISOME training dataset of 9,921 tweets along with their labels. Table 1 shows the results of all models on the AISOME test dataset in terms of macro-F1 and Jaccard similarity.

Table 1: Results of all methods on the AISOME test dataset for macro-F1 and Jaccard similarity.

Team_ID      Summary of Methodology          Macro-F1   Jaccard Similarity   Rank   Run File
TextTitans   BERT-large-uncased (Method 2)   0.66       0.66                 6      text_titans_social_media2.csv
TextTitans   HateXplain (Method 3)           0.54       0.57                 21     text_titans_hate_explain2.csv
TextTitans   Zero-shot GPT-3.5 (Method 1)    0.53       0.44                 24     text_titans14.csv

7. Conclusion and Future Work

The task focused on building a multi-label classifier that predicts the nature of a tweet related to COVID or other disease-causing viruses. We tried language models such as BERT-large-uncased and HateXplain, trained on the AISOME training dataset, as well as the GPT-3.5 Turbo model in a zero-shot setting using prompt engineering. The results show that BERT-large-uncased (Method 2) provided the best results. Future work will focus on increasing the amount of training data used to train the models, as well as on trying several other large language models for multi-label classification.

References

[1] S. Poddar, M. Basu, K. Ghosh, S. Ghosh, Overview of the FIRE 2023 track: Artificial Intelligence on Social Media (AISOME), in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 2023.

[2] S. Poddar, A. M. Samad, R. Mukherjee, N. Ganguly, S. Ghosh, CAVES: A dataset to facilitate explainable classification and summarization of concerns towards COVID vaccines, in: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2022, pp. 3154-3164.

[3] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).

[4] B. Mathew, P. Saha, S. M. Yimam, C. Biemann, P. Goyal, A. Mukherjee, HateXplain: A benchmark dataset for explainable hate speech detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 2021, pp. 14867-14875.

[5] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, in: H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, H. Lin (Eds.), Advances in Neural Information Processing Systems, volume 33, Curran Associates, Inc., 2020, pp. 1877-1901. URL: https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.