=Paper=
{{Paper
|id=Vol-3181/paper68
|storemode=property
|title=Detecting Fake News Conspiracies with Multitask and Prompt-Based Learning
|pdfUrl=https://ceur-ws.org/Vol-3181/paper68.pdf
|volume=Vol-3181
|authors=Cheikh Brahim El Vaigh,Thomas Girault,Cyrielle Mallart,Duc Hau Nguyen
|dblpUrl=https://dblp.org/rec/conf/mediaeval/VaighGMN21
}}
==Detecting Fake News Conspiracies with Multitask and Prompt-Based Learning==
Detecting Fake News Conspiracies with Multitask and Prompt-Based Learning

Cheikh Brahim El Vaigh*1, Thomas Girault*, Cyrielle Mallart1, Duc Hau Nguyen1
cheikh-brahim.el-vaigh@inria.fr, thomas@girault.fr, cyrielle.mallart@inria.fr, duc-hau.nguyen@irisa.fr
1 Univ. Rennes, INRIA, CNRS, IRISA, France
* These authors contributed equally to this work.

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'21, December 13-15 2021, Online.

ABSTRACT
In this paper, we present our participation in the MediaEval 2021 challenge on fake news detection in coronavirus-related tweets. The challenge consists of three subtasks that can be cast as multi-label classification problems, which we solve with transformer-based models. We show that each task can be solved independently with multiple single-task models, or jointly with a single multitask model. Moreover, we propose a prompt-based model that is fine-tuned to generate classifications from a pre-trained DistilGPT-2 model. Our experimental results show that the multitask model is the best at solving the three tasks.

1 INTRODUCTION
With the worldwide spread of the Sars-Cov-2 virus, also known as Coronavirus, over the last few years, fear and concern have grown. While traditional media often rely on scientifically vetted sources to bring information to concerned readers or viewers, social media are not subject to this obligation to fact-check. Therefore, a plethora of messages of various degrees of truthfulness has emerged across social media platforms such as Twitter. This network has been used as a soapbox for a multitude of extreme political theories revolving around the Coronavirus epidemic, as well as for the debunking of said conspiracies, making it harder to untangle the conspiracy theories from the real facts.
This paper describes the systems that we developed for the MediaEval 2021 Fake News detection challenge (source code available online: https://github.com/CMallart/FakeNewsMediaeval2021). The main objective of the task is to classify tweets according to whether they relay conspiracy theories, and which specific conspiracy is evoked. Several subtasks contribute to refining the task [7]: task 1 aims at deciding whether a tweet promotes a conspiracy, mentions a coronavirus-related conspiracy, or is unrelated; task 2 aims at classifying the topic of a tweet; and task 3 combines the two previous labels into a multiclass problem, with both the relevance of the tweet and its subject to infer. For each subtask, data sets were provided by the organizers of the challenge [8].
For each of the three tasks, we propose a classification solution relying on the fine-tuning of transformer-based models, e.g., [1, 3, 5, 6]. We also propose a prompt-based learning approach for the first task, relying on the pretrained DistilGPT-2 [9] model. Here, inferring the label of a tweet is treated as a text generation task, with the entire tweet to classify used as a prompt. Additionally, since the three tasks are related, we propose a multitask approach that learns on all three sets of labels at once and later allows filtering the results for a specific subtask. This solution yields better results than learning the three subtasks separately.

2 APPROACH
We describe hereafter the different models we devised to detect conspiracy theories in tweets. We first present the separated-task learning framework in Sec. 2.1, then the multitask setup, where all three tasks are performed at once, in Sec. 2.2, and finally the prompt-based model in Sec. 2.3.

2.1 Separated multilabel models for each task
The tasks can be solved with a multi-label model that classifies a tweet as a vector of independent probabilities, one per label, thanks to a sigmoid activation function. The labels of task 2 are already encoded as a binary matrix, whereas for tasks 1 and 3 the original categorical labels have been converted to binary targets with one-hot encoding.
For each task, a separate instance of several BERT-based models is fine-tuned. We tested bert-tiny [1], vaccinating-covid-tweets (https://huggingface.co/ans/vaccinating-covid-tweets), a model based on BERTweet [6] fine-tuned on Covid-related tweets, and toxic-bert [5]. The first model is used as a baseline to compare the approaches, as it is a smaller version of BERT and requires little time to fine-tune. The BERTweet-based model has already been trained on tweet-formatted documents, and may therefore learn subtler aspects of tweets. Finally, toxic-bert was chosen because it has been trained on toxic (hateful, obscene, threatening, etc.) tweets, and may therefore pick up on the fear-mongering language used by conspiracy theorists.

2.2 Multitask model
The multitask model uses the same backbone as the separated-task approach. The sets of labels for tasks 1, 2 and 3 are concatenated into a single set of labels. The idea behind this multitask approach is to learn one general model that can perform all the tasks, taking advantage of the relations between them, e.g., the existence of a conspiracy for task 1 or the existence of a particular conspiracy theory for task 3 (a fine-grained version of tasks 1 and 2). Thus, by properly performing task 3, we expect the multitask model to do better on tasks 1 and 2, which are more general than task 3.
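To make the label handling of Sections 2.1 and 2.2 concrete, the following is a minimal sketch (not the authors' code; the task and label names are hypothetical stand-ins for the challenge's actual label sets) of one-hot encoding the categorical labels of tasks 1 and 3 and concatenating the three per-task targets into a single multitask target:

```python
import math

# Hypothetical label sets, stand-ins for the challenge's actual labels.
TASK_LABELS = {
    "task1": ["promotes", "discusses", "unrelated"],          # categorical -> one-hot
    "task2": ["suppressed_cures", "antivax", "fake_virus"],   # already multi-label
    "task3": ["promotes/antivax", "discusses/fake_virus"],    # fine-grained classes
}

def sigmoid(x):
    """Independent per-label probability, as produced by the classifiers."""
    return 1.0 / (1.0 + math.exp(-x))

def one_hot(label, labels):
    """Binary target for a single categorical label (tasks 1 and 3)."""
    return [1.0 if l == label else 0.0 for l in labels]

def multitask_target(t1_label, t2_labels, t3_label):
    """Concatenate the three per-task binary targets into one vector (Sec. 2.2)."""
    return (one_hot(t1_label, TASK_LABELS["task1"])
            + [1.0 if l in t2_labels else 0.0 for l in TASK_LABELS["task2"]]
            + one_hot(t3_label, TASK_LABELS["task3"]))

def split_by_task(prediction):
    """Filter a concatenated multitask output back into per-task slices."""
    out, start = {}, 0
    for task, labels in TASK_LABELS.items():
        out[task] = prediction[start:start + len(labels)]
        start += len(labels)
    return out
```

At inference time, `split_by_task` recovers the per-task scores from the concatenated sigmoid outputs, which is how the multitask model's predictions can be filtered for a specific subtask.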
2.3 Prompt-based model
In the prompt-based learning approach, samples are turned into templates containing the text of the tweet, a label for the tweet, and a binary classification stating whether the tweet is related to that label, as in the following:

Tweet : Media succeeded in creating this Covid 19 hoax...
Label : Promotes/Supports Conspiracy
Classification : true

As shown in this template, we formulated the problem as a binary classification task. For each tweet, one template is created for each possible label. The language model is then trained to output the final word, chosen from the list ["false", "unlinked", "unrelated", "true", "related", "linked"]. This word should be consistent with the preceding prompt (the text and the label), so that the model correctly learns which type of tweet is associated with each label. At inference time, three templates are again created for each tweet: the text of the tweet followed by each of the three possible labels. The chosen label is the one for which the model assigns the highest probability to "true", "related" or "linked" as the following word.
This prompt-based model was implemented with the OpenPrompt library [4] and trained for 15 epochs. Due to the lack of time and GPU resources, the prompt-based model was only trained on task 1, but it could easily be extended to the multitask setting.

3 RESULTS AND ANALYSIS
Table 1 shows the results on task-1 of the different fine-tuned pre-trained models, as well as the prompt-based approach. The best scores (MCC, macro-F1 and micro-F1) are obtained with the covid-tweets model, which has clearly been optimized on a corpus adapted to our task. toxic-bert reached comparable results, but it appears that pretraining on toxic language does not transfer well to conspiracy language. Surprisingly, smaller models such as bert-small and bert-tiny achieved quite competitive results.
Prompt-based learning with DistilGPT-2 does not outperform traditional fine-tuning on task 1. However, we would expect this approach to benefit from an extension to multitask learning, where the labels share semantic properties and the number of examples per label is low.
As expected, the multitask approach outperforms the individual models on all three tasks, as displayed in Table 2.

Table 2: Comparing the multitask and separated-task models on the dev set and the test set (official MCC), for the covid-tweets model

Approach   | Task | MCC-test | MCC-dev | micro-F1 | macro-F1
sep. tasks | 1    | 0.60     | 0.58    | 0.73     | 0.71
sep. tasks | 2    | 0.72     | 0.72    | 0.75     | 0.76
sep. tasks | 3    | 0.66     | 0.68    | 0.94     | 0.67
multitask  | 1    | 0.63     | 0.70    | 0.81     | 0.79
multitask  | 2    | 0.73     | 0.75    | 0.78     | 0.82
multitask  | 3    | 0.68     | 0.77    | 0.95     | 0.75

Table 3: Detailed results on tasks 2 and 3 on the test set (MCC), for the multitask approach

Label                             | Task 2 | Task 3
Suppressed cures                  | 0.72   | 0.70
Behaviour and Mind Control        | 0.80   | 0.72
Antivax                           | 0.68   | 0.62
Fake virus                        | 0.65   | 0.64
Intentional Pandemic              | 0.50   | 0.51
Harmful Radiation/Influence       | 0.80   | 0.73
Population reduction              | 0.83   | 0.75
New World Order                   | 0.86   | 0.80
Satanism                          | 0.71   | 0.66
Official Run Score (official MCC) | 0.73   | 0.68

4 DISCUSSION AND OUTLOOK
In this work, we experimented with transformer-based models to detect conspiracy theories in tweets. The three tasks were solved with multi-label classifiers relying on pretrained models. We showed that it is better to train the models jointly on multiple tasks than independently. Meanwhile, the official MCC scores, while good, still show that there is ample room for progress, especially for task 1, which has only three labels. The other tasks are more challenging due to the diversity of labels and the small size of the dataset. The prompt-based model shows promising results for task 1, but due to the lack of time and resources we chose to focus on the other experiments.
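As an aside, the label-selection rule of the prompt-based model (Sec. 2.3) can be sketched as follows. This is an illustration rather than the OpenPrompt implementation: `lm_next_word_probs` stands in for the fine-tuned DistilGPT-2, and the label names other than "Promotes/Supports Conspiracy" are hypothetical.

```python
# Sketch of the prompt-based label selection described in Sec. 2.3.
POSITIVE_WORDS = {"true", "related", "linked"}

# Hypothetical task-1 label names ("Promotes/Supports Conspiracy" appears
# in the paper's example template; the other two are stand-ins).
TASK1_LABELS = [
    "Promotes/Supports Conspiracy",
    "Discusses Conspiracy",
    "Non-Conspiracy",
]

def make_template(tweet, label):
    """One template per candidate label, mirroring the training format."""
    return f"Tweet : {tweet}\nLabel : {label}\nClassification :"

def classify(tweet, lm_next_word_probs, labels=TASK1_LABELS):
    """Pick the label whose template gives the highest probability that the
    next generated word is positive ("true", "related" or "linked")."""
    def positive_score(label):
        probs = lm_next_word_probs(make_template(tweet, label))
        return sum(probs.get(w, 0.0) for w in POSITIVE_WORDS)
    return max(labels, key=positive_score)
```

In the actual system, `lm_next_word_probs` would be the language model's next-token distribution over the six verbalizer words, obtained through OpenPrompt.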
As shown in [2], we also tried to generate fake training samples with GPT-2, but we were not able to use them due to the lack of annotations. In future work, we plan to use a larger generative GPT-2 model for prompt-based training, to apply the prompt model to all three tasks, and to try a multitask prompt-based model, which would combine two of our promising approaches in order to outperform the proposed multitask model.

The detailed results for each label are given in Table 3.

Table 1: Comparing different pre-trained models on task-1 in the separated-tasks setup

Model                | MCC  | micro-F1 | macro-F1
covid-tweets         | 0.58 | 0.73     | 0.71
toxic-bert           | 0.56 | 0.73     | 0.71
bert-small           | 0.55 | 0.73     | 0.71
bert-tiny            | 0.48 | 0.68     | 0.66
distilGPT-2 - prompt | 0.45 | 0.65     | 0.63

REFERENCES
[1] Prajjwal Bhargava, Aleksandr Drozd, and Anna Rogers. 2021. Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics. arXiv:cs.CL/2110.01518.
[2] Vincent Claveau. 2020. Detecting Fake News in Tweets from Text and Propagation Graph: IRISA's Participation to the FakeNews Task at MediaEval 2020. In Working Notes Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020 (CEUR Workshop Proceedings), Steven Hicks et al. (Eds.), Vol. 2882. CEUR-WS.org. http://ceur-ws.org/Vol-2882/paper63.pdf
[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:cs.CL/1810.04805.
[4] Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, and Maosong Sun. 2021. OpenPrompt: An Open-source Framework for Prompt-learning. arXiv preprint arXiv:2111.01998.
[5] Laura Hanu and Unitary team. 2020. Detoxify. GitHub. https://github.com/unitaryai/detoxify
[6] Dat Quoc Nguyen, Thanh Vu, and Anh Tuan Nguyen. 2020. BERTweet: A pre-trained language model for English Tweets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 9-14.
[7] Konstantin Pogorelov, Daniel Thilo Schroeder, Stefan Brenner, and Johannes Langguth. 2021. FakeNews: Corona Virus and Conspiracies Multimedia Analysis Task at MediaEval 2021. In Proceedings of the MediaEval 2021 Workshop, Online, 13-15 December 2021.
[8] Konstantin Pogorelov, Daniel Thilo Schroeder, Petra Filkuková, Stefan Brenner, and Johannes Langguth. 2021. WICO Text: A Labeled Dataset of Conspiracy Theory and 5G-Corona Misinformation Tweets. In Proceedings of the 2021 Workshop on Open Challenges in Online Social Networks. 21-25.
[9] Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. CoRR abs/1910.01108. http://arxiv.org/abs/1910.01108
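For reference, the Matthews correlation coefficient (MCC) reported throughout the tables reduces, in the binary case, to the formula below; this is an illustrative sketch only (the challenge's multiclass MCC is the standard generalisation computed over the full confusion matrix).

```python
import math

def mcc(y_true, y_pred):
    """Binary Matthews correlation coefficient from two 0/1 label sequences.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    with the convention that a zero denominator yields 0.0.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike accuracy, MCC stays informative on the imbalanced label distributions of tasks 2 and 3, which is why it was chosen as the official metric.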