Fake News Detection in Social Media Using Graph Neural Networks and NLP Techniques: A COVID-19 Use-Case

Abdullah Hamid1*, Nasrullah Sheikh2*, Naina Said1*, Kashif Ahmad3*, Asma Gul4, Laiq Hasan1, Ala Al-Fuqaha3
1 DCSE, University of Engineering and Technology, Peshawar, Pakistan
2 IBM Research - Almaden
3 Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
4 Department of Statistics, Shaheed Benazir Bhutto Women University, Peshawar, Pakistan
{kahmad,aalfuqaha}@hbku.edu.qa, nasrullah.sheikh@ibm.com, {nainasaid,laiqhasan,Abdullahhamid}@uetpeshawar.edu.pk

Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'20, December 14-15 2020, Online

ABSTRACT
This paper presents our solutions for the MediaEval 2020 task FakeNews: Corona Virus and 5G Conspiracy Multimedia Twitter-Data-Based Analysis. The task aims to analyze tweets related to COVID-19 and 5G conspiracy theories in order to detect misinformation spreaders. It is composed of two sub-tasks, namely (i) text-based and (ii) structure-based fake news detection. For the first sub-task, we propose six different solutions relying on Bag of Words (BoW) and BERT embeddings. Three of the methods address a binary classification task, differentiating 5G conspiracy tweets from the rest of the COVID-19 related tweets, while the other three treat the task as a ternary classification problem. In the ternary classification task, our BoW and BERT based methods obtained F1-scores of 0.606 and 0.566 on the development set, respectively. In the binary classification task, the BoW and BERT based solutions obtained average F1-scores of 0.666 and 0.693, respectively. For structure-based fake news detection, we rely on Graph Neural Networks (GNNs), achieving an average AUC ROC of 0.95 on the development set.

1 INTRODUCTION
Social media now plays a part in many aspects of modern life. In news dissemination and information sharing, for instance, social media outlets such as Twitter, Facebook, and Instagram have proved very effective [1, 6, 7, 9]. However, they also bring several challenges, such as collecting information from several sources and detecting and filtering misinformation [4, 5, 11]. Like other major events and pandemics, COVID-19, one of the deadliest pandemics in history, has been a subject of discussion on social media since its emergence. Unsurprisingly, a lot of misinformation about the pandemic circulates over social networks. In order to identify misinformation spreaders and filter fake news about COVID-19 and 5G conspiracies, a task named "FakeNews: Corona Virus and 5G Conspiracy Multimedia Twitter-Data-Based Analysis" has been proposed in the MediaEval 2020 benchmark competition [8].

This paper provides a detailed description of the methods proposed by team DCSE_UETP for the fake news detection task. The task consists of two parts, namely (i) text-based misinformation detection (TMD), and (ii) structure-based misinformation detection (SMD). The first task (TMD) is based on textual analysis of COVID-19 related information shared on Twitter between January 2020 and 15 July 2020, and aims to detect different types of conspiracy theories about COVID-19 and its vaccines, such as the claim that "5G weakens the immune system and thus caused the current corona-virus pandemic" [8]. In the SMD task, the participants are provided with a set of graphs, each representing a sub-graph of Twitter that corresponds to a single tweet, where the vertices of the graphs represent accounts. As in TMD, the participants need to detect and differentiate between 5G and other COVID-19 conspiracy theories.

2 PROPOSED APPROACH
2.1 Methodology for TMD Task
For the text-based analysis, we employed two different methods: (i) a Bag of Words (BoW) solution, and (ii) a BERT model-based solution [3]. Before describing the methods, it should be noted that the dataset provided for the text-based analysis is not balanced: one of the classes, namely non-conspiracy, contains a very high number of samples, while the others contain relatively few. In total, the majority class contains 4412 samples, while the other two classes, namely 5G conspiracies and other conspiracies, contain 1263 and 785 samples, respectively. In order to balance the dataset, we rely on an ensemble of different re-sampled datasets, where N models are trained by dividing the class with the higher number of samples into N different parts, as illustrated in Figure 1. After training the N models, their results are combined using two different late fusion methods, namely majority voting and summation of the posterior probabilities. In majority voting, since we have four models, ties are possible; in the case of a tie, we use the accumulated probabilities/scores to assign a label to the test sample.

Before applying BoW and BERT, the text is cleaned by removing punctuation marks (such as commas and full stops), emojis, URLs, and stop words. Once the text is pre-processed, we proceed with tokenization and the creation of the BoW vocabulary, followed by the generation of a feature vector for each sentence. A Naive Bayes classifier is then trained on the extracted features. For the BERT-based solution, a logistic regression model is trained on word embeddings generated via BERT.
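The balancing and late fusion steps of Section 2.1 can be sketched as follows. This is only a minimal illustration in plain Python, not the implementation used for the submitted runs: the function names (`split_majority`, `sum_fusion`, `majority_vote_fusion`) are hypothetical, the interleaved split is one arbitrary way of dividing the majority class, and resolving a tie by summing the scores of the tied classes only is an assumption, since the text does not specify which classes enter the tie-break.

```python
from collections import Counter
from typing import List, Sequence


def split_majority(indices: Sequence[int], n_parts: int) -> List[List[int]]:
    """Divide the majority-class sample indices into n_parts near-equal parts.

    Each part would be merged with all minority-class samples to form the
    training set of one model (C1 + C2 + C_3_i in Figure 1).
    """
    return [list(indices[i::n_parts]) for i in range(n_parts)]


def sum_fusion(probs: Sequence[Sequence[float]]) -> int:
    """Late fusion by summing the posteriors of all models per class.

    probs[m][c] is the posterior that model m assigns to class c.
    """
    n_classes = len(probs[0])
    totals = [sum(p[c] for p in probs) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


def majority_vote_fusion(probs: Sequence[Sequence[float]]) -> int:
    """Late fusion by majority voting over the per-model predictions.

    Assumption: a tie is resolved by the accumulated scores of the tied
    classes only.
    """
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in probs)
    top = max(votes.values())
    tied = [c for c, v in votes.items() if v == top]
    if len(tied) == 1:
        return tied[0]
    totals = {c: sum(p[c] for p in probs) for c in tied}
    return max(tied, key=totals.__getitem__)


# Usage: four parts of the 4412 majority-class samples, 1103 samples each.
parts = split_majority(list(range(4412)), 4)
```

With four models, a 2-2 split of the votes is possible, which is why the tie-break on accumulated scores is needed; the two fusion rules can also disagree when a single model is very confident about a class that the other three narrowly reject.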
Figure 1: An illustration of the data balancing technique used in this work. (Diagram: the training samples of classes C1 and C2 are combined with one part C_3_1, C_3_2, ..., C_3_n of the majority class to train each model; the model outputs are combined by late fusion into the predicted label.)

2.2 Methodology for SMD Task
Graph representation learning using Graph Neural Networks (GNNs) has been shown to be effective in various domains, such as social networks, biological networks, and financial networks. GNNs aggregate the neighborhood representation within k hops and then apply a pooling operation such as SUM, MEAN, or MAX to obtain the final representation of a node. Furthermore, GNNs can be used to learn the representation of simple graph structures [2, 10, 12], which can then be used to classify the graphs. For graph classification, these methods learn the representation of the nodes, followed by a graph READOUT step, which aggregates the node features obtained after the final iteration of the GNN.

We model this problem as a graph classification task. Following Xu et al. [10], we train our model on three classes of graphs, namely 5G conspiracy, non-conspiracy, and other-conspiracy, and learn the representation of the graphs.

3 RESULTS AND ANALYSIS
3.1 Evaluation Metric
For the evaluation of the proposed methods, we used two different metrics, namely (i) Micro F1-Score, and (ii) the AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristics) curve. AUC ROC is the official evaluation metric of the task, and all test results are reported in terms of AUC ROC. The F1-score is used for the evaluation of the methods on the development set.

3.2 Runs Description in TMD Task
For TMD, we submitted six different runs relying on two approaches, namely BERT and BoW, under two late fusion schemes. Three of the runs are based on binary classification, while the other three treat the task as a ternary classification problem. Note that the fusion schemes are used to combine the scores/outputs of the four individual models trained as a result of the data balancing method described earlier.

The first three runs are based on the ternary classification task, where Run 1 and Run 2 are based on BoW with majority voting and with accumulated classification scores of the individual models, respectively. The third and final ternary run is based on BERT features, where a logistic regression model is trained on word embeddings generated by BERT. As can be seen in Table 1a, overall, better results are obtained with the BoW approach under the majority voting scheme.

The last three runs are based on the binary classification task, where the first two (i.e., Run 4 and Run 5) are based on BoW with the majority voting and accumulated classification score fusion methods, respectively, while the final one (i.e., Run 6) is based on BERT with the accumulated score fusion scheme. As expected, the performance of all methods is higher on the binary classification task than on the ternary classification task. A similar trend is observed on the test set, where the overall best results are obtained with BoW under the majority voting scheme.

Table 1: Evaluation of our proposed approaches for (a) TMD and (b) SMD tasks on both development and test sets. In TMD, the official metric, AUC ROC, is used for the test set, while the results on the development set are reported in terms of Micro F1-Score. SMD is evaluated in terms of AUC ROC.

(a) TMD
Run      Dev. Set (F1-Score)   Test Set (AUC ROC)
Run 1    0.6066                0.3815
Run 2    0.5666                0.3588
Run 3    0.5333                0.3002
Run 4    0.6933                0.3944
Run 5    0.6666                0.3803
Run 6    0.6533                0.3447

(b) SMD
Run      Dev. Set (AUC ROC)
Run 1    0.9500

3.3 Runs Description in SMD Task
For training the model, we divide the dataset into train/validation/test splits (80/10/10). We used grid search to obtain the best hyperparameters. The model has four MLP layers, and uses MAX and MEAN operations for neighbor pooling and graph pooling, respectively. The model is trained for 1000 epochs with a learning rate of 0.01, and a dropout of 0.3 is applied to the output of the final layer. The final embedding size is 128. We evaluate our model in terms of AUC ROC, and the result on the development set is given in Table 1(b). The results show that the model has the discriminative power to classify the graph structures. Furthermore, they suggest that information diffuses in a pattern that depends on the type of information being spread.

4 CONCLUSIONS AND FUTURE WORK
The challenge is composed of two tasks, one aiming to analyze and detect COVID-19 related fake news using the text of tweets, and the other aiming to analyze network structure for the possible detection of fake news. For the first task, we mainly relied on two state-of-the-art methods, namely BoW and BERT embeddings, under different fusion schemes. Overall, better results are obtained with BoW under the majority voting scheme. For the SMD task, we rely on GNNs to differentiate among different conspiracy theories on COVID-19. In the current implementation, textual and structural information are used independently; in the future, we aim to enrich the structural information with the textual information for better detection of fake news.

REFERENCES
[1] Kashif Ahmad, Konstantin Pogorelov, Michael Riegler, Nicola Conci, and Pal Halvorsen. 2019. Social media and satellites: Disaster event detection, linking and summarization. Multimedia Tools and Applications 78, 3 (2019), 2837–2875.
[2] Cătălina Cangea, Petar Veličković, Nikola Jovanović, Thomas Kipf, and Pietro Liò. 2018. Towards Sparse Hierarchical Graph Classifiers. (2018). arXiv:stat.ML/1811.01287
[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[4] Siva Charan Reddy Gangireddy, Cheng Long, and Tanmoy Chakraborty. 2020. Unsupervised Fake News Detection: A Graph-based Approach. In Proceedings of the 31st ACM Conference on Hypertext and Social Media. 75–83.
[5] Yi Han, Shanika Karunasekera, and Christopher Leckie. 2020. Graph Neural Networks with Continual Learning for Fake News Detection from Social Media. arXiv preprint arXiv:2007.03316 (2020).
[6] Muhammad Imran, Prasenjit Mitra, and Carlos Castillo. 2016. Twitter as a lifeline: Human-annotated twitter corpora for NLP of crisis-related messages. arXiv preprint arXiv:1605.05894 (2016).
[7] Chuang Liu, Xiu-Xiu Zhan, Zi-Ke Zhang, Gui-Quan Sun, and Pak Ming Hui. 2015. How events determine spreading patterns: information transmission via internal and external influences on social networks. New Journal of Physics 17, 11 (2015), 113045.
[8] Konstantin Pogorelov, Daniel Thilo Schroeder, Luk Burchard, Johannes Moe, Stefan Brenner, Petra Filkukova, and Johannes Langguth. 2020. FakeNews: Corona Virus and 5G Conspiracy Task at MediaEval 2020. In MediaEval 2020 Workshop.
[9] Naina Said, Kashif Ahmad, Michael Riegler, Konstantin Pogorelov, Laiq Hassan, Nasir Ahmad, and Nicola Conci. 2019. Natural disasters detection in social media and satellite imagery: a survey. Multimedia Tools and Applications 78, 22 (2019), 31267–31302.
[10] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2018. How Powerful are Graph Neural Networks? CoRR abs/1810.00826 (2018). arXiv:1810.00826 http://arxiv.org/abs/1810.00826
[11] Shuo Yang, Kai Shu, Suhang Wang, Renjie Gu, Fan Wu, and Huan Liu. 2019. Unsupervised fake news detection on social media: A generative approach. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 5644–5651.
[12] Rex Ying, Jiaxuan You, Christopher Morris, Xiang Ren, William L. Hamilton, and Jure Leskovec. 2018. Hierarchical Graph Representation Learning with Differentiable Pooling. CoRR abs/1806.08804 (2018). arXiv:1806.08804 http://arxiv.org/abs/1806.08804
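As a supplementary illustration of the MAX neighbor pooling and MEAN graph pooling mentioned in Sections 2.2 and 3.3, the following plain-Python sketch shows how node features could be aggregated and then read out into a single graph embedding. It is a simplified sketch under stated assumptions, not the trained model: the learned MLP layers, dropout, and classifier are omitted, each node is assumed to be included in its own neighborhood, and the names `max_neighbor_pool`, `mean_readout`, and `embed_graph` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def max_neighbor_pool(h: Dict[int, Vector],
                      adj: Dict[int, List[int]]) -> Dict[int, Vector]:
    """One aggregation step: element-wise MAX over a node and its neighbors.

    Assumption: the node itself is part of its neighborhood. A real GNN
    layer would pass the pooled vector through a learned MLP afterwards.
    """
    pooled = {}
    for v, feat in h.items():
        group = [feat] + [h[u] for u in adj.get(v, [])]
        pooled[v] = [max(column) for column in zip(*group)]
    return pooled


def mean_readout(h: Dict[int, Vector]) -> Vector:
    """Graph-level READOUT: element-wise MEAN over all node vectors."""
    n_nodes = len(h)
    return [sum(column) / n_nodes for column in zip(*h.values())]


def embed_graph(h: Dict[int, Vector], adj: Dict[int, List[int]],
                n_layers: int = 2) -> Vector:
    """Apply n_layers aggregation steps, then read out one graph vector."""
    for _ in range(n_layers):
        h = max_neighbor_pool(h, adj)
    return mean_readout(h)


# Usage on a toy path graph 0 - 1 - 2 with two-dimensional node features.
features = {0: [1.0, 0.0], 1: [0.0, 0.0], 2: [0.0, 3.0]}
adjacency = {0: [1], 1: [0, 2], 2: [1]}
graph_vector = embed_graph(features, adjacency, n_layers=2)
```

In a classifier, a vector like `graph_vector` would be fed to a final layer that scores the three graph classes; stacking k aggregation steps lets each node see its k-hop neighborhood, which is what allows the readout to reflect the diffusion structure of a tweet's sub-graph.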