Fake News Detection in Social Media Using Graph Neural Networks and NLP Techniques: A COVID-19 Use-Case

Abdullah Hamid 1*, Nasrullah Sheikh 2*, Naina Said 1*, Kashif Ahmad 3*, Asma Gul 4, Laiq Hasan 1, Ala Al-Fuqaha 3
1 DCSE, University of Engineering and Technology, Peshawar, Pakistan
2 IBM Research - Almaden
3 Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
4 Department of Statistics, Shaheed Benazir Bhutto Women University, Peshawar, Pakistan
{kahmad,aalfuqaha}@hbku.edu.qa, nasrullah.sheikh@ibm.com, {nainasaid,laiqhasan,Abdullahhamid}@uetpeshawar.edu.pk

MediaEval’20, December 14-15 2020, Online. Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT
The paper presents our solutions for the MediaEval 2020 task, namely FakeNews: Corona Virus and 5G Conspiracy Multimedia Twitter-Data-Based Analysis. The task aims to analyze tweets related to COVID-19 and 5G conspiracy theories to detect misinformation spreaders. The task is composed of two sub-tasks, namely (i) text-based, and (ii) structure-based fake news detection. For the first task, we propose six different solutions relying on Bag of Words (BoW) and BERT embeddings. Three of the methods address a binary classification task, differentiating 5G conspiracy tweets from the rest of the COVID-19 related tweets, while the remaining three treat the task as a ternary classification problem. In the ternary classification task, our BoW and BERT based methods obtained an F1-score of 0.606 and 0.566 on the development set, respectively. On the binary classification task, the BoW and BERT based solutions obtained an average F1-score of 0.666 and 0.693, respectively. On the other hand, for structure-based fake news detection, we rely on Graph Neural Networks (GNNs), achieving an average AUC ROC of 0.95 on the development set.

1 INTRODUCTION
In the modern world, social media plays its part in several ways. For instance, in news dissemination and information sharing, social media outlets such as Twitter, Facebook, and Instagram have proved very effective [1, 6, 7, 9]. However, social media also comes with several challenges, such as collecting information from several sources and detecting and filtering misinformation [4, 5, 11]. Similar to other events and pandemics, COVID-19, being one of the deadliest pandemics in history, has been the subject of discussion on social media since its emergence. Unsurprisingly, a lot of misinformation about the pandemic is circulated over social networks. In order to identify misinformation spreaders and filter fake news about COVID-19 and 5G conspiracy, a task named "FakeNews: Corona Virus and 5G Conspiracy Multimedia Twitter-Data-Based Analysis" has been proposed in the benchmark MediaEval 2020 competition [8].

This paper provides a detailed description of the methods proposed by team DCSE_UETP for the fake news detection task. The task consists of two parts, namely (i) text-based misinformation detection (TMD), and (ii) structure-based misinformation detection (SMD). The first task (TMD) is based on textual analysis of COVID-19 related information shared on Twitter between January 2020 and 15 July 2020, and aims to detect different types of conspiracy theories about COVID-19 and its vaccines, such as the claim that "5G weakens the immune system and thus caused the current corona-virus pandemic" [8]. In the SMD task, the participants are provided with a set of graphs, each representing a sub-graph of Twitter and corresponding to a single tweet, where the vertices of the graphs represent accounts. Similar to TMD, in this task the participants need to detect and differentiate between 5G and other COVID-19 conspiracy theories.

2 PROPOSED APPROACH
2.1 Methodology for TMD Task
For the text-based analysis, we employed two different methods: (i) a Bag of Words (BoW) based solution, and (ii) a BERT model-based solution [3]. Before proceeding with the proposed methods, it is to be noted that the dataset provided for the text-based analysis is not balanced: one of the classes, namely non-conspiracy, contains a very high number of samples, while the rest are composed of relatively fewer samples. In total, the majority class contains 4412 samples, while the other two classes, namely 5G conspiracies and other conspiracies, are composed of 1263 and 785 samples, respectively. In order to balance the dataset, we rely on an ensemble of different re-sampled datasets, where N models are trained by dividing the class with the higher number of samples into N different parts, as illustrated in Figure 1. After training the N models, their results are combined using two different late fusion methods, namely majority voting and summation of the posterior probabilities. In the majority voting, since we have four models, in the case of a tie we consider the accumulative probabilities/scores to assign a label to a test sample.

Before deploying BoW and BERT, the text is cleaned by removing punctuation (such as commas and full stops), emojis, URLs, and stop words. Once the text is pre-processed, we proceed with the tokenization and creation of the BoW vocabulary, which is followed by generation of the feature vector for each sentence. A Naive Bayes classifier is then trained on the extracted features. On the other hand, a logistic regression model is trained on word embeddings generated via BERT.
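The data balancing and late fusion steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the submitted implementation: the function names (`split_majority`, `build_balanced_sets`, `sum_fusion`, `majority_vote`) are hypothetical, integer IDs stand in for tweets, the posterior values are made up for the example, and the tie-break is assumed to choose among the tied labels only.

```python
from collections import Counter


def split_majority(samples, n_parts):
    """Split the majority-class samples into n_parts roughly equal chunks."""
    return [samples[i::n_parts] for i in range(n_parts)]


def build_balanced_sets(majority, minority_classes, n_parts):
    """Build one training set per chunk: one majority chunk plus all minority samples."""
    minority = [s for cls in minority_classes for s in cls]
    return [chunk + minority for chunk in split_majority(majority, n_parts)]


def sum_fusion(probabilities):
    """Late fusion by summing the per-class posteriors of all models."""
    totals = [sum(column) for column in zip(*probabilities)]
    return max(range(len(totals)), key=totals.__getitem__)


def majority_vote(probabilities):
    """Late fusion by majority voting; ties fall back to the summed posteriors.

    Assumption: the tie-break only chooses among the tied labels.
    """
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in probabilities)
    best_count = max(votes.values())
    tied = [label for label, count in votes.items() if count == best_count]
    if len(tied) == 1:
        return tied[0]
    totals = [sum(column) for column in zip(*probabilities)]
    return max(tied, key=totals.__getitem__)


# Class sizes from the paper; the integer IDs are placeholders for tweets.
non_conspiracy = list(range(4412))
five_g = list(range(1263))
other = list(range(785))
training_sets = build_balanced_sets(non_conspiracy, [five_g, other], n_parts=4)

# Illustrative posteriors of four models for one test sample (three classes).
tie_case = [[0.6, 0.3, 0.1], [0.55, 0.35, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]
clear_case = [[0.5, 0.4, 0.1], [0.5, 0.4, 0.1], [0.5, 0.45, 0.05], [0.0, 1.0, 0.0]]
```

With four parts, each model sees 1103 majority samples next to the 1263 and 785 minority samples. In `tie_case` the votes split two against two and the summed posteriors decide, while `clear_case` shows that the two fusion schemes can disagree: voting follows the three models that agree, and summation follows the single confident model.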
[Figure 1 diagram: Training Samples (C1, C2, C_3_1), (C1, C2, C_3_2), ..., (C1, C2, C_3_n) -> Models -> Late Fusion -> Predicted_Label]

Figure 1: An illustration of the data balancing techniques used in the work.

2.2 Methodology for SMD Task
Graph representation learning using Graph Neural Networks (GNNs) has been shown to be effective in various domains such as social networks, biological networks, and financial networks. GNNs aggregate the neighborhood representation within k hops and then apply a pooling operation such as SUM, MEAN, or MAX to obtain the final representation of the node. Furthermore, GNNs can be used to learn the representation of simple graph structures [2, 10, 12], which can then be used to classify the graphs. For graph classification, these methods learn the representation of nodes, followed by a graph READOUT method, which aggregates the node features obtained after the final iteration of the GNN.

We model this problem as a graph classification task. Following Xu et al. [10], we train our model using three classes of graphs (5G conspiracy, non-conspiracy, and other conspiracy) and learn the representation of the graphs.

3 RESULTS AND ANALYSIS
3.1 Evaluation Metric
For the evaluation of the proposed methods, we used two different metrics, namely (i) Micro F1-Score, and (ii) AUC (Area Under The Curve) ROC (Receiver Operating Characteristics) curve. AUC ROC is the official evaluation metric of the task, and all the test results are reported in terms of AUC ROC. On the other hand, F1-score is used for the evaluation of the methods on the development set.

3.2 Runs Description in TMD Task
For TMD, we submitted six different runs mainly relying on two approaches, namely BERT and BoW, under two late fusion schemes. Three of the runs are based on binary classification while the other three treat the task as a ternary classification problem. It is to be noted that the fusion schemes are used to combine the scores/outputs of the four individual models trained as a result of the data balancing method described earlier.

The first three runs are based on the ternary classification task, where Run 1 and Run 2 are based on BoW with majority voting and accumulative classification scores of the individual models, respectively. The third and final ternary run is based on BERT features, where a logistic regression model is trained on word embeddings generated by BERT. As can be seen in Table 1(a), overall, better results are obtained with the BoW approach under the majority voting scheme.

The last three runs are based on the binary classification task, where the first two (i.e., Run 4 and Run 5) are based on BoW with majority voting and accumulative score based fusion, respectively, while the final one (i.e., Run 6) is based on BERT with the accumulative score based fusion scheme. As expected, the performance of all the methods is significantly higher on the binary classification task compared to the ternary classification task.

A similar trend has also been observed on the test set, where overall better results are obtained with BoW under the majority voting scheme.

Table 1: Evaluation of our proposed approaches for (a) TMD and (b) SMD tasks on both development and test sets. In TMD, for the test set the official metric AUC (Area Under The Curve) ROC (Receiver Operating Characteristics) curve is used while the results on the development set are reported in terms of Micro F1-Score. On the other hand, SMD is evaluated in terms of AUC ROC.

(a) TMD
Run     Dev. Set (F-Score)   Test Set (AUC ROC)
Run 1   0.6066               0.3815
Run 2   0.5666               0.3588
Run 3   0.5333               0.3002
Run 4   0.6933               0.3944
Run 5   0.6666               0.3803
Run 6   0.6533               0.3447

(b) SMD
Run     Dev. Set (AUC ROC)
Run 1   0.9500

3.3 Runs Description in SMD Task
For training the model, we divide the dataset into train/valid/test (80/10/10). We used grid search to obtain the best hyperparameters. The model has four MLP layers and uses MAX and MEAN operations for neighbor pooling and graph pooling, respectively. The model is trained for 1000 epochs with a learning rate of 0.01, and a dropout of 0.3 is applied to the final layer output. The final embedding size is 128. We evaluate our model on AUC ROC, and the result on the development set is given in Table 1(b). The results show that the model has discriminative power to learn to classify the graph structures. Furthermore, they show that the diffusion of information forms a pattern that depends on the type of information being spread.

4 CONCLUSIONS AND FUTURE WORK
The challenge is composed of two tasks, one aiming to analyze and detect COVID-19 related fake news using the tweets' text while the other aims to analyze network structure for the possible detection of fake news. For the first task, we mainly relied on two state-of-the-art methods, namely BoW and BERT embeddings, under different fusion schemes. Overall, better results are obtained with BoW under the majority voting scheme. For the SMD task, we rely on GNNs to differentiate among different conspiracy theories on COVID-19. In the current implementations, textual and structural information are used independently; in the future, we aim to enrich the structural information with the textual information for better detection of fake news.

REFERENCES
[1] Kashif Ahmad, Konstantin Pogorelov, Michael Riegler, Nicola Conci, and Pal Halvorsen. 2019. Social media and satellites: Disaster event
    detection, linking and summarization. Multimedia Tools and Applications 78, 3 (2019), 2837–2875.
[2] Cătălina Cangea, Petar Veličković, Nikola Jovanović, Thomas Kipf, and Pietro Liò. 2018. Towards Sparse Hierarchical Graph Classifiers. (2018). arXiv:stat.ML/1811.01287
[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[4] Siva Charan Reddy Gangireddy, Cheng Long, and Tanmoy Chakraborty. 2020. Unsupervised Fake News Detection: A Graph-based Approach. In Proceedings of the 31st ACM Conference on Hypertext and Social Media. 75–83.
[5] Yi Han, Shanika Karunasekera, and Christopher Leckie. 2020. Graph Neural Networks with Continual Learning for Fake News Detection from Social Media. arXiv preprint arXiv:2007.03316 (2020).
[6] Muhammad Imran, Prasenjit Mitra, and Carlos Castillo. 2016. Twitter as a lifeline: Human-annotated twitter corpora for NLP of crisis-related messages. arXiv preprint arXiv:1605.05894 (2016).
[7] Chuang Liu, Xiu-Xiu Zhan, Zi-Ke Zhang, Gui-Quan Sun, and Pak Ming Hui. 2015. How events determine spreading patterns: information transmission via internal and external influences on social networks. New Journal of Physics 17, 11 (2015), 113045.
[8] Konstantin Pogorelov, Daniel Thilo Schroeder, Luk Burchard, Johannes Moe, Stefan Brenner, Petra Filkukova, and Johannes Langguth. 2020. FakeNews: Corona Virus and 5G Conspiracy Task at MediaEval 2020. In MediaEval 2020 Workshop.
[9] Naina Said, Kashif Ahmad, Michael Riegler, Konstantin Pogorelov, Laiq Hassan, Nasir Ahmad, and Nicola Conci. 2019. Natural disasters detection in social media and satellite imagery: a survey. Multimedia Tools and Applications 78, 22 (2019), 31267–31302.
[10] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2018. How Powerful are Graph Neural Networks? CoRR abs/1810.00826 (2018). arXiv:1810.00826 http://arxiv.org/abs/1810.00826
[11] Shuo Yang, Kai Shu, Suhang Wang, Renjie Gu, Fan Wu, and Huan Liu. 2019. Unsupervised fake news detection on social media: A generative approach. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 5644–5651.
[12] Rex Ying, Jiaxuan You, Christopher Morris, Xiang Ren, William L. Hamilton, and Jure Leskovec. 2018. Hierarchical Graph Representation Learning with Differentiable Pooling. CoRR abs/1806.08804 (2018). arXiv:1806.08804 http://arxiv.org/abs/1806.08804
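
As a supplementary illustration of the neighbor pooling and graph READOUT described for the SMD task (MAX for neighbor pooling, MEAN for graph pooling), the following plain-Python sketch computes a graph-level representation for a toy graph. It is a simplification under stated assumptions and not the trained model: the learned MLP layers, dropout, and classifier are omitted, each node is assumed to be included in its own neighborhood, and the function names and the three-node example graph are hypothetical.

```python
def max_neighbor_pool(features, adjacency):
    """One aggregation step: element-wise MAX over a node and its neighbors.

    Assumption: the node itself is part of its neighborhood.
    """
    pooled = []
    for node, neighbors in enumerate(adjacency):
        group = [features[node]] + [features[j] for j in neighbors]
        pooled.append([max(values) for values in zip(*group)])
    return pooled


def mean_readout(features):
    """Graph-level READOUT: element-wise MEAN over all node features."""
    count = len(features)
    return [sum(values) / count for values in zip(*features)]


def graph_embedding(features, adjacency, hops):
    """Aggregate over `hops` neighborhoods, then pool the nodes into one vector.

    The learned MLP applied after each aggregation in a real GNN is omitted here.
    """
    for _ in range(hops):
        features = max_neighbor_pool(features, adjacency)
    return mean_readout(features)


# Toy path graph 0 - 1 - 2 with two-dimensional node features (made-up values).
toy_features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
toy_adjacency = [[1], [0, 2], [1]]

one_hop = graph_embedding(toy_features, toy_adjacency, hops=1)
two_hops = graph_embedding(toy_features, toy_adjacency, hops=2)
```

After two hops every node of the toy graph has seen the maximum of each feature, so all node vectors coincide and the MEAN readout returns that shared vector. In a trained model, the resulting graph vector would be passed to a classifier over the three graph classes.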