=Paper=
{{Paper
|id=Vol-2882/MediaEval_20_paper_26
|storemode=property
|title=MediaEval 2020: An Ensemble-based Multimodal Approach for Coronavirus and 5G Conspiracy Tweet Detection
|pdfUrl=https://ceur-ws.org/Vol-2882/paper26.pdf
|volume=Vol-2882
|authors=Chahat Raj,Mihir Mehta
|dblpUrl=https://dblp.org/rec/conf/mediaeval/RajM20
}}
==MediaEval 2020: An Ensemble-based Multimodal Approach for Coronavirus and 5G Conspiracy Tweet Detection==
Chahat Raj¹, Mihir P Mehta²
¹Delhi Technological University, Delhi, India
²Indian Institute of Management Raipur, Chhattisgarh, India
chahatraj58@gmail.com, mihirm3795@gmail.com
Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
MediaEval’20, December 14-15 2020, Online

ABSTRACT
In the wake of the ongoing COVID-19 pandemic, a parallel stream of misinformation and conspiracy theories is rising on the internet. People around the world are being flooded with texts and visuals making false claims linked to the coronavirus disease. This paper presents a multimodal fake news detection system that uses text and image features to detect conspiracy tweets. The research was performed in the context of the FakeNews: Coronavirus and 5G Conspiracy task of MediaEval 2020. For the NLP subtask, we use an ensemble of machine learning and deep learning algorithms to analyze textual and visual data. We report the performance of the experiments for each modality and the results obtained after their fusion.

1 INTRODUCTION
Scientists, economists, mathematicians, analysts and many other professionals have formulated theories on the origin and spread of Coronavirus Disease 2019 (COVID-19). Research and investment are directed both at a cure and at tracing the cause of the pandemic. Along with the rising number of these theories, the spread of COVID-19 misinformation, termed the ‘infodemic’, has also been on the rise, often originating from internet users, public figures and potentially trusted sources. Messages and media carrying such misinformation are spread both intentionally and unintentionally. They are often linked to existing theories, which makes them sound true despite lacking substantial proof or logic. People are also drawn in by the superficial texts and images that carry the misinformation and tend not to verify its credibility. Moreover, they pass it on to friends and family who trust them, so the misinformation ultimately convinces a large audience connected through this network and affects the habits and lifestyle of those who accept it. These changes can have an adverse effect, or serve no purpose while consuming people's time and material resources. Hence it becomes necessary to identify, evaluate and share the authenticity of every piece of information, especially information involving conspiracy claims.

One such piece of misinformation, which has affected people's thinking and lifestyle as well as the adoption of technology and the revenue of several brands, is the 5G Corona Conspiracy. This conspiracy has significantly influenced consumers by creating ambiguity about the safety of 5G communication technology.

To fight the ongoing wave of misinformation amid the pandemic, our submission to the NLP subtask at MediaEval 2020 uses an ensemble technique with multiple ML and DL models to identify 5G-related coronavirus conspiracies prevalent on Twitter. A detailed overview of the task and dataset is given by Pogorelov et al. [1], [2].

2 APPROACH
We adopt an ensembling approach incorporating several machine learning and deep learning-based text and image classifiers. We divide our approach into three routines: text-based classification, image-based classification and fusion of the text and image models. The proposed architecture uses a combination of features obtained from multiple classifiers. We experimented with several text classifiers on the development dataset and selected a fixed subset of them based on the results each obtained separately. We use Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbour (KNN), LSTM (Long Short-Term Memory) and Bi-LSTM (Bidirectional LSTM) for the NLP classification task. Each tweet undergoes preprocessing before being passed to these classifiers: URL removal, punctuation removal, lowercasing, tokenization, stopword removal, stemming/lemmatization and padding. The LSTM and Bi-LSTM models are followed by a series of Dense layers with a Dropout value of 0.5. Both are trained with the RMSprop optimizer for 15 epochs each with a batch size of 64. For the text-based approach, the classification results obtained from SVM, NB, KNN, LSTM and Bi-LSTM are combined by majority voting to obtain the final predictions.

For visual classification, we filtered the tweets containing images and obtained 171 images labelled 5G Coronavirus Conspiracy, 118 belonging to the Other Conspiracy class and 791 Non-Conspiracy tweet images. The test set consisted of 617 images. We fine-tune three deep learning models, namely VGG16 [3], Xception [4] and InceptionV3 [5], to classify the images and combine their results by majority voting to make the final predictions.
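The majority-voting step used for both the text and the image ensembles can be illustrated with a minimal sketch. The paper does not state how ties are resolved, so the tie-breaking rule below (the tied label that appears first in classifier order wins) is an assumption, as are the function name `majority_vote`, the label strings and the example predictions.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(votes: Sequence[Hashable]) -> Hashable:
    """Return the label that received the most votes.

    Assumption (not from the paper): on a tie, the tied label that
    appears first in `votes` wins, so classifier order breaks ties.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    best = max(counts.values())
    return next(label for label in votes if counts[label] == best)


# Hypothetical per-classifier predictions for a single tweet.
text_votes = {
    "SVM": "5g_conspiracy",
    "NB": "non_conspiracy",
    "KNN": "5g_conspiracy",
    "LSTM": "5g_conspiracy",
    "Bi-LSTM": "other_conspiracy",
}
image_votes = {
    "VGG16": "non_conspiracy",
    "Xception": "non_conspiracy",
    "InceptionV3": "5g_conspiracy",
}

text_label = majority_vote(list(text_votes.values()))
image_label = majority_vote(list(image_votes.values()))
print(text_label, image_label)  # 5g_conspiracy non_conspiracy
```

With five text classifiers and three classes, a 2-2-1 split is possible, so some explicit tie rule is needed in practice; the one shown here is only one reasonable choice.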
VGG16, Xception and InceptionV3 are pre-trained models; we fine-tune them with a dropout value of 0.5 and a batch normalization layer added after the dropout layer. We use sigmoid activation in the last layer for binary predictions and softmax for multiclass predictions. We use the Adam optimizer for all visual classification models and train each for 15 epochs with a batch size of 64.

Figure 1: Ensemble Model Architecture

For multimodal classification, we ensemble all the text and image-based classifiers and employ max-voting for the final classification. Figure 1 shows the ensembling architecture. The class with the highest number of votes is selected as the predicted class for each tweet. Development for all runs was performed by splitting the dataset in a 7:3 ratio for training and validation. Table 1 details the models used, and Table 2 reports the results obtained on validation.

{| class="wikitable"
|+ Table 1: Experimental Model Details
! Run !! Modality !! Models
|-
| Run 1 || Text || SVM, NB, KNN, LSTM, Bi-LSTM
|-
| Run 2 || Text + Image || (SVM, NB, KNN, LSTM, Bi-LSTM), (VGG-16, Xception, InceptionV3)
|-
| Run 3 || Text || SVM, NB, KNN, LSTM, Bi-LSTM
|-
| Run 4 || Image || VGG-16, Xception, InceptionV3
|-
| Run 5 || Text + Image || (SVM, NB, KNN, LSTM, Bi-LSTM), (VGG-16, Xception, InceptionV3)
|}

{| class="wikitable"
|+ Table 2: Development Phase Results
! Run !! Classes !! Accuracy !! Precision !! Recall !! F1 !! ROC
|-
| Run 1 || Ternary || 0.6965 || 0.5333 || 0.5111 || 0.5220 || 0.6978
|-
| Run 2 || Ternary || 0.6157 || 0.4043 || 0.2568 || 0.3140 || 0.5394
|-
| Run 3 || Binary || 0.8357 || 0.3797 || 0.6390 || 0.4764 || 0.8190
|-
| Run 4 || Binary || 0.7824 || 0.1892 || 0.2917 || 0.2295 || 0.5471
|-
| Run 5 || Binary || 0.7639 || 0.1351 || 0.3846 || 0.2000 || 0.5574
|}

3 RESULTS AND ANALYSIS
The NLP subtask of MediaEval 2020 FakeNews: Coronavirus and 5G Conspiracy requires distinguishing tweets related to the coronavirus and 5G conspiracy from other-conspiracy and non-conspiracy tweets. Table 2 and Table 3 show the classification results on the development set and the test set, respectively. We perform five runs on the task, covering three-class classification and coarse two-class classification in which non-conspiracy and other-conspiracy tweets are combined into a single class. Our first run performs ternary classification using text classifiers only. The second run combines the text and image classification results and returns predictions based on both. The third, fourth and fifth runs are coarse two-class classifiers based on text, on images, and on text and image features combined, respectively.

{| class="wikitable"
|+ Table 3: Testing Phase Results
! Run !! Modality !! Classes !! MCC Score
|-
| Run 1 || Text || 3-class || 0.3408
|-
| Run 2 || Text + Image || 3-class || 0.0674
|-
| Run 3 || Text || Binary || 0.4179
|-
| Run 4 || Image || Binary || 0.0644
|-
| Run 5 || Text + Image || Binary || 0.0232
|}

Observing the trend of the results in the development and testing phases, we find that the binary classifier performed better than the three-class classifier. Our binary text classifier achieved the third-highest score (0.4179) in the challenge. This suggests that our model finds it easier to distinguish 5G coronavirus conspiracies from all other conspiracies and real tweets. Ternary text-based classification achieved a score of 0.3408. Image-based detection quality can be improved significantly. We attribute the low scores of the models using the image modality to the small size of the visual data, and expect the proposed method to perform better with a larger dataset. We suggest the use of data augmentation techniques for better performance.

4 DISCUSSION
In this paper, we employ a machine learning and deep learning-based ensembling technique that uses majority voting to predict whether a tweet is related to the 5G coronavirus conspiracy or not. We perform a multimodal analysis using text-based NLP features from the tweet and visual features from the images posted along with it. We build a fusion model that incorporates both textual and visual features and generates predictions based on each modality separately and on their combination. Our classification approach uses both binary and ternary classifiers to examine the efficiency of the ensemble models. The main limitation we encounter is the lack of sufficient training data; in future work we propose to address it with data augmentation on both text and image data to achieve better performance and more reliable conspiracy detection.
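For reference, the test-phase results in Table 3 are reported as Matthews correlation coefficient (MCC) scores. The paper does not spell out the formula; the following is a minimal sketch assuming the standard multiclass definition, of which the binary case is a special instance. The function name `mcc`, the toy labels, and the convention of returning 0.0 when the score is undefined are assumptions made for illustration.

```python
import math
from collections import Counter


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (binary is a special case)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                  # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))  # correctly classified samples
    t_k = Counter(y_true)                            # true count per class
    p_k = Counter(y_pred)                            # predicted count per class
    numerator = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    # Assumed convention: report 0.0 when the denominator vanishes.
    return numerator / denominator if denominator else 0.0


# Toy binary example: 1 = 5G conspiracy, 0 = everything else.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(round(mcc(y_true, y_pred), 4))  # 0.5774
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why the image-based runs in Table 3 read as close to uninformative despite their relatively high development-set accuracy.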
REFERENCES
[1] Pogorelov, K., Schroeder, D. T., Burchard, L., Moe, J., Brenner, S., Filkukova, P., & Langguth, J. (2020). FakeNews: Corona Virus and 5G Conspiracy Task at MediaEval 2020. In Proc. of the MediaEval 2020 Workshop, Online, 14-15 December 2020.
[2] Schroeder, D. T., Pogorelov, K., & Langguth, J. (2019). FACT: a Framework for Analysis and Capture of Twitter Graphs. In 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (pp. 134-141). IEEE.
[3] Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations.
[4] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1251-1258).
[5] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).