=Paper= {{Paper |id=Vol-3181/paper56 |storemode=property |title=FakeNews: Corona Virus and Conspiracies Multimedia Analysis Task at MediaEval 2021 |pdfUrl=https://ceur-ws.org/Vol-3181/paper56.pdf |volume=Vol-3181 |authors=Konstantin Pogorelov,Daniel Thilo Schroeder,Stefan Brenner,Johannes Langguth |dblpUrl=https://dblp.org/rec/conf/mediaeval/PogorelovSBL21 }} ==FakeNews: Corona Virus and Conspiracies Multimedia Analysis Task at MediaEval 2021== https://ceur-ws.org/Vol-3181/paper56.pdf
FakeNews: Corona Virus and Conspiracies Multimedia Analysis Task at MediaEval 2021

Konstantin Pogorelov¹, Daniel Thilo Schroeder¹,³, Stefan Brenner⁵, Johannes Langguth¹

¹ Simula Research Laboratory, Norway
² University of Oslo, Norway
³ Simula Metropolitan Center for Digital Engineering, Norway
⁴ Technical University of Berlin, Germany
⁵ Stuttgart Media University, Germany

{konstantin,daniels,langguth}@simula.no, sb288@hdm-stuttgart.de

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
MediaEval’21, December 13-15 2021, Online

ABSTRACT

The FakeNews: Corona Virus and Conspiracies Multimedia Analysis task, running for the second time as part of MediaEval 2021, focuses on the classification of tweet texts with the aim of detecting fast-spreading misinformation. This year's task extends the number of target conspiracy theories and introduces new challenges arising from the complexity of analyzing an imbalanced dataset. This paper describes the task, including use case and motivation, challenges, the dataset with ground truth, the required participant runs, and the evaluation metrics.

1 INTRODUCTION

During the development of the COVID-19 crisis, many new COVID-related conspiracy theories have arisen. Despite the efforts of the major social networks, mass-spread fake facts, irrational theories and news-like posts are widely present in online media sources. Rumors and other fast-spreading inaccurate, counterfactual, or intentionally misleading information can quickly permeate public consciousness and have severe real-world implications. Public attention to the problem has already allowed content moderation and a partial limitation of freedom of speech in order to prevent manipulation of COVID-related public opinion. Nevertheless, fake news and intentional misinformation are still among the top global risks in the 21st century [6]. Consequently, we are particularly interested in detecting content associated with fake news and COVID-related misinformation. We further differentiate between content that does not contain misinformation and content attributed to other misinformation. Our task offers three subtasks, all of which require text-based tweet classification.

Similar to text-only classification challenges, e.g., [1, 4, 7], we expect to see NLP approaches for tweet text analysis, but we target a wider set of conspiracy theories and detection methodologies at different levels. Furthermore, we ask for an evaluation of different approaches with respect to real-world imbalanced datasets [3].

The task is intended to be of interest to researchers in the areas of online news, social media, multimedia analysis, multimedia information retrieval, natural language processing, and meaning understanding and situational awareness.

2 DATASET DETAILS

The creation of our datasets can roughly be divided into four steps. First, we used Twitter's search API between January 17, 2020 and June 30, 2021 to collect a large number of tweets that include keywords related to the COVID-19 pandemic. Second, we started the manual labeling [8] of a randomly selected subset of approximately 2,000 tweets. The annotation was performed by a team of researchers, postdocs, PhD students, and master students. Each tweet was annotated by at least two annotators. Disagreements between annotations were resolved by a third, experienced annotator. In cases where assigning a class was not obvious, the tweet was discussed with the entire group until consensus was reached.

We use three classes to label tweets:

Promotes/Supports Conspiracy: this class contains all tweets that promote, support, claim, or insinuate some connection between COVID-19 and various conspiracies. Examples include the idea that 5G weakens the immune system and thus caused the current coronavirus pandemic; that there is no pandemic and the COVID-19 victims were actually harmed by radiation emitted by 5G network towers; and ideas about an intentional release of the virus, forced or harmful vaccinations, vaccines containing microchips, or the virus being a hoax. The crucial requirement is the claimed existence of some causal link.

Discusses Conspiracy: this class contains all tweets that merely mention the various existing conspiracies connected to COVID-19, or that negate such a connection in a clearly negative or sarcastic manner.

Non-Conspiracy: this class contains all tweets not belonging to the previous two classes. Note that this also includes tweets that discuss the COVID-19 pandemic itself.

We use the following nine categories, which correspond to the most popular conspiracy theories: Suppressed cures, Behaviour and Mind Control, Antivax, Fake virus, Intentional Pandemic, Harmful Radiation or Influence, Population reduction, New World Order, and Satanism.

The development and test datasets consist of 1,554 and 266 tweets, respectively. Both datasets are heavily unbalanced in terms of the number of samples per class, reflecting the distribution of tweet topics and people’s opinions. The development dataset was divided into pre-flight and primary development sets. The pre-flight development set was provided earlier than the primary one and was thus used to perform the initial approach selection and, later, as a validation set. To comply with the Twitter data publication policy, no data was publicly shared during the active challenge phase. Thus, all registered participants in effect became a closed group of researchers working together on one topic. To become a member of the research team, every registered participant was obliged to sign an additional strict NDA. Within this research, we provide only tweet text content without any link to the user accounts or original tweets. The full-text datasets have not been
made publicly available; they are sent to the members of the research team by direct email.

After the challenge, the annotated datasets containing only tweet IDs, but not the tweet text itself, will be made publicly available. These publicly available datasets will be shuffled and supplemented with additional content to prevent linking them to the full-text datasets that were used during the challenge by the research team. An additional tweet content download script will be provided to obtain the tweets from their IDs via the corresponding Twitter API using user-supplied API access keys.

3 EVALUATION METRICS AND SUBTASKS

The officially reported metric used for evaluating the multi-class classification performance is the multi-class generalization of the Matthews correlation coefficient (MCC, Rk-statistic) [5]. This metric provides an efficient and reliable comparison of multi-class classifiers for both balanced and unbalanced datasets.

In case of equal metric values, we use the timestamp of the official run submission to rank the teams. For the evaluation, the participants must submit at least one run for at least one of the subtasks defined below. Additionally, the participants can optionally submit four more runs for any of the described subtasks, i.e., participants can submit up to 15 runs in total.

Text-Based Misinformation Detection: In this subtask, the participants receive a dataset consisting of tweet text blocks in English related to COVID-19 and various conspiracy theories. The participants are encouraged to build a multi-class classifier that can flag whether a tweet promotes/supports or discusses at least one (or many) of the conspiracy theories. If a particular tweet promotes/supports one conspiracy theory and just discusses another, the result of the detection for that tweet is expected to be the "stronger" class, which is promotes/supports in this example.

Text-Based Conspiracy Theories Recognition: In this subtask, the participants receive a dataset consisting of tweet text blocks in English related to COVID-19 and various conspiracy theories. The main goal of this subtask is to build a detector that can detect whether a text in any form mentions or refers to any of the predefined conspiracy topics.

Text-Based Combined Misinformation and Conspiracies Detection: In this subtask, the participants receive a dataset consisting of tweet text blocks in English related to COVID-19 and various conspiracy theories. The goal of this subtask is to build a complex multi-label, multi-class detector that, for each topic from a list of predefined conspiracy topics, can predict whether a tweet promotes/supports or just discusses that particular topic.

For each subtask in which a team has decided to participate, one mandatory run is required and up to four optional runs may be submitted. The mandatory run implements a pure NLP classification of tweets based only on tweet text content, without using any additional sources of data. Optional runs gradually extend the amount and types of allowed additional information: classification based on tweet text analysis in combination with pre-trained models, and classification using any automatically scraped data from any external sources. Manual annotation of tweets or of any externally scraped data is not allowed in any run.

In the submitted runs, participants are allowed to use an additional Cannot Determine class. This additional class represents cases in which the output of the classifier is not reliable, and it is important for the evaluation of multi-class classifiers. The effect of using the Cannot Determine class is described in the related literature [2]. In short, marking a sample that the classifier cannot reliably classify as an unknown class affects the resulting classification performance less negatively than marking the sample with a wrong class label, which is the behaviour expected in real-world classification tasks.

The subtasks are evaluated as follows. The Text-Based Misinformation Detection subtask is evaluated with the Rk-statistic directly. The Text-Based Conspiracy Theories Recognition and Text-Based Combined Misinformation and Conspiracies Detection subtasks are evaluated with a two-step procedure. First, each conspiracy theory is evaluated individually and independently using the Rk-statistic. Then, the computed Rk-statistic values are averaged across all conspiracy theories, and the resulting average is used to compare the results of different teams. Finally, results in each conspiracy theory group are evaluated independently, but this evaluation is auxiliary and does not affect the final team ranking.

4 DISCUSSION AND OUTLOOK

The task itself can be seen as very atypical and challenging due to the fairly limited amount of information available to support the tweet classification process. This reflects the real-world conditions in which online social media analysis systems are deployed. Thus, this task is a practical attempt to take a step towards building a usable multi-modal social network analysis system that is able to combine isolated data source properties with inter-source relations. Due to the importance of the use case, we hope to motivate researchers from different research fields to present their approaches, thereby performing research that can help society to fight against malicious manipulations of social networks and threats to society in general. We hope that the FakeNews task can help to raise awareness of the topic, and also provide an interesting and meaningful use case to researchers interested in this application.

ACKNOWLEDGMENTS

This work was funded by the Norwegian Research Council under contracts #272019 and #303404 and has benefited from the Experimental Infrastructure for Exploration of Exascale Computing (eX3), which is financially supported by the Research Council of Norway under contract 270053. We also acknowledge support from Michael Kreil in the collection of Twitter data.

REFERENCES

[1] 2018. Toxic Comment Classification Challenge - Identify and classify toxic online comments. (2018). https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/
[2] Sabri Boughorbel, Fethi Jarray, and Mohammed El-Anbari. 2017. Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. PloS one 12, 6 (2017), e0177678.
[3] Nitesh V Chawla, Nathalie Japkowicz, and Aleksander Kotcz. 2004. Special issue on learning from imbalanced data sets. ACM SIGKDD explorations newsletter 6, 1 (2004), 1–6.
[4] Quan Do. 2019. Jigsaw Unintended Bias in Toxicity Classification.
    (2019).
[5] Jan Gorodkin. 2004. Comparing two K-category assignments by a K-
    category correlation coefficient. Computational biology and chemistry
    28, 5-6 (2004), 367–374.
[6] Lee Howell. 2013. Digital Wildfires in a Hyperconnected World. https://bit.ly/2GiEF4f. (2013).
[7] Akshay Mungekar, Nikita Parab, Prateek Nima, and Sanchit Pereira.
    2019. Quora insincere question classification. National College of
    Ireland (2019).
[8] Konstantin Pogorelov, Daniel Thilo Schroeder, Petra Filkuková, Stefan
    Brenner, and Johannes Langguth. 2021. WICO Text: A Labeled Dataset
    of Conspiracy Theory and 5G-Corona Misinformation Tweets. In
    Proceedings of the 2021 Workshop on Open Challenges in Online Social
    Networks. 21–25.
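
ILLUSTRATIVE SKETCH OF THE EVALUATION PROCEDURE

The evaluation described in Section 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the official scoring script of the task. The function names (rk_statistic, overall_label, averaged_rk), the integer label encoding, and the choice to return 0.0 when the coefficient is undefined are all hypothetical. Any extra label a run may emit, such as Cannot Determine, is simply counted as one more predicted category here; how the organizers actually score that class is not specified in the text. The sketch computes the multi-class MCC (Rk-statistic) [5] from label counts, reduces per-topic labels to the "stronger" class used in the first subtask, and averages per-topic scores as in the two-step procedure.

```python
from collections import Counter
from math import sqrt

# Hypothetical integer encoding of the three tweet classes (an assumption,
# not the official label format). A larger value means a "stronger" class.
NON_CONSPIRACY, DISCUSSES, PROMOTES = 0, 1, 2


def rk_statistic(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (R_K) from label counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                   # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly labelled samples
    t_k = Counter(y_true)                             # true count per class
    p_k = Counter(y_pred)                             # predicted count per class
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    denominator = sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    # Assumed convention: report 0.0 when the coefficient is undefined,
    # e.g. when only one class occurs in the truth or in the predictions.
    return numerator / denominator if denominator else 0.0


def overall_label(per_topic_labels):
    """Reduce per-topic labels of one tweet to the strongest class."""
    return max(per_topic_labels, default=NON_CONSPIRACY)


def averaged_rk(true_by_topic, pred_by_topic):
    """Two-step evaluation: R_K per conspiracy topic, then the plain mean."""
    if not true_by_topic:
        raise ValueError("at least one topic is required")
    scores = [
        rk_statistic(true_by_topic[topic], pred_by_topic[topic])
        for topic in true_by_topic
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data for two of the nine topics; the values are made up.
    truth = {"Antivax": [0, 1, 2, 0], "Fake virus": [1, 1, 0, 0]}
    preds = {"Antivax": [0, 1, 2, 0], "Fake virus": [1, 0, 0, 0]}
    print(averaged_rk(truth, preds))
    print(overall_label([DISCUSSES, PROMOTES, NON_CONSPIRACY]))
```

Working from label counts rather than a dense confusion matrix keeps the sketch free of external dependencies and lets it accept arbitrary hashable labels, so a run that adds an extra class needs no special handling.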