
Journal of Information Science 49 (1) Linguistics

2161-4407

10.1177/0165551520985486

XAI-DisInfodemics: eXplainable AI for disinformation and conspiracy detection during infodemics

Paolo Rosso

prosso@dsic.upv.es 0 2

Berta Chulvi

berta.chulvi@symanto.com 1

Damir Korenčić

damir.korencic@gmail.com 0

Mariona Taulé

Xavier Bonet Casals

xbonet@mac.com

David Camacho

Angel Panizo

angel.panizo@upm.es

David Arroyo

Juan Gómez

Francisco Rangel

0 PRHLT - Pattern Recognition and Human Language Technology, Universitat Politècnica de València, UPV
1 Symanto Research, Spain
2 ValgrAI - Valencian Graduate School and Research Network of Artificial Intelligence

2021

3583 1 59 69

The aim of the XAI-DisInfodemics project is to investigate strategies for fighting disinformation based on insights from social science. Addressing the challenges of disinformation requires interdisciplinary collaboration and the development of tools that private and public entities can use. These tools need to approach disinformation detection from an eXplainable Artificial Intelligence (XAI) perspective. We aim to counter disinformation and conspiracy theories on the basis of fact checking of scientific information. Moreover, we aim to explain not only the decisions of the AI models but also the narratives that are employed to trigger emotions in readers, make disinformation and conspiracy theories believable, and help them propagate among social network users. We also focus on the important problem of distinguishing between conspiracies and texts that are simply critical of, and oppositional to, the mainstream perspective. The final AI tool should help users to spot the parts of a document that aim to grab readers' attention through emotional appeals and that signal poor information quality. The tool is intended for the general public, to improve users' digital literacy, and its use will allow media and information platforms to be rated based on the quality of their health information.

Infodemics, Disinformation Detection, Oppositional Thinking Analysis, Conspiracy Theories, Critical Thinking, COVID-19, Telegram

1. Motivation and Related Work

The problem of the automatic detection of disinformation and conspiracy theories has recently gained popularity [1, 2, 3, 4, 5, 6]. It is usually framed as a binary classification problem, with fine-grained versions corresponding to multi-label or multi-class classification. However, the prevalent true vs. false paradigm runs into difficulties when dealing with conspiracy theories in everyday communication exchanges. Conspiracy Theories (CTs) are complex narratives that attempt to explain the ultimate causes of significant events as covert plots orchestrated by secret, powerful and malicious groups (for a review, see [7]). Once the explanation regarding the agency of these groups has entered the public imaginary, these narratives are invoked in social media messages alongside a very small number of factual elements, making them difficult for fact-checkers to debunk.

In addition to this lack of factual information, another challenging aspect of combating CTs with NLP models stems from the difficulty of distinguishing critical thinking from conspiratorial thinking in automatic content moderation. This distinction is vital because labeling a message as conspiratorial when it is only oppositional could drive those who were simply asking questions into the arms of the conspiracy communities. As several authors from social science suggest, a fully-fledged conspiratorial worldview is the final step in a progressive "spiritual journey" that started with questioning social and political orthodoxies [8, 9].

This approach raises the question of what makes people pass from criticizing mainstream views to joining conspiracy communities. Phadke et al. [10] have recently established that the ratio of dyadic interaction with current conspiracist users is the most important feature in predicting whether or not users join conspiracy communities, even after controlling for individual factors. This work has an essential implication: if models do not differentiate critical from conspiracist thinking, mindless censorship may push people toward conspiracy communities. Another important but still neglected issue in the computational analysis of conspiratorial texts is the role that these narratives play in intergroup conflict (for a recent review of the concept of intergroup conflict, see [11]). The increased involvement of conspiracist communities in political processes, including violence, suggests that the purpose of CTs is to enforce group dynamics and coordinate action [12]. Therefore, from a computational linguistics perspective, we need to pay attention not only to the topics [6] but also to the elements of narrative relating to the intergroup conflict. This requires fine-grained span-level detection, which has been used as an approach to other problems [13, 14] but, to the best of our knowledge, not yet in the domain of computational analysis of conspiracy theories.

2. Disinformation Detection in XAI-DisInfodemics

In the framework of the XAI-DisInfodemics project¹, several works have been published on disinformation detection. Disinformation was studied considering also the role that bots and trolls may have [15] and their polarisation dynamics [1]. False information in health was investigated in [16], also with respect to vaccines [17].

Disinformation detection was also addressed in [18] and semi-automated fact-checking through semantic similarity in [19]. The work in [20] investigated the impact that psycholinguistic patterns may have in discriminating between disinformation spreaders and fact checkers. The correlation between false information spreaders and political bias was also investigated and a new dataset was provided [21].

Moreover, a widget was designed to analyse cloaked science [22] disinformation and content spread by bots [23]. Rumor and clickbait detection were addressed by combining information divergence measures and deep learning techniques [24], and multiplatform dynamics were investigated in order to study negationists on Twitter and Telegram [25].

¹ Grant PLEC2021-007681 funded by MCIN/AEI/10.13039/501100011033 and by European Union NextGenerationEU/PRTR.

3. Conspiracy Theories Detection in XAI-DisInfodemics

The MediaEval 2022 FakeNews challenge [6] aimed to tackle the spread of COVID-19 conspiracy theories through tweets, encompassing three subtasks: identifying the stance of tweets towards conspiracy theories, detecting misinformation posters based on social network graphs, and an enhanced version of the first subtask incorporating graph data. The challenge utilized a dataset comprising 1,913 development and 830 test tweets/users, supplemented by a large user graph. Performance was measured using the Matthews correlation coefficient (MCC) [26].

Two teams composed of paper authors contributed their approaches to address the challenge’s subtasks. The UPV team [27] focused on enhancing a transformer-based system with additional features, model ensembles, and GPT-3-augmented training data for Subtask 1, while exploring Graph Neural Networks (GNNs) for Subtasks 2 and 3. On the other hand, the UPM team [28] applied representation learning techniques to automatically discover relevant features from raw data for user classification, utilizing Node2vec, FastRP, Random Forest, and XGBoost algorithms.

Both teams demonstrated the effectiveness of their respective approaches, with UPV and UPM obtaining the best results in Subtask 1 and Subtask 2, respectively. In Subtask 1, the UPV team achieved an MCC of 0.738 [27], surpassing the second-best team’s MCC of 0.710 [29]. For Subtask 2, the UPM team achieved an MCC of 0.459, outperforming the second-best team’s MCC of 0.355 [29].

Building upon the experience from the MediaEval challenge, we proceeded to explore the capabilities of large language models (LLMs) for handling the task of conspiracy theory classification [30]. Our investigation utilized the same dataset to examine the zero-shot performance of GPT-3 in accurately classifying fine-grained, multi-label conspiracy theories. We also utilized the dataset to analyze GPT-3’s ability to interpret and utilize definitions effectively. We experimented with several types of definitions, including descriptive noun phrases and human-crafted definitions, and proposed methods for both generating definitions from examples and assessing GPT-3’s comprehension of the definitions. The results demonstrate a positive correlation between the quality of class definitions and the zero-shot performance [30].
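The Matthews correlation coefficient used to score the MediaEval subtasks condenses a binary confusion matrix into a single value between -1 and 1. The following is a minimal sketch of the binary case only; the function name `mcc`, the zero-denominator convention, and the counts are illustrative assumptions, not the challenge's evaluation script or data.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention,
    adopted here as an assumption).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts, chosen only for illustration.
score = mcc(tp=90, tn=85, fp=15, fn=10)
print(round(score, 3))
```

Because all four cells of the confusion matrix enter the formula, a classifier scores highly only if it does well on both the positive and the negative class, which is why the measure is attractive when both classes matter equally.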

4. XAI-DisInfodemics Dataset

For the creation of the XAI-DisInfodemics dataset, we first manually compiled a list of 2,273 public Telegram channels in English and Spanish that contain oppositional non-mainstream views on the COVID-19 pandemic. We retrieved and filtered messages from the channels based on a set of oppositional and conspiracy keywords related to COVID-19. Then the messages were cleaned by removing duplicates, short texts, and texts with a large proportion of non-regular words (such as URLs and mentions). Finally, the messages were ranked using an index of quality based on the properties of a message and its channel. The index is composed of several criteria capturing the prevalence of COVID-19 topics and the channel’s activity.

We developed an annotation scheme to differentiate between the messages criticizing the mainstream views on COVID-19 and the messages evoking the existence of a conspiracy. A message was labeled “conspiracy” if any of these four criteria were met: (1) it framed COVID-19 or a related public health strategy as the result of the agency of a small and malevolent secret group; (2) it claimed that the pandemic is not real (e.g. a plandemic); (3) it accused critics of the conspiracy theory of being a part of the plot; (4) it divided society into two: those who know the truth (the conspiracy theorists) and those who remain ignorant. A message was labeled “critical” if it opposed publicly accepted understandings of events but had none of these four characteristics of the conspiratorial mindset.

Using this annotation scheme, 5,000 messages per language were annotated as “conspiracy” or “critical” thinking. For these messages we performed anonymization by removing sensitive and identifiable information such as nicknames, user IDs and e-mail addresses. The average text length is 128 tokens for Spanish texts and 265 tokens for English texts, which tend to elaborate more on conspiracy theories.

Each message was annotated by three linguists and the inter-annotator agreement (IAA) was calculated. Disagreements were discussed with the social psychologist who created the annotation scheme. For English messages, the IAA in terms of Krippendorff’s α is 0.79 for “conspiracy” messages and 0.60 for “critical” messages, while the average observed percentage of agreement between the three annotators is 91.4% and 80.3%, respectively. For Spanish messages, Krippendorff’s α is 0.80 for “conspiracy” messages and 0.70 for “critical” messages, corresponding to percentage agreements of 90.9% and 84.9%.

Moreover, a new fine-grained annotation scheme was developed with the goal of identifying, at the text span level, how oppositional and conspiracy narratives use intergroup conflict. The annotation was performed for the described 5,000 binary-labeled messages per language. Inspired by Lasswell’s paradigm [31], we identify the following six categories of narrative elements at the span level (an example, with the abbreviations defined below, is displayed in Figure 1):

Agents (A): The hidden power that pulls the strings of the conspiracy. In critical messages, agents are actors that design the mainstream public health policies (Government, WHO, among others).

Objectives (O): Parts of the narrative that answer the question “what is intended by the agents of the CT or by the promoters of the action being criticized from a critical thinking perspective?”

Consequences (CN): Parts of the narrative that describe the effects of the agent’s actions.

Facilitators (F): The facilitators are those who collaborate with the conspirators. In critical messages, facilitators are those who implement the measures dictated by the authorities.

Campaigners (CM): In conspiracy messages, the campaigners are the ones who uncover the conspiracy theory. In critical messages, campaigners are those who resist the enforcement of laws and health instructions.

Victims (V): Victims are the people who are deceived into following the conspiratorial plan or the ones who suffer due to the decisions of the authorities.

In the process of span-level annotation, each of the 5,000 Spanish and English messages was annotated by two linguists. Currently, the annotation instructions are being discussed and improved and, to this end, we are using the Gamma (γ) measure of the IAA test [32]. The preliminary annotation round (first 150 messages) yielded an average γ of 0.43. The following batch had an average γ of 0.53, and the last one a γ of 0.61. We deemed this a good agreement because it is close to or above the average agreement of other highly conceptual span-level schemes [33, 34]. So far, 2,000 messages in both languages have been fully annotated with an average density of six spans per message.

5. XAI-DisInfodemics Task at PAN

At the PAN Lab² we are organising a shared task on oppositional thinking analysis with the aim of addressing the following two new challenges for the NLP research community: (1) to distinguish the conspiracy discourse from other oppositional narratives that do not express a conspiracy mentality, and (2) to identify in online messages the key elements of a narrative that fuels the intergroup conflict in oppositional thinking. Accordingly, we propose two subtasks:

Subtask 1: A binary classification task consisting of differentiating between (1) critical messages that question major decisions in the public health domain, but do not promote a conspiracist mentality; and (2) messages that view the pandemic or public health decisions as a result of a malevolent conspiracy by secret, influential groups.

Subtask 2: A token-level classification task aimed at recognizing text spans corresponding to the key elements of oppositional narratives. Since conspiracy narratives are a special kind of causal explanation, we developed a span-level annotation scheme that identifies the goals, effects, agents, and the groups in conflict.

In Subtask 1, model performance is equally important for both the “conspiracy” and the “critical” classes. Additionally, high-performance classifiers are desirable since errors in automatic content moderation can directly or indirectly promote the conspiracist mentality. To this end, we use MCC, given that the dataset is balanced; MCC is more reliable and less optimistic than the macro-averaged F1 [26] and compares favorably to other alternatives [35]. For Subtask 2, we will use an adaptation of the F1 measure suited for a sequence-labeling scenario with long and overlapping spans [33], which was applied in a previous SemEval evaluation of systems for span-level propaganda annotation [13].

We have already performed experiments with transformer-based baseline models for both subtasks. For Subtask 1 we used pre-trained BERT transformers and fine-tuned them for the binary task. This baseline yielded an MCC of 0.68 for Spanish and 0.79 for English texts. For Subtask 2, we experimented on the currently annotated data using a pre-trained BERT model with six token classification heads (one per category), fine-tuning the model using multitask learning. This approach yielded results of 0.54 (English) and 0.45 (Spanish) in terms of the adapted F1 measure of [33]. The baseline results show that both tasks are feasible, although there is still room for improvement, especially for the challenging Subtask 2. We intend to motivate participants to use advanced classification techniques and architectures, with the goal of discovering the most accurate solutions for real-world deployment.

² https://pan.webis.de/clef24/pan24-web/oppositional-thinking-analysis.html

6. XAI-DisInfodemics App

In collaboration with Symanto³, we are developing an application based on the research on detection and analysis of oppositional narratives. The application is envisioned as a tool that will enable users to determine whether a social media text in English or in Spanish contains elements of conspiratorial or critical narratives, and to detect fine-grained narrative elements. The target audience for the application includes journalists, students and researchers in social sciences, as well as anyone interested in analysing and learning about oppositional narratives.

In addition to predicting narrative categories, the application will highlight the parts of the text that are key for making the predictions, using XAI techniques [36]. This feature will facilitate the users’ analysis of text, and increase the users’ confidence in the AI model. The application will be easily accessible via a web browser, and feature an easy-to-use graphical user interface.

³ https://www.symanto.com/es/

Acknowledgements

This work was carried out in the framework of Project XAI-DisInfodemics: eXplainable AI for disinformation and conspiracy detection during infodemics (PLEC2021-007681) funded by MICIU/AEI/10.13039/501100011033 and by “European Union NextGenerationEU/PRTR”.

References

[1] Ruffo, Semeraro, Giachanou, Rosso, Studying fake news spreading, polarisation dynamics, and manipulation by bots: A tale of networks and language, Computer Science Review 47, 100531. doi:10.1016/j.cosrev.2022.100531. URL https://www.sciencedirect.com/science/article/pii/S157401372200065X

[2] Gambini, Tardelli, Tesconi, The anatomy of conspiracy theorists: Unveiling traits using a comprehensive twitter dataset, Computer Communications 217, 25-40. doi:10.1016/j.comcom.2024.01.027. URL https://www.sciencedirect.com/science/article/pii/S0140366424000264

[3] Giachanou, Ghanem, Rosso, Detection of conspiracy propagators using psycho-linguistic