=Paper= {{Paper |id=Vol-3782/paper7 |storemode=property |title=Are Misinformation Propagation Models Holistic Enough? Identifying Gaps and Needs |pdfUrl=https://ceur-ws.org/Vol-3782/paper7.pdf |volume=Vol-3782 |authors=Raquel Rodríguez-García,Álvaro Rodrigo,Roberto Centeno |dblpUrl=https://dblp.org/rec/conf/codai2/Rodriguez-Garcia24 }} ==Are Misinformation Propagation Models Holistic Enough? Identifying Gaps and Needs== https://ceur-ws.org/Vol-3782/paper7.pdf
                         Are Misinformation Propagation Models Holistic Enough?
                         Identifying Gaps and Needs
                         Raquel Rodríguez-García1 , Álvaro Rodrigo1 and Roberto Centeno1
                         1
                             NLP & IR Group, Universidad Nacional de Educación a Distancia (UNED), 28040 Madrid, Spain


                                        Abstract
                                        Misinformation has experienced increased online diffusion, mainly due to the limited control over published content
                                        and social media users' low interest in fact-checking it. Many efforts have focused on misinformation-related
                                        tasks, although typically centered on one perspective, such as shared texts or users’ connections. There is a
                                        lack of holistic integrations of these local and global perspectives. Misinformation propagation models allow
                                        us to simulate how misinformation spreads through social media, and they are a way to combine both of those
                                        dimensions. In this work, we present a comprehensive study of the state of the art in this task to highlight
                                        these approaches’ limitations and to establish the requirements for these models to approach misinformation
                                        propagation from a more holistic perspective.

                                        Keywords
                                        Rumor Propagation, Fake News, Multi-agent Systems, Epidemiological Models




                         1. Introduction
                         Misinformation has proven to have a perverse effect by manipulating the public through different
                         techniques, such as appealing to their emotions or fears to foster its believability [1]. It has negatively
                         affected democratic processes, such as the 2016 and 2020 US Elections [2, 3], and spread potentially
                         harmful content, such as the misinformation regarding the COVID-19 pandemic [4]. Many efforts
                         are underway to determine what distinguishes fake content from other information [5], to detect its
                         presence [6], or to identify which users are more susceptible [7]. At the micro level, fake news detection is addressed
                         by analyzing the information within a message. Recent efforts exploit Large Language Models (LLMs)
                         [8] for their enhanced performance. Other methods have explored the detection from a more rounded
                         standpoint, exploiting characteristics from Twitter (now X) threads [9], such as the depth of the tree, or
                         subjective metrics such as biases and credibility [10], outperforming state-of-the-art models.
                            At the macro or social network level, there have been efforts to detect profiles sharing misinforma-
                         tion [11], showing that information on user interactions improves results obtained using only user
                         information. The detection of bots is also explored through user features and network topology [12],
                         showing how bot formations foster high propagation rates. It has also been approached from the lens
                         of the differing stances within communities [13]. These features, from user characteristics to network
                         topology, prove informative for these tasks [14].
                            From these efforts, we notice a general lack of holistic integration. Some approaches to detect
                         spreaders have integrated information from different levels [15], such as shared information, user
                         profiles, and ego networks. Nonetheless, most efforts focus on disjointed perspectives, either local
                         [8] or global [12]. Holistic integration might limit the risk misinformation poses [16], especially
                         considering the complexity of organized campaigns. Propagation models, which allow us to simulate
                         how misinformation disseminates online, are a way to combine both dimensions.
                            There are significant efforts toward modeling the users and their psychological capabilities or
                         behaviors [17, 7], although none includes the shared information. Some approaches have considered
                          Proceedings of the 1st Workshop on COuntering Disinformation with Artificial Intelligence (CODAI), co-located with the 27th
                          European Conference on Artificial Intelligence (ECAI), pages 62–73, October 20, 2024, Santiago de Compostela, Spain
                          Email: rrodriguez@lsi.uned.es (R. Rodríguez-García); alvarory@lsi.uned.es (Á. Rodrigo); rcenteno@lsi.uned.es (R. Centeno)
                          URL: https://sites.google.com/view/nlp-uned/people/%C3%A1lvaro-rodrigo-yuste (Á. Rodrigo);
                          http://nlp.uned.es/~rcenteno/indice.php (R. Centeno)
                          ORCID: 0009-0000-6964-5956 (R. Rodríguez-García); 0000-0002-6331-4117 (Á. Rodrigo); 0000-0001-9095-4665 (R. Centeno)
                                       © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073
Raquel Rodríguez-García et al. CODAI Workshop Proceedings                                            62–73


the topics of the messages, their emotion, or users’ common interests [18, 19, 20], disregarding the
value of the content by itself. Regarding the macro level, most efforts employ synthetic networks [21].
Other approaches have used real topologies without the matching shared information [22], although it
is crucial to the diffusion [23].
   As becomes apparent, misinformation has been commonly studied from the perspective of separate
signals. Although propagation models present an opportunity to connect them, there is still a lack of
research. The content of the shared information plays a significant role in the diffusion [24], differing
from real information [25]. Current efforts disregard this component, which seems counterintuitive,
given that the users interact with a message, textual or otherwise. With this work, we aim to highlight
current limitations within these models that affect their holistic integration and to explore the
requirements for proper experimental frameworks.
   This paper is structured as follows. We review state-of-the-art propagation models in Section 2. In
Section 3, we expose their limitations, and in Section 4, we identify the requirements for these holistic
models. Finally, in Section 5, we highlight our conclusions.


2. State of the Art
This section reviews the state-of-the-art propagation models. We start with early approaches in Section
2.1, then continue with epidemiological-based models in Section 2.2. In Section 2.3, we introduce
non-epidemiological models, while Section 2.4 covers agent-based social models.

2.1. Early Approaches
Based on ordinary differential equations (ODEs), epidemiological models have been extensively em-
ployed to study the diffusion of a virus within a population [26]. These models introduce one or
several infected individuals into a population. The disease spreads amongst those susceptible until it
has affected the whole group, or its diffusion slowly stops. These models divide the population into
exclusive categories: Susceptible, Infected, and Removed. These are the states the users are in regarding
the disease, and they give this model its name: the SIR model.
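The compartmental dynamics described above can be illustrated with a minimal sketch. It assumes the standard mass-action form of the SIR equations (dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I) and integrates them with a simple forward-Euler step. The function name, the parameter values, and the step size are illustrative assumptions, not taken from the reviewed works.

```python
def simulate_sir(n, i0, beta, gamma, steps, dt=0.1):
    """Forward-Euler integration of a deterministic SIR model.

    n: population size, i0: initially infected individuals,
    beta: contact (infection) rate, gamma: removal rate.
    All parameter values used below are illustrative assumptions.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt   # flow S -> I
        new_removals = gamma * i * dt            # flow I -> R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    trajectory = simulate_sir(n=1000, i0=1, beta=0.5, gamma=0.1, steps=1000)
    s, i, r = trajectory[-1]
    print(f"final S={s:.1f} I={i:.1f} R={r:.1f}")
```

The three compartments always sum to the population size, which is a quick sanity check for any variant of this sketch.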
   One of the first approaches to information diffusion adapts this epidemiological model [27] as an
“intellectual epidemic”, initially devised for its application in Information Retrieval. This approach
creates a simile between the spread of a disease and the dissemination of information. Using the concepts
in the epidemiological model as an analogy, the disease is now an idea or a piece of information, and
the individuals are readers waiting to come into contact with it.
   Stemming from the initial epidemiological model [26], other variations were proposed, such as the
Daley-Kendall (DK) or Ignorant-Spreader-Stifler (ISS) model [28], including rumor-specific concepts,
such as a decay rate to symbolize the forgetting of the information or its “news value”. A later adaptation,
the Maki-Thompson model [29], simplifies the former by altering the rate at which spreaders turn into
stiflers.
   These previous models, and others in this section, might rely on stochastic or deterministic processes.
In a stochastic process, the transitions between the compartments are probabilistic (finite-state Markov
Chain). In a deterministic model, transitions are expressed through differential equations. A determinis-
tic model is simpler than a stochastic one. However, it presents some drawbacks, such as the transition
rates being proportional to the population size [30] and not allowing for individual behavior or network
heterogeneity. Stochastic models incorporate randomness and are also more realistic [31], at the cost of
higher complexity [27].
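To make the stochastic variant concrete, the following sketch simulates one run of a Maki-Thompson style rumor process in a well-mixed population, where each transition is drawn at random instead of following a differential equation. The contact rule used here (a spreader who contacts an ignorant converts them, and a spreader who contacts a spreader or stifler becomes a stifler) is the textbook formulation; the function name and the count-based bookkeeping are assumptions made for brevity.

```python
import random


def maki_thompson(n, seed=None):
    """One stochastic run; returns final (ignorants, spreaders, stiflers)."""
    rng = random.Random(seed)
    ignorant, spreader, stifler = n - 1, 1, 0
    while spreader > 0:
        # The acting spreader contacts one of the other n - 1 individuals
        # uniformly at random; the first `ignorant` slots stand for ignorants.
        pick = rng.randrange(n - 1)
        if pick < ignorant:
            # Contacted an ignorant: they learn the rumor and start spreading.
            ignorant -= 1
            spreader += 1
        else:
            # Contacted a spreader or stifler: the acting spreader stops.
            spreader -= 1
            stifler += 1
    return ignorant, spreader, stifler


if __name__ == "__main__":
    print(maki_thompson(10_000, seed=0))
```

Repeating the run with different seeds gives a distribution of outcomes, which is the extra information a stochastic model provides over its deterministic counterpart.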

2.2. Extension of Epidemiological Models
Based on the previous models, the formal definition of information diffusion we will use in the next
sections corresponds to the interactions within a population of $N$ individuals, with an underlying
graph (directed or undirected) $G = (V, E)$ for a set of vertices $V = \{v_0, \ldots, v_{N-1}\}$ and a set of edges
$E = \{e_0, \ldots, e_{|E|-1}\}$ that connects them. A node represents a user, and the edges between the users
denote the connections, either explicit (follower-followee relationships) or implicit (interaction-based).
Diffusion is measured as users' internal stance (state) regarding the information per time unit.
   Many other models inspired by epidemiological diffusion have been proposed since these early
approaches and adapted to rumor diffusion. In the Susceptible-Infected (SI) model [32], users carry the
information forever. In the Susceptible-Infected-Susceptible (SIS) model [33], the population becomes
Susceptible again, reflecting that users might forget the information. Lastly, the
Susceptible-Infected-Recovered-Susceptible (SIRS) model [34] considers the possibility of gaining
immunity after going through the infection (Recovered), and the possibility of losing it after some time
(Susceptible).
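As a hedged illustration of how such a model runs on the graph defined above, the sketch below performs one synchronous SIS update on an adjacency mapping. The per-contact infection probability `beta`, the forgetting probability `delta`, and the discrete-time update rule are assumptions chosen for simplicity; the cited models may define their transitions differently.

```python
import random


def step_sis(adj, infected, beta, delta, rng):
    """One synchronous SIS update.

    adj: dict mapping each node to the neighbours that can influence it
         (for a directed graph, its in-neighbours).
    infected: set of currently Infected nodes.
    beta: probability that a single infected neighbour transmits the rumor.
    delta: probability that an Infected node forgets and turns Susceptible.
    """
    next_infected = set()
    for v in adj:
        if v in infected:
            # Stays Infected unless it forgets the information.
            if rng.random() >= delta:
                next_infected.add(v)
        else:
            k = sum(1 for u in adj[v] if u in infected)
            # Probability that at least one of k infected neighbours succeeds.
            if rng.random() < 1 - (1 - beta) ** k:
                next_infected.add(v)
    return next_infected


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}
    state = {0}
    rng = random.Random(0)
    for _ in range(5):
        state = step_sis(path, state, beta=0.4, delta=0.2, rng=rng)
    print(sorted(state))
```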
   These models face the problem of a clear divergence between information and epidemic transmission
and the complexity of the former. Information diffusion depends on many factors, such as network
topology or social interactions. Epidemiological models work on the assumption of a homogeneously in-
teracting population, which contrasts with complex social media networks, facing unexpected deviations
from the results obtained in epidemic fields [35, 36]. Another shortcoming involves the compartments
for the population. Individuals might not get Infected but rather turn Fact Checkers against misinforma-
tion or undergo a period of indecision. Due to these limitations, other models aim to include complex
factors not directly extracted from epidemiological behaviors but inspired by their interactions.
   In the Susceptible-Exposed-Infected-Recovered (SEIR) model [37], individuals might go through an
Exposed state after being in contact with an Infected node. Some variations consider the fuzziness of
a rumor and a hesitating mechanism before sharing [38], a Skeptic state where users never share the
information received (SEIZ) [39], or a transition to a Recovered state (SEIZR) [40]. The Susceptible-
Known-Infected-Recovered (SKIR) model [41] creates a state for the individuals that spread the anti-rumor,
drawing inspiration from evolutionary game theory for users’ behaviors. Also modeling their opinions,
the Susceptible-Positively Infected-Negatively Infected-Recovered (SPNR) model [42] includes two different
stances towards the rumor: Positively or Negatively Infected. Regarding their emotion, the Emotion-
based SIS model (ESIS) [19] classifies the message into seven differently weighted classes, such as fear
or happiness, thus rendering some emotions more effective for spreading.
   Other more complex models consider more states, such as the SCNDR model [43], where Susceptible
users might turn Credulous, Neutrals, or Denies, as well as Recovered. Besides believing
the information or not, individuals might share it, not act, or warn other users. The ICSAR model [44]
considers the states: Ignorant, Carrier, Spreader, Advocate and Removed. These states can be further
classified based on whether their information is a rumor or the truth. While users might transition
between the different states and stances, Advocate and Removed are sink states, thus reflecting how
users might not be persuaded to change their opinion.
   As becomes apparent, many models have drawn inspiration from epidemiology studies. Although
they have been extended to account for information diffusion particularities, they still struggle to
reflect intricate behavior. Dividing the population into compartments simplifies the problem, but it
faces the difficulty of reflecting complex social behavior with a discrete label. As an example, in the
$IS_1S_2C_1C_2R_1R_2$ model [45], the difference between the Super Authoritative and Authoritative or Super
Rumor Spreader and Rumor Spreader states might have more to do with node qualities and network
position than with a state in a finite state machine.

2.3. Non-Epidemiological Models
Although epidemiological models have been extensively used in information diffusion, other mathematical
models have been proposed. Some include Independent Cascades, the Linear Threshold model, and
Hawkes Processes.
   Independent Cascades start with a set of active nodes [46]. With each step, they might activate other
surrounding inactive nodes with a set probability dependent on the connecting edge. There is only one
chance for a node to activate its neighbors. This model has been used in the diffusion of information
[47], showing that the dynamics can reflect those of social media [48, 49]. Other variations do not limit
influence to a one-time-only event but a window [50].
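The Independent Cascade process described above can be sketched as follows: starting from a seed set, every newly activated node gets exactly one chance to activate each inactive neighbour, with a probability attached to the connecting edge. The data layout (`out_edges` as a dict of `(neighbour, probability)` pairs) and the function name are assumptions for illustration.

```python
import random
from collections import deque


def independent_cascade(out_edges, seeds, rng):
    """Run one Independent Cascade and return the set of activated nodes.

    out_edges: dict mapping a node to a list of (neighbour, probability) pairs.
    seeds: iterable of initially active nodes.
    """
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        u = frontier.popleft()
        # u is processed once, so each edge (u, v) is tried at most once.
        for v, prob in out_edges.get(u, []):
            if v not in active and rng.random() < prob:
                active.add(v)
                frontier.append(v)
    return active


if __name__ == "__main__":
    edges = {"a": [("b", 0.5), ("c", 0.5)], "b": [("d", 0.5)], "c": [("d", 0.5)]}
    print(sorted(independent_cascade(edges, ["a"], random.Random(42))))
```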
   The Linear Threshold model [51] establishes a threshold of surrounding neighbors for the users to
change their behavior. Once a node is active, it cannot be deactivated. Other variations introduce
weights between the nodes to account for social dynamics [52]. Further adaptations also introduce user
information and the similarity between previously shared content [53].
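A minimal sketch of the weighted Linear Threshold variant follows. It assumes that each node becomes active once the summed weight of its active in-neighbours reaches a node-specific threshold, and that activation is permanent. The dictionaries `in_weights` and `thresholds` and the fixed threshold values are hypothetical; many formulations draw thresholds at random instead.

```python
def linear_threshold(in_weights, thresholds, seeds):
    """Return the final active set of a Linear Threshold diffusion.

    in_weights: dict node -> {in_neighbour: influence weight}.
    thresholds: dict node -> activation threshold.
    seeds: iterable of initially active nodes.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, neighbours in in_weights.items():
            if v in active:
                continue  # active nodes are never deactivated
            influence = sum(w for u, w in neighbours.items() if u in active)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active


if __name__ == "__main__":
    weights = {"c": {"a": 0.5, "b": 0.5}, "d": {"c": 1.0}}
    limits = {"c": 0.8, "d": 0.5}
    print(sorted(linear_threshold(weights, limits, ["a", "b"])))
```

In this toy graph a single seed cannot push node `c` over its threshold, while two seeds together can, which is the collective effect that distinguishes this model from Independent Cascades.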
   The Multivariate Hawkes Process is a type of stochastic point process model characterized by its ability
to self-excite. The Hawkes point process model was originally proposed to investigate earthquake events
[54]. These models have been used for information diffusion on social media [55] and to devise how to
mitigate its effects [56].
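To illustrate self-excitation, the sketch below simulates a univariate Hawkes process with an exponential kernel, where the intensity is a baseline `mu` plus a contribution `alpha * exp(-beta * (t - t_i))` from every past event. It uses a thinning (rejection) scheme. The univariate simplification, the kernel choice, and all parameter values are assumptions; the multivariate models cited above couple several such processes.

```python
import math
import random


def intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given event times up to and including t."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti <= t)


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Thinning-based simulation on [0, horizon); returns sorted event times."""
    events, t = [], 0.0
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound until the next accepted event.
        upper = intensity(t, events, mu, alpha, beta)
        t += rng.expovariate(upper)
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / upper.
        if rng.random() * upper <= intensity(t, events, mu, alpha, beta):
            events.append(t)


if __name__ == "__main__":
    times = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0,
                            rng=random.Random(1))
    print(len(times), "events")
```

Keeping `alpha / beta` below 1 keeps the process from exploding; each share then triggers, on average, fewer than one further share.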
   There are other lesser-known models, such as Push-Pull [57], which employs a pair-wise interaction
where the user shares their information to attempt to “push” or “pull” others. Markov Chains have also
been explored for this task, both discrete-time [58], and continuous-time [59].
   These previous models are most commonly framed as Influence Maximization problems. Normally, a
higher-level controller supervises the optimization and the simulations, so individuality is limited.
Additionally, some of these problems are NP-Hard. Although there are many efforts towards their
reduction and optimization [47, 50], time complexity remains high.
   Beyond the epidemiological analogy, other models have been proposed inspired by different naturally
occurring phenomena. Such is the case of the Energy Model [60] and the Forest Fire Model [61]. The
Energy Model is based on the physical theory of heat energy. This model alters the traditional paradigm
of a binary value for the diffusion, whether the user is infected or not, and leverages a continuous range
of agreement with the rumor, constituting their “energy”.
   The Forest Fire Model [61] is influenced by the process of fire spreading in a forest. Drawing inspiration
from the diverse factors that affect the formation and spread of fire, it creates a simile with social
interactions. The forest density relates to users’ ego networks, and the area’s topography relates
to the account activity. Further extensions allow users to receive the information without sharing
it and a similarity score between them to assess their probability of sharing [18]. Although textual
characteristics are included, they are used to model the users to establish similarity scores through
matching keywords, not as part of the shared content.

2.4. Agent-Based Social Models
A more recent trend is to exploit the potential of Agent-Based Social Systems (ABSS). Most previous
models assume homogeneity in user behavior, influence, or topology, which is limiting [30]. ABSS also
adopts compartmental epidemiological models while solving those issues.
   The SIR epidemiological model has been adapted to ABSS technologies [62]. This model considers
Infected users might get Cured by realizing the rumor is fake and stop sharing. Other studies distinguish
malicious and regular users and study their influence and susceptibility based on a belief system [63].
Similarly, it has been extended to account for bots and influencers with different behaviors [64], as well
as time dynamics or trust measures between agents [65]. These approaches face the problem of only
focusing on user-specific characteristics.
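The following sketch shows how an agent-based formulation lets each user carry its own parameters instead of sharing population-wide rates. The `Agent` fields (`gullibility`, `cure_prob`) and the update rule are hypothetical simplifications of the SIR-style ABSS adaptations mentioned above, not a reproduction of any cited model.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    state: str = "S"          # "S" (Susceptible), "I" (Infected), "R" (Cured)
    gullibility: float = 0.5  # hypothetical: chance of believing a shared rumor
    cure_prob: float = 0.1    # hypothetical: per-step chance of realising it is fake


def step(agents, neighbours, rng):
    """One synchronous step: Infected agents share, then may get Cured."""
    newly_infected = []
    for idx, agent in enumerate(agents):
        if agent.state != "I":
            continue
        for n in neighbours[idx]:
            target = agents[n]
            # Each target decides with its own, individual gullibility.
            if target.state == "S" and rng.random() < target.gullibility:
                newly_infected.append(n)
    for agent in agents:
        if agent.state == "I" and rng.random() < agent.cure_prob:
            agent.state = "R"
    for n in newly_infected:
        if agents[n].state == "S":
            agents[n].state = "I"


if __name__ == "__main__":
    population = [Agent(state="I"), Agent(gullibility=0.9), Agent(gullibility=0.1)]
    links = {0: [1, 2], 1: [0], 2: [0]}
    rng = random.Random(3)
    for _ in range(10):
        step(population, links, rng)
    print([a.state for a in population])
```

Heterogeneous roles such as bots or influencers could be expressed by giving some agents different parameter values, which is the flexibility compartmental ODE models lack.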
   Other efforts have modeled individual processes in users' perceptions, such as an uncertainty-based
SIR model, where uncertainty is modeled through ambiguity and ignorance [66], or a cognitive-inspired
model where belief is measured based on dissonance and exposure [67]. Common social effects
and theories, such as homophily or social influence, have also been studied, for instance through a
segregation between gullibles and skeptics within the population [68], which aids the spread of a rumor,
or a social context based on similarity and influence propagation [69].
   Social sciences have been another interesting topic of research. The Big Five model [70] has been
estimated to explore user similarity [20], homophily regarding political views [21], or a trust model
based on users’ identity, behavior, and relationships [71]. Game theory and decision theory have also
been studied in the context of fake news [7], introducing common deception strategies to benefit from
the uncertainty. Social Impact Theory has also been used for modeling rumors [72] by introducing other
components such as persuasiveness or environmental bias. Lastly, echo chambers are also explored
from different levels [17]: individual, environmental, and technological. Based on their experiments,
the individual level is enough to polarize the networks, but adding the other two components generates
more distinct groups.
   The last and most recent approach exploits the capacity of large language models (LLMs) to simulate
the opinions shared [73, 74]. Each agent potentially has different individual traits, personalities, and
memories, and they can engage in discussions where they can reflect on their opinions and update
them as needed. This new framework allows for fully customizable and rich environments to simulate
how disinformation spreads.
   An advantage of these approaches is the ability to test complex social-based behavior, such as
simultaneous information [75] or real discussions between the agents. Although mathematical models
have been used with centralized and decentralized measures [22], ABSS is more versatile and has been
studied further in this context to identify influential nodes and delay the diffusion process [76], to
study the simultaneous spread of a rumor and its counterinformation [75], and other measures based
on user attention [64]. The main problem in many of these studies is the lack of real data validation.
When including some of these social theories, the need arises to determine information from the user
that might not be easily extracted or determined. This forces the models to employ estimations or
distributions, which introduce biases.


3. Current Limitations
Propagation simulation models have some limitations, which affect their holistic integration. We can
summarize them based on the five main areas we explore below.
   Users. Whenever users are characterized, their metrics are established through probability distribu-
tions or means that cannot be validated, partly due to their complexity and the difficulty of extracting
them from real data. Some proposals also employ psychological models without contemplating that
associations between social media usage and these traits are not always found [77]; they might not
align with the modeled behavior, or they might vary over time.
   Content. Users have been the main focus in this area. The few studies that consider the message do so
through incomplete dimensions [78] or only to establish user similarity based on posted content [18]. As
such, content within the diffusion has not been explored. There is also a strong focus on the dichotomy of
fake and real news. A priori, information is unverified and might remain so. Focusing on the characteristics
of the content rather than its truth value seems more realistic and valuable.
   Network. In most cases, the topologies used are synthetic or do not match the real diffusion.
This makes it impossible to connect users with their characteristics and topology, although it is an
essential component. Another limitation is creating a network with as many users as participants in the
conversation, which already creates an implicit bias. Although it would be computationally impossible
to include all the users in any social media network, only including those participating makes another
issue arise: predicting when users will not participate.
   Internal state. Most studies measure interaction based on states, which reflect an internal measure
of the users participating in the diffusion. Messages are used to make an abstraction of the users’ state.
This also allows intermediate states to reflect user behaviors that cannot be found in the real data. This
situation can be avoided by using the messages directly, reflecting diffusion more accurately since users
can share more than one message, but their state would remain a bounded constant. Messages were
only employed in one study [63], aggregating diffusion into zero messages, over 500, and in-between.
This would suggest that 500 retweets have the same relevance as 50,000, which is unlikely to hold.
   Evaluation. In most cases, validation is done through empirical evaluation or the analysis of
mathematical properties of the diffusion within the networks. Although mathematical properties
provide a theoretical background, real complex networks are characterized by their non-trivial features,
which do not appear in synthetic graphs. Regarding empirical evaluation, incomplete data is most
commonly employed, which raises the question of its validity. Some approaches have been evaluated
by aggregating at the time level [63], which dismisses how relevancy works in social media: 500 retweets
in 10 minutes do not equate to 500 retweets in 10 days.

             Dataset                Content   Temporal   Network   User   Stance   Topic
             FakeNewsNet [79]          ✓         ✓          ✗        ✓       ✗      Politics
             Palin and Obama [80]      ✗         ✓          ✗        ✓       ✓      Politics
             ReCOVery [81]             ✓         ✓          ✗        ✓       ✗      COVID-19
             CoAID [82]                ✓         ✓          ✗        ✓       ✗      COVID-19
             MediaEval [11]            ✓         ✗          ✓        ✗       ✓      Conspiracies
             PHEME-9 [83]              ✓         ✓          ✓        ✓       ✓      General
             SNAP [84]                 ✗         ✗          ✓        ✗       ✗      -

Table 1
Available datasets for information propagation models and their main characteristics
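The point about temporal aggregation in the Evaluation paragraph can be illustrated with a small sketch: two cascades with the same final count of 500 retweets look identical once aggregated, but their cumulative curves at intermediate checkpoints differ sharply. The timestamps below are synthetic values constructed for this illustration, and the helper name is an assumption.

```python
from bisect import bisect_right


def cumulative_counts(timestamps, checkpoints):
    """Number of shares observed up to each checkpoint (all times in seconds)."""
    ordered = sorted(timestamps)
    return [bisect_right(ordered, c) for c in checkpoints]


if __name__ == "__main__":
    ten_minutes, one_hour, ten_days = 600, 3600, 864_000
    # Synthetic cascades: one retweet per second vs. one every 1728 seconds.
    fast = [k for k in range(500)]
    slow = [k * 1728 for k in range(500)]
    checkpoints = [ten_minutes, one_hour, ten_days]
    print("fast:", cumulative_counts(fast, checkpoints))
    print("slow:", cumulative_counts(slow, checkpoints))
```

Comparing a simulated cascade against the real one at several checkpoints, rather than only at the end, is one way an evaluation could preserve this temporal information.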


4. Requirements for a Proper Experimental Framework
A proper experimental framework is required to overcome propagation models’ limitations. Within
this empirical evaluation, it is important to recreate the scenarios of a news piece’s diffusion on social
media. Datasets that contain the necessary information to evaluate the models are crucial. Below, we
explore the most relevant information required for this process.
   Information of the shared content (Content) and the users (User). The information shared is an
essential part of the diffusion. This includes the initial post, external website links, or visual content.
In terms of the users, since they are the main focus of these models, it is important to have enough
information on user metrics and engagement to properly characterize them.
   Temporal information of when the texts are shared, by whom, and which users engage with it at
what given times (Temporal). Besides the texts, we need to know the timestamps of when those posts
are shared to determine the evolution of the news: whether more information is added or corrected, as
well as how many times it appears at different times. Within the simulation, it determines when users
are engaging, which is crucial for the evaluation.
   The social network (Network). This is an important element in the diffusion of content online.
Although synthetic networks might reflect some properties of real social media networks, they pose a
significant limitation since diffusion inherently depends on those connections. Users with millions of
followers will have higher chances of broadcasting information than new users.
   Posts labeled with their stance (Stance). This is a relevant measure to study and evaluate diffusion in
terms of epidemiological-based models. Distinguishing between Infected and Vaccinated is essential,
equivalent to users’ stance towards a post (Support or Oppose).
   After establishing these requirements, we review available datasets to determine their suitability.
In Table 1, we include the most relevant ones we found and their main characteristics. We have
excluded datasets created ad hoc, since they are not publicly available and typically require retrieving
new data, as well as those centered around topics unrelated to misinformation.
   As illustrated in Table 1, most datasets focus on the Content and Temporal aspects (the tweets and
timestamps), and the User information from the poster. Some datasets, such as MediaEval, anonymize
the tweets by removing the time when tweets were posted and removing the information from the users.
These features are essential to establish the diffusion of information. Regarding this type of content,
the SNAP collection does not provide diffusion information; it only shares the topologies from social
media networks. Although it is valuable information, the diffusion that matches the network is deemed
necessary. Some other collections, such as PHEME-9, include information regarding the users’ state or
stance towards the information. This information is also essential to evaluate epidemiological-based
models.
   Only two of the listed datasets include Network information associated with the diffusion: PHEME-9
and MediaEval. MediaEval poses an additional problem due to its topology, created based on an interaction
network. It is also significantly filtered and skewed: 3,800 tweets are associated with a network of 1.7
million nodes and 270 million edges. Based on these available resources, we can see that most currently
available datasets do not provide enough information for a proper evaluation. This is an important
limitation and highlights the need for more publicly available content for the community to further
research efforts into mitigating fake news.
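The dataset review can be summarized programmatically. The sketch below encodes the availability marks of Table 1 and reports which of the five requirements each dataset lacks. The dictionary layout and function names are assumptions for illustration; the marks themselves are transcribed from the table.

```python
REQUIRED = {"Content", "Temporal", "Network", "User", "Stance"}

# Dimensions marked as available for each dataset in Table 1.
DATASETS = {
    "FakeNewsNet": {"Content", "Temporal", "User"},
    "Palin and Obama": {"Temporal", "User", "Stance"},
    "ReCOVery": {"Content", "Temporal", "User"},
    "CoAID": {"Content", "Temporal", "User"},
    "MediaEval": {"Content", "Network", "Stance"},
    "PHEME-9": {"Content", "Temporal", "Network", "User", "Stance"},
    "SNAP": {"Network"},
}


def missing(name):
    """Requirements that the named dataset does not cover."""
    return REQUIRED - DATASETS[name]


def complete_datasets():
    """Datasets covering every requirement, in table order."""
    return [name for name in DATASETS if not missing(name)]


if __name__ == "__main__":
    for name in DATASETS:
        print(f"{name}: missing {sorted(missing(name)) or 'nothing'}")
    print("complete:", complete_datasets())
```

Under this encoding only PHEME-9 covers all five dimensions, which matches the observation that most available datasets are insufficient for a full evaluation.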


5. Conclusions and Future Work
Current misinformation-related tasks and approaches show a clear divide between the micro level, or
the content of the information, and the macro level, or the social network. There is a lack of holistic
integration between the different tools to address misinformation. Propagation models are one tool
that would allow a holistic approach by studying the diffusion of online misinformation from local and
global perspectives.
   With this work, we have studied current approaches to propagation models, from early
approaches with the SIR epidemiological model to non-epidemiological models, such as the Forest Fire
Model, and agent-based systems. From these approaches, we have identified some common limitations
that constrain the holistic view. Among these constraints, the most relevant one is disregarding the
information shared within the network, which is treated as a black box. To address these limitations, we
have determined the main requirements for a proper experimental framework.
   In terms of future work, we believe it is paramount to focus on overcoming these limitations by
developing new models that consider the impact of the messages on the users. Additionally, proposing
new evaluation frameworks that overcome the limitations of the users' stances, such as focusing on the
messages, is another interesting research avenue. Lastly, developing new publicly available datasets
with the required information is crucial for evaluating these models.


Acknowledgments
This work has been partially funded by the Spanish Research Agency (Agencia Estatal de Investigación),
through the DeepInfo project PID2021-127777OB-C22 (MCIU/AEI/FEDER, UE) and the HOLISTIC
ANALYSIS OF ORGANISED MISINFORMATION ACTIVITY IN SOCIAL NETWORKS project (PCI2022-
135026-2).


References
 [1] C. Martel, G. Pennycook, D. G. Rand, Reliance on Emotion Promotes Belief in Fake News, Cognitive
     Research: Principles and Implications 5 (2020) 1–20. doi:10.1186/s41235-020-00252-3.
 [2] E. Ferrara, H. Chang, E. Chen, G. Muric, J. Patel, Characterizing Social Media Manipulation in the
     2020 U.S. Presidential Election, First Monday 25 (2020). doi:10.5210/fm.v25i11.11431.
 [3] A. Guess, B. Nyhan, J. Reifler, Selective Exposure to Misinformation: Evidence from the
     Consumption of Fake News during the 2016 U.S. Presidential Campaign, European Research Council 9
     (2018) 4.
 [4] J. Y. Cuan-Baltazar, M. J. Muñoz-Perez, C. Robledo-Vega, M. F. Pérez-Zepeda, E. Soto-Vega,
     Misinformation of COVID-19 on the Internet: Infodemiology Study, JMIR Public Health and Surveillance
     6 (2020) 1–9. doi:10.2196/18444.
 [5] H. Rashkin, E. Choi, J. Y. Jang, S. Volkova, Y. Choi, Truth of Varying Shades: Analyzing Language
     in Fake News and Political Fact-Checking, in: Proceedings of the 2017 Conference on Empirical
     Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen,
     Denmark, 2017, pp. 2931–2937. doi:10.18653/v1/D17-1317.
 [6] S. Raza, C. Ding, Fake News Detection Based on News Content and Social Contexts: A
     Transformer-Based Approach, International Journal of Data Science and Analytics 13 (2022) 335–362.
     doi:10.1007/s41060-021-00302-z.



Raquel Rodríguez-García et al. CODAI Workshop Proceedings                                       62–73


 [7] C. Kopp, K. B. Korb, B. I. Mills, Information-Theoretic Models of Deception: Modelling Cooperation
     and Diffusion in Populations Exposed to "Fake News", PLOS ONE 13 (2018) e0207383.
     doi:10.1371/journal.pone.0207383.
 [8] B. Hu, Q. Sheng, J. Cao, Y. Shi, Y. Li, D. Wang, P. Qi, Bad Actor, Good Advisor: Exploring the
     Role of Large Language Models in Fake News Detection, Proceedings of the AAAI Conference on
     Artificial Intelligence 38 (2024) 22105–22113. doi:10.1609/aaai.v38i20.30214.
 [9] C. Buntain, J. Golbeck, Automatically Identifying Fake News in Popular Twitter Threads, in: 2017
     IEEE International Conference on Smart Cloud (SmartCloud), 2017, pp. 208–215.
     doi:10.1109/SmartCloud.2017.40.
[10] P. Bazmi, M. Asadpour, A. Shakery, Multi-View Co-Attention Network for Fake News Detection by
     Modeling Topic-Specific User and News Source Credibility, Information Processing & Management
     60 (2023) 103146. doi:10.1016/j.ipm.2022.103146.
[11] K. Pogorelov, D. T. Schroeder, S. Brenner, A. Maulana, J. Langguth, Combining Tweets and
     Connections Graph for FakeNews Detection at MediaEval 2022, in: MediaEval 2022, volume 3583,
     2022, pp. 1–4.
[12] G. Caldarelli, R. De Nicola, F. Del Vigna, M. Petrocchi, F. Saracco, The Role of Bot Squads in
     the Political Propaganda on Twitter, Communications Physics 3 (2020) 1–15.
     doi:10.1038/s42005-020-0340-4.
[13] K. Neha, V. Agrawal, S. Chhatani, R. Sharma, A. B. Buduru, P. Kumaraguru, Understanding
     Coordinated Communities through the Lens of Protest-Centric Narratives: A Case Study on
     #CAA Protest, in: Proceedings of the International AAAI Conference on Web and Social Media,
     volume 18, 2024, pp. 1123–1133. doi:10.1609/icwsm.v18i1.31377.
[14] K. Shu, H. R. Bernard, H. Liu, Studying Fake News via Network Analysis: Detection and Mitigation,
     Springer International Publishing, Cham, 2019, pp. 43–65. doi:10.1007/978-3-319-94105-9_3.
[15] S. Sharma, R. Sharma, Identifying Possible Rumor Spreaders on Twitter: A Weak Supervised
     Learning Approach, in: 2021 International Joint Conference on Neural Networks (IJCNN), IEEE,
     2021, pp. 1–8. doi:10.1109/ijcnn52387.2021.9534185.
[16] A. Peñas, J. Deriu, R. Sharma, G. Valentin, J. Reyes-Montesinos, Holistic Analysis of Organised
     Misinformation Activity in Social Networks, in: Disinformation in Open Online Media, Springer
     Nature Switzerland, Cham, 2023, pp. 132–143. doi:10.1007/978-3-031-47896-3_10.
[17] D. Geschke, J. Lorenz, P. Holtz, The Triple-filter Bubble: Using Agent-based Modelling to Test
     a Meta-theoretical Framework for the Emergence of Filter Bubbles and Echo Chambers, British
     Journal of Social Psychology 58 (2019) 129–149. doi:10.1111/bjso.12286.
[18] S. Kumar, M. Saini, M. Goel, B. S. Panda, Modeling Information Diffusion in Online Social Networks
     Using a Modified Forest-Fire Model, Journal of Intelligent Information Systems 56 (2021) 355–377.
     doi:10.1007/s10844-020-00623-8.
[19] Q. Wang, Z. Lin, Y. Jin, S. Cheng, T. Yang, ESIS: Emotion-based Spreader–Ignorant–Stifler Model
     for Information Diffusion, Knowledge-Based Systems 81 (2015) 46–55.
     doi:10.1016/j.knosys.2015.02.006.
[20] L. Milli, Opinion Dynamic Modeling of News Perception, Applied Network Science 6 (2021) 76.
     doi:10.1007/s41109-021-00412-4.
[21] A. Coates, T. Muller, S. Sirur, Simulating the Impact of Personality on Fake News, in:
     TRUST@AAMAS, 2021, pp. 1–12.
[22] A. N. Zehmakan, C. Out, S. Hesamipour Khelejan, Why Rumors Spread Fast in Social Networks,
     and How to Stop It, in: Proceedings of the Thirty-Second International Joint Conference on
     Artificial Intelligence, International Joint Conferences on Artificial Intelligence Organization,
     Macau, SAR China, 2023, pp. 234–242. doi:10.24963/ijcai.2023/27.
[23] M. Karnstedt, M. Rowe, J. Chan, H. Alani, C. Hayes, The Effect of User Features on Churn in
     Social Networks, in: Proceedings of the 3rd International Web Science Conference, WebSci ’11,
     Association for Computing Machinery, New York, NY, USA, 2011, pp. 1–8.
     doi:10.1145/2527031.2527051.





[24] B. Horne, S. Adali, This Just In: Fake News Packs A Lot In Title, Uses Simpler, Repetitive Content
     in Text Body, More Similar To Satire Than Real News, Proceedings of the International AAAI
     Conference on Web and Social Media 11 (2017) 759–766. doi:10.1609/icwsm.v11i1.14976.
[25] S. Vosoughi, D. Roy, S. Aral, The Spread of True and False News Online, Science 359 (2018)
     1146–1151. doi:10.1126/science.aap9559.
[26] W. O. Kermack, A. G. McKendrick, A Contribution to the Mathematical Theory of Epidemics,
     Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and
     Physical Character 115 (1927) 700–721. doi:10.1098/rspa.1927.0118.
[27] W. Goffman, V. A. Newill, Generalization of Epidemic Theory: An Application to the Transmission
     of Ideas, Nature 204 (1964) 225–228. doi:10.1038/204225a0.
[28] D. J. Daley, D. G. Kendall, Epidemics and Rumours, Nature 204 (1964) 1118–1118.
     doi:10.1038/2041118a0.
[29] D. P. Maki, M. Thompson, Mathematical Models and Applications, Prentice-Hall, 1973.
[30] D. J. Daley, D. G. Kendall, Stochastic Rumours, IMA Journal of Applied Mathematics 1 (1965)
     42–55. doi:10.1093/imamat/1.1.42.
[31] T. Britton, Stochastic Epidemic Models: A Survey, Mathematical Biosciences 225 (2010) 24–35.
     doi:10.1016/j.mbs.2010.01.006.
[32] D. Shah, T. Zaman, Rumors in a Network: Who’s the Culprit?, IEEE Transactions on Information
     Theory 57 (2011) 5163–5181. doi:10.1109/tit.2011.2158885.
[33] M. Kimura, K. Saito, H. Motoda, Efficient Estimation of Influence Functions for SIS Model
     on Social Networks, in: Proceedings of the 21st International Joint Conference on Artificial
     Intelligence, IJCAI’09, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2009, pp.
     2046–2051. doi:10.5555/1661445.1661772.
[34] R. Escalante, M. Odehnal, A Deterministic Mathematical Model for the Spread of Two Rumors,
     Afrika Matematika 31 (2019) 315–331. doi:10.1007/s13370-019-00726-8.
[35] M. Nekovee, Y. Moreno, G. Bianconi, M. Marsili, Theory of Rumour Spreading in Complex Social
     Networks, Physica A: Statistical Mechanics and its Applications 374 (2007) 457–470.
     doi:10.1016/j.physa.2006.07.017.
[36] R. Pastor-Satorras, A. Vespignani, Epidemic Spreading in Scale-Free Networks, Physical Review
     Letters 86 (2001) 3200–3203. doi:10.1103/PhysRevLett.86.3200.
[37] C. Wang, K. Xu, G. Zhang, A SEIR-based Model for Virus Propagation on SNS, in: 2013 Fourth
     International Conference on Emerging Intelligent Data and Web Technologies, IEEE, Xi’an, 2013,
     pp. 479–482. doi:10.1109/EIDWT.2013.86.
[38] L.-L. Xia, G.-P. Jiang, B. Song, Y.-R. Song, Rumor Spreading Model Considering Hesitating
     Mechanism in Complex Social Networks, Physica A: Statistical Mechanics and its Applications
     437 (2015) 295–303. doi:10.1016/j.physa.2015.05.113.
[39] F. Jin, E. Dougherty, P. Saraf, Y. Cao, N. Ramakrishnan, Epidemiological Modeling of News and
     Rumors on Twitter, in: Proceedings of the 7th Workshop on Social Network Mining and Analysis,
     ACM, Chicago Illinois, 2013, pp. 1–9. doi:10.1145/2501025.2501027.
[40] L. M. A. Bettencourt, A. Cintrón-Arias, D. I. Kaiser, C. Castillo-Chávez, The Power of a Good Idea:
     Quantitative Modeling of the Spread of Ideas from Epidemiological Models, Physica A: Statistical
     Mechanics and its Applications 364 (2006) 513–536. doi:10.1016/j.physa.2005.08.083.
[41] Y. Xiao, D. Chen, S. Wei, Q. Li, H. Wang, M. Xu, Rumor Propagation Dynamic Model Based
     on Evolutionary Game and Anti-Rumor, Nonlinear Dynamics 95 (2019) 523–539.
     doi:10.1007/s11071-018-4579-1.
[42] Y. Bao, C. Yi, Y. Xue, Y. Dong, A New Rumor Propagation Model and Control Strategy on Social
     Networks, in: 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis
     and Mining (ASONAM 2013), 2013, pp. 1472–1473. doi:10.1109/ASONAM.2013.6785909.
[43] W. Hong, Z. Gao, Y. Hao, X. Li, A Novel SCNDR Rumor Propagation Model on Online Social
     Networks, in: 2015 IEEE International Conference on Consumer Electronics - Taiwan, IEEE, 2015,
     pp. 154–155. doi:10.1109/ICCE-TW.2015.7216829.
[44] N. Zhang, H. Huang, B. Su, J. Zhao, B. Zhang, Dynamic 8-State ICSAR Rumor Propagation Model
     Considering Official Rumor Refutation, Physica A: Statistical Mechanics and its Applications 415
     (2014) 333–346. doi:10.1016/j.physa.2014.07.023.
[45] Y. Zhang, Y. Su, L. Weigang, H. Liu, Rumor and Authoritative Information Propagation Model
     Considering Super Spreading in Complex Social Networks, Physica A: Statistical Mechanics and
     its Applications 506 (2018) 395–411. doi:10.1016/j.physa.2018.04.082.
[46] D. Kempe, J. Kleinberg, É. Tardos, Maximizing the Spread of Influence through a Social Network,
     in: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery
     and Data Mining, ACM, Washington, D.C., 2003, pp. 137–146. doi:10.1145/956750.956769.
[47] N. P. Nguyen, G. Yan, M. T. Thai, S. Eidenbenz, Containment of Misinformation Spread in Online
     Social Networks, in: Proceedings of the 4th Annual ACM Web Science Conference, ACM, Evanston
     Illinois, 2012, pp. 213–222. doi:10.1145/2380718.2380746.
[48] A. Kalogeratos, K. Scaman, L. Corinzia, N. Vayatis, Chapter 24 - Information Diffusion and Rumor
     Spreading, in: Cooperative and Graph Signal Processing, Academic Press, 2018, pp. 651–678.
     doi:10.1016/B978-0-12-813677-5.00024-9.
[49] J. Leskovec, L. Backstrom, J. Kleinberg, Meme-Tracking and the Dynamics of the News Cycle,
     in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery
     and Data Mining, KDD ’09, Association for Computing Machinery, New York, NY, USA, 2009, pp.
     497–506. doi:10.1145/1557019.1557077.
[50] W. Lee, J. Kim, H. Yu, CT-IC: Continuously Activated and Time-Restricted Independent Cascade
     Model for Viral Marketing, in: 2012 IEEE 12th International Conference on Data Mining, IEEE,
     2012, pp. 960–965. doi:10.1109/icdm.2012.40.
[51] D. J. Watts, A Simple Model of Global Cascades on Random Networks, Proceedings of the National
     Academy of Sciences 99 (2002) 5766–5771. doi:10.1073/pnas.082090499.
[52] Y. Zhuang, A. Arenas, O. Yağan, Clustering Determines the Dynamics of Complex Contagions in
     Multiplex Networks, Physical Review E 95 (2017) 012312. doi:10.1103/PhysRevE.95.012312.
[53] C. Lagnier, L. Denoyer, E. Gaussier, P. Gallinari, Predicting Information Diffusion in Social
     Networks Using Content and User’s Profiles, in: Advances in Information Retrieval, Springer,
     Berlin, Heidelberg, 2013, pp. 74–85. doi:10.1007/978-3-642-36973-5_7.
[54] A. G. Hawkes, Spectra of Some Self-Exciting and Mutually Exciting Point Processes, Biometrika
     58 (1971) 83–90. doi:10.2307/2334319.
[55] Y. Jiang, M. D. Porter, Simulating Fake News Dissemination on Twitter with Multivariate Hawkes
     Processes, in: 2022 IEEE International Conference on Big Data (Big Data), IEEE, Osaka, Japan,
     2022, pp. 3597–3606. doi:10.1109/BigData55660.2022.10020285.
[56] M. Farajtabar, J. Yang, X. Ye, H. Xu, R. Trivedi, E. Khalil, S. Li, L. Song, H. Zha, Fake News Mitigation
     via Point Process Based Intervention, in: Proceedings of the 34th International Conference on
     Machine Learning, PMLR, 2017, pp. 1097–1106.
[57] M. Caglar, O. Ozkasap, A Chain-Binomial Model for Pull and Push-Based Information Diffusion,
     in: 2006 IEEE International Conference on Communications, IEEE, Istanbul, 2006, pp. 909–914.
     doi:10.1109/ICC.2006.254823.
[58] D. A. Vega-Oliveros, L. d. F. Costa, F. A. Rodrigues, Rumor Propagation with Heterogeneous
     Transmission in Social Networks, Journal of Statistical Mechanics: Theory and Experiment 2017
     (2017) 023401. doi:10.1088/1742-5468/aa58ef.
[59] T. Zhu, B. Wang, B. Wu, C. Zhu, Maximizing the Spread of Influence Ranking in Social Networks,
     Information Sciences 278 (2014) 535–544. doi:10.1016/j.ins.2014.03.070.
[60] S. Han, F. Zhuang, Q. He, Z. Shi, X. Ao, Energy Model for Rumor Propagation on Social Networks,
     Physica A: Statistical Mechanics and its Applications 394 (2014) 99–109.
     doi:10.1016/j.physa.2013.10.003.
[61] V. Indu, S. M. Thampi, A Nature-Inspired Approach Based on Forest Fire Model for Modeling
     Rumor Propagation in Social Networks, Journal of Network and Computer Applications 125 (2019)
     28–41. doi:10.1016/j.jnca.2018.10.003.
[62] E. Serrano, C. A. Iglesias, Validating Viral Marketing Strategies in Twitter via Agent-Based Social
     Simulation, Expert Systems with Applications 50 (2016) 140–150.
     doi:10.1016/j.eswa.2015.12.021.
[63] A. Averza, K. Slhoub, S. Bhattacharyya, Evaluating the Influence of Twitter Bots via Agent-Based
     Social Simulation, IEEE Access 10 (2022) 129394–129407. doi:10.1109/ACCESS.2022.3228258.
[64] A. Gausen, W. Luk, C. Guo, Can We Stop Fake News? Using Agent-Based Modelling to Evaluate
     Countermeasures for Misinformation on Social Media, in: ICWSM Workshops, 2021, pp. 1–5.
[65] Q. F. Lotito, D. Zanella, P. Casari, Realistic Aspects of Simulation Models for Fake News Epidemics
     over Social Networks, Future Internet 13 (2021) 76. doi:10.3390/fi13030076.
[66] J.-H. Cho, S. Rager, J. O’Donovan, S. Adali, B. D. Horne, Uncertainty-Based False Information
     Propagation in Social Networks, ACM Transactions on Social Computing 2 (2019) 1–34.
     doi:10.1145/3311091.
[67] N. Rabb, L. Cowen, J. P. De Ruiter, M. Scheutz, Cognitive Cascades: How to Model (and Potentially
     Counter) the Spread of Fake News, PLOS ONE 17 (2022) e0261811.
     doi:10.1371/journal.pone.0261811.
[68] M. Tambuscio, D. F. M. Oliveira, G. L. Ciampaglia, G. Ruffo, Network Segregation in a Model of
     Misinformation and Fact-Checking, Journal of Computational Social Science 1 (2018) 261–275.
     doi:10.1007/s42001-018-0018-9.
[69] W. Li, Q. Bai, M. Zhang, A Multi-agent System for Modelling Preference-Based Complex Influence
     Diffusion in Social Networks, The Computer Journal 62 (2019) 430–447.
     doi:10.1093/comjnl/bxy078.
[70] P. Costa, R. McCrae, Personality in Adulthood: A Five-Factor Theory Perspective, Management
     Information Systems Quarterly - MISQ (2002). doi:10.4324/9780203428412.
[71] R. F. Muhammad, S. Kasahara, Agent-Based Simulation of Fake News Dissemination: The Role of
     Trust Assessment and Big Five Personality Traits on News Spreading, Social Network Analysis
     and Mining 14 (2024) 75. doi:10.1007/s13278-024-01235-8.
[72] S.-H. Tseng, T. Son Nguyen, Agent-Based Modeling of Rumor Propagation Using Expected
     Integrated Mean Squared Error Optimal Design, Applied System Innovation 3 (2020) 48.
     doi:10.3390/asi3040048.
[73] Y. Liu, X. Chen, X. Zhang, X. Gao, J. Zhang, R. Yan, From Skepticism to Acceptance: Simulating
     the Attitude Dynamics Toward Fake News, arXiv preprint (2024). arXiv:2403.09498.
[74] J. Pastor-Galindo, P. Nespoli, J. A. Ruipérez-Valiente, Large-Language-Model-Powered Agent-
     Based Framework for Misinformation and Disinformation Research: Opportunities and Open
     Challenges, IEEE Security & Privacy 22 (2024) 24–36. doi:10.1109/MSEC.2024.3380511.
[75] J. Brainard, P. R. Hunter, Misinformation Making a Disease Outbreak Worse: Outcomes
     Compared for Influenza, Monkeypox, and Norovirus, SIMULATION 96 (2019) 365–374.
     doi:10.1177/0037549719885021.
[76] C. Marshall, J. Cruickshank, C. O’Riordan, Identifying Influential Nodes to Inhibit Bootstrap
     Percolation on Hyperbolic Networks, in: 2018 IEEE/ACM International Conference on Advances
     in Social Networks Analysis and Mining (ASONAM), IEEE, Barcelona, 2018, pp. 1266–1273.
     doi:10.1109/ASONAM.2018.8508248.
[77] D. Azucar, D. Marengo, M. Settanni, Predicting the Big 5 Personality Traits from Digital Footprints
     on Social Media: A Meta-Analysis, Personality and Individual Differences 124 (2018) 150–159.
     doi:10.1016/j.paid.2017.12.018.
[78] Y. Wang, D. Jin, C. Yang, J. Dang, Integrating Group Homophily and Individual Personality of
     Topics Can Better Model Network Communities, in: 2020 IEEE International Conference on Data
     Mining (ICDM), IEEE, Sorrento, Italy, 2020, pp. 611–620. doi:10.1109/ICDM50108.2020.00070.
[79] K. Shu, D. Mahudeswaran, S. Wang, D. Lee, H. Liu, FakeNewsNet: A Data Repository with News
     Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media,
     Big Data 8 (2020) 171–188. doi:10.1089/big.2020.0062.
[80] V. Qazvinian, E. Rosengren, D. R. Radev, Q. Mei, Rumor Has It: Identifying Misinformation in
     Microblogs, in: Proceedings of the 2011 Conference on Empirical Methods in Natural Language
     Processing, Association for Computational Linguistics, Edinburgh, Scotland, UK, 2011,
     pp. 1589–1599. URL: https://aclanthology.org/D11-1147.





[81] X. Zhou, A. Mulay, E. Ferrara, R. Zafarani, ReCOVery: A Multimodal Repository for
     COVID-19 News Credibility Research, in: Proceedings of the 29th ACM International Conference on
     Information & Knowledge Management, 2020, pp. 3205–3212. doi:10.1145/3340531.3412880.
[82] L. Cui, D. Lee, CoAID: COVID-19 Healthcare Misinformation Dataset, arXiv preprint (2020).
     doi:10.48550/arXiv.2006.00885, arXiv:2006.00885.
[83] A. Zubiaga, M. Liakata, R. Procter, G. Wong Sak Hoi, P. Tolmie, Analysing How People Orient to
     and Spread Rumours in Social Media by Looking at Conversational Threads, PLOS ONE 11 (2016)
     e0150989. doi:10.1371/journal.pone.0150989.
[84] J. Leskovec, A. Krevl, SNAP Datasets: Stanford Large Network Dataset Collection,
     http://snap.stanford.edu/data, 2014.



