<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julio Reyes-Montesinos</string-name>
          <email>jreyes@lsi.uned.es</email>
          <aff><institution>IR group at UNED</institution>, <addr-line>Madrid</addr-line>, <country country="ES">Spain</country></aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anselmo Peñas</string-name>
          <aff><institution>IR group at UNED</institution>, <addr-line>Madrid</addr-line>, <country country="ES">Spain</country></aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Deriu</string-name>
          <email>jan.deriu@zhaw.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rajesh Sharma</string-name>
          <email>rajesh.sharma@ut.ee</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guilhem Valentin</string-name>
          <email>guilhem.valentin@synapse-fr.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CAI, ZHAW</institution>
          ,
          <addr-line>Zürich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Synapse Développement</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Tartu</institution>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In order to face the current context of organised intentional misinformation campaigns, collectively known as disinformation, society must be aware not only of fake news, but also of the agents introducing misleading information, their supporting media, the nodes they use in social networks, their propaganda techniques, and their narratives and intentions. This challenge must be addressed in a holistic way, considering all these dimensions in order to identify, characterise and describe orchestrated disinformation campaigns. The HAMiSoN project aims to treat misinformation from this holistic view. The main challenge is integrating the message and the network levels. To tackle this challenge, we propose to reveal misinformation's hidden intents: which agents introduce disinformation into social media, which narratives they use, and with which concrete aims (such as polarising, destabilising, generating distrust, destroying reputations, etc.). We must also identify malicious and harmed agents and provide this information to the final analysts and users in explainable ways. Identifying misleading messages, knowing their narratives and hidden intentions, modelling their diffusion in social networks, and monitoring the sources of disinformation will also give us the chance to react faster to the spreading of disinformation.</p>
      </abstract>
      <kwd-group>
        <kwd>misinformation</kwd>
        <kwd>fake news</kwd>
        <kwd>media analysis</kwd>
        <kwd>multi-modal analysis</kwd>
        <kwd>natural language processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Among the different kinds of misinformation, perhaps the most dangerous is the kind created
with the intention to harm, polarise, destabilise, generate distrust, destroy reputations, etc.,
by spreading untrue information [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. In this scenario of organised intentional misinformation
campaigns (also called disinformation<sup>1</sup>), current fact-checking strategies are not
enough [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>NLP-MisInfo 2023: SEPLN 2023 Workshop on NLP applied to Misinformation, held as part of SEPLN 2023: 39th
https://nlp.uned.es/~anselmo (A. Peñas); https://www.zhaw.ch/en/about-us/person/deri/ (J. Deriu)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings</p>
      <sec id="sec-1-1">
        <p><sup>1</sup>From here on, we use misinformation and disinformation interchangeably.</p>
        <p>
          Fact-checkers need Artificial Intelligence tools to help them identify the most important
claims to check (check-worthiness), detect claims that they have already checked (verified claim
retrieval), and check claims as soon as possible [
          <xref ref-type="bibr" rid="ref4">4, 5</xref>
          ]. However, false news spreads six times faster
than true news [6], and 50% of fake news propagation occurs in the first 10 minutes [7]. Fake
news is carefully prepared to behave in this way: it has an intention (not always explicit)
and a coordinated spreading.
        </p>
        <p>Given this scenario of organised intentional misinformation campaigns, we need strategies
to anticipate and mitigate the spreading of disinformation [8]. We, as a society, must be
aware not only of fake news, but also of the agents that introduce false or misleading
information, their supporting media, the nodes they use in social networks, the propaganda
techniques they use, their narratives, and their intentions. Therefore, we must address this
challenge in a holistic way, considering the different dimensions involved in the spreading
of disinformation and bringing them together to truly identify and describe orchestrated
disinformation campaigns. These dimensions are:</p>
      </sec>
      <sec id="sec-1-2">
        <list list-type="order">
          <list-item><p>Detecting misinformation</p></list-item>
          <list-item><p>Acknowledging its organised spreading in social networks</p></list-item>
          <list-item><p>Identifying its malicious intent</p></list-item>
          <list-item><p>Bringing everything together</p></list-item>
        </list>
        <p>To address the challenge of working on these dimensions, we have proposed the HAMiSoN
(Holistic Analysis of Organised Misinformation Activity in Social Networks) project<sup>2</sup> under the
CHIST-ERA<sup>3</sup> 2021 call. Our main audience is the analysts who make use of services such as
fact-checkers for further analysis and a better understanding of the agents and narratives involved
in disinformation campaigns. Making explicit the hidden intention behind disinformation
campaigns will raise citizens’ awareness. For this purpose, however, we need to move from just
checking single messages, or just analysing alterations in the social network, to seeing the
complete picture. For example, one of our use cases concerns international observers
in political elections. These observers analyse the fake news in aggregate and
identify the communication campaigns, and the narratives they employ, aimed at destroying a political
opponent’s reputation.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Project goals</title>
      <p>The overall goals of this project are to raise social awareness and mitigate disinformation
propagation by making explicit the context behind the intentional spreading of misleading information:
sources and means of diffusion, stance and bias, intentionality and narratives.</p>
      <p>This project is articulated around the following specific goals:</p>
      <list list-type="order">
        <list-item><p>Develop models and systems for disinformation identification at the message level: claim
check-worthiness detection, stance detection, and multilingual verified claim retrieval.</p></list-item>
        <list-item><p>Analyse the organised diffusion of disinformation and its narratives at the social network
level.</p></list-item>
        <list-item><p>Integrate both sources of evidence (message level and network level) for better
identification of organised misinformation campaigns.</p></list-item>
        <list-item><p>Create evaluation datasets in multiple languages (English, Spanish, French, German,
Estonian) and two modalities (text in Twitter messages, and video streams).</p></list-item>
        <list-item><p>Organise shared tasks for competitive evaluation of stance detection on Twitter and
claim check-worthiness on videos.</p></list-item>
        <list-item><p>Develop demonstration applications and a simulation tool for analysing potential
coordinated disinformation campaigns in social networks.</p></list-item>
      </list>
      <p><sup>2</sup>http://nlp.uned.es/hamison-project/ <sup>3</sup>https://www.chistera.eu/</p>
    </sec>
    <sec id="sec-3">
      <title>3. Description</title>
      <p>Integrating multimodal models [9] for misinformation detection with network models of
misinformation diffusion to identify large misinformation campaigns [10] and their narratives
constitutes a novel, holistic view of misinformation. It poses considerable and exciting challenges
at both the conceptual and the technical level.</p>
      <p>There are two main current technologies for the detection of disinformation. One,
related to the needs of fact-checkers, focuses on the processing and analysis of single messages.
The other, related to the detection of disinformation campaigns organised to influence a social
network, relies on social network analysis: highly similar behaviour of different user accounts
along time series is an indication of a disinformation campaign.</p>
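      <p>The time-series signal mentioned above can be sketched in a few lines: accounts whose
posting activity rises and falls in near lock-step are candidate members of a coordinated
campaign. The account names, the hourly-bin feature and the 0.9 threshold below are our own
illustrative assumptions, not part of the project.</p>
      <preformat>
```python
# Flag account pairs whose hourly posting counts are strongly correlated.
# Purely illustrative: real coordination detection uses richer behavioural
# features than a single Pearson-correlation threshold.
import numpy as np

def coordination_pairs(activity, threshold=0.9):
    """activity maps an account name to a 1-D array of posts per hour."""
    names = sorted(activity)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = np.corrcoef(activity[names[i]], activity[names[j]])[0, 1]
            if r > threshold:
                flagged.append((names[i], names[j]))
    return flagged

activity = {
    "account_a": np.array([5, 0, 7, 1, 6, 0, 8, 2], dtype=float),
    "account_b": np.array([4, 0, 6, 1, 5, 0, 7, 2], dtype=float),  # mirrors a
    "organic":   np.array([1, 2, 0, 3, 1, 2, 1, 0], dtype=float),
}
print(coordination_pairs(activity))  # only the mirrored pair is flagged
```
      </preformat>
      <p>In practice, pairwise correlation would be computed over sliding windows and combined
with content similarity before flagging anything.</p>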
      <p>However, the two research lines remain separate fields, although each gives context to
the other. In fact, current AI models for misinformation detection are limited in their ability to
represent and consider contextual information. This is a research frontier we want to address.</p>
      <p>HAMiSoN’s most ambitious goal is the integration of different technologies at both the
message and the social network level into a single system. Although we plan to take advantage of
the hidden variable they share (their intentionality), many research questions remain
to be addressed.</p>
      <p>A straightforward approach would be to run all involved systems separately and then compare
and combine their outputs. However, the systems then do not leverage each other’s signals and, in fact, the
current state of the art achieves rather low performance.</p>
      <p>The alternative we want to explore is what we call a “holistic” approach, in which all tasks are
considered simultaneously by one integrated system. This resembles the end-to-end
approach with neural networks that replaced component-based architectures for several NLP
tasks.</p>
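      <p>A toy sketch of why such an integrated system can help (our own minimal example, not the
project’s architecture): one shared encoder feeds two task heads, here labelled stance and
check-worthiness, so gradient updates from either task shape the shared representation.</p>
      <preformat>
```python
# Minimal multi-task sketch: a shared encoder with two task heads,
# trained jointly on one toy example. All shapes and labels are invented.
import numpy as np

rng = np.random.default_rng(0)
W_shared = rng.normal(size=(4, 3)) * 0.1   # shared encoder
W_stance = rng.normal(size=(3, 1)) * 0.1   # head 1: stance
W_worth = rng.normal(size=(3, 1)) * 0.1    # head 2: check-worthiness

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(x, y_stance, y_worth, lr=0.5):
    global W_shared, W_stance, W_worth
    h = np.tanh(x @ W_shared)              # shared representation
    p1, p2 = sigmoid(h @ W_stance), sigmoid(h @ W_worth)
    g1, g2 = p1 - y_stance, p2 - y_worth   # dLoss/dlogit for cross-entropy
    dh = g1 @ W_stance.T + g2 @ W_worth.T  # both tasks update the encoder
    dz = dh * (1.0 - h ** 2)               # tanh derivative
    W_stance -= lr * h.T @ g1
    W_worth -= lr * h.T @ g2
    W_shared -= lr * x.T @ dz
    return p1.item(), p2.item()

x = np.array([[1.0, 0.0, 1.0, 0.5]])
for _ in range(500):
    p1, p2 = step(x, 1.0, 0.0)
print(round(p1, 2), round(p2, 2))
```
      </preformat>
      <p>In the pipeline approach above, each head would be trained in isolation; here the shared
encoder receives gradients from both heads, which is the property multi-task setups exploit.</p>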
      <p>Apart from solving the “whole” task, i.e. the detection and description of organised
disinformation campaigns, we see great potential to also improve each single subtask, since the subtasks gain
access to much more data and insights. This hope is motivated by the success of multi-task
learning, where even unrelated subtasks help each other (Zhang and Yang, 2021). Messages
that would be missed by message-level analysis could be uncovered at this deeper latent level if
they are strongly connected to an identified potentially harmful network and, provided with
contextual information to better interpret their intention, eventually be brought to the attention
of analysts.</p>
      <p>The project aims to establish solid proofs of concept to validate the core of the technology
developed to articulate the various AI models that detect, analyse and mitigate coordinated
disinformation on social networks, bringing it to a Technology Readiness Level of 4. Qualitative
evaluation will validate the holistic modelling of disinformation, and in vitro
experiments will be run in simulated environments using the simulation tools that will
be developed.</p>
      <p>The project spans several areas of computer science (CS), together with network science
and the social sciences. With respect to CS, the project requires expertise in: (i) Natural Language
Processing for textual disinformation detection; (ii) Automatic Speech Recognition and Image
Analysis for multimodal disinformation detection. Network science and theoretical modelling
of disinformation diffusion are required to simulate various real scenarios. The holistic approach
proposed in the project brings a natural way to integrate techniques from these different fields,
enabling cross-fertilisation and synergy.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Expected impacts</title>
      <sec id="sec-4-1">
        <title>4.1. Scientific and technological impact</title>
        <p>The integration of evidence coming from the message, the social network, and the intended
hidden goals will be an important step towards a new modelling approach that can improve the state
of the art and produce more effective tools for the detection of organised misinformation
campaigns.</p>
        <p>The project will apply and combine Natural Language Processing, Artificial Intelligence
techniques and Social Network Analysis (machine learning, multi-agent simulation,
speech-to-text for audio transcription, image recognition, lifelong learning, etc.) to the
identification of disinformation in text, images and video across several platforms. This joint approach
will enable better modelling of coordinated disinformation campaigns, helping to train more
efficient and adaptive methods for stance detection, claim check-worthiness detection, verified claim
retrieval, similar-message clustering and disinformation propagation modelling.</p>
        <p>Whereas some techniques for misinformation identification tasks at both the message
level and the network level may be validated with existing metrics, the combination of textual and
non-textual features, as well as the aggregation of the message-content and network perspectives
for identifying orchestrated disinformation campaigns, are still open research problems. They
involve finding new metrics that correlate with the disinformation network modelling, as well as
with the impact of mitigation actions on disinformation propagation through the network.</p>
        <p>In particular, multilingual claim similarity models will have an impact on verified claim
retrieval and on the clustering of similar messages across different languages. The consideration of
textual and non-textual features will have an impact on stance detection and news fact-checking.
The simulation of disinformation spreading in social networks will have an impact on the
evaluation of mitigation actions. The design of new architectures to gather and leverage evidence
from different levels (message, network, intention) will have an impact on the methodologies
for disinformation detection. The release of datasets in languages other than English for stance
detection, for claim detection in transcripts of video posts, and for harmful narratives during
political elections will have an impact on the research community and on technology transfer.</p>
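        <p>As a minimal sketch of the verified-claim-retrieval idea, character trigrams can stand in
for the multilingual sentence encoders a real system would use; the fact-checked claims below are
invented examples, not project data.</p>
        <preformat>
```python
# Illustrative stand-in for verified claim retrieval: rank fact-checked
# claims by character-trigram cosine similarity to an incoming message.
from collections import Counter
import math

def trigrams(text):
    t = " " + text.lower() + " "
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(c1, c2):
    shared = set(c1).intersection(c2)
    dot = sum(c1[g] * c2[g] for g in shared)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2)

verified = [
    "The earth is flat",
    "Vaccines cause autism",
    "Drinking bleach cures covid",
]

def best_match(message):
    m = trigrams(message)
    return max(verified, key=lambda v: cosine(m, trigrams(v)))

print(best_match("vaccines are causing autism in children"))
```
        </preformat>
        <p>Character n-grams survive inflection ("cause" vs "causing") but not translation; that is
precisely the gap multilingual embedding models are meant to close.</p>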
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Social impact</title>
        <p>By analysing the sources of disinformation, how they interconnect across various posts and messages
as well as across modalities and languages, and by shedding light on their narratives, the project
will provide the means for a deeper understanding of disinformation in social networks and media
and, indirectly, help to better anticipate and limit its propagation.</p>
        <p>The project will develop tools to assist fact-checking organisations, civil society, NGOs
and political observers in detecting disinformation campaigns better and sooner, which will
indirectly benefit the whole community of social media users by providing means to contextualise
disinformation messages.</p>
        <p>In particular, being able to make explicit, and attach to messages, the intentions behind them,
the propaganda techniques they may use, their sources, and their subjectivity, stance and degree of bias
will help provide evidence of disinformation activity and increase awareness among
social media users. By modelling disinformation propagation in simulated social network
environments, the project will also provide means to adjust mitigation actions against the
spreading of disinformation, by evaluating and measuring their potential effect on the agents
and on the propagation.</p>
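        <p>The kind of in-vitro mitigation experiment described above can be sketched with an
independent-cascade toy model; the graph, the spread probability and the "remove the hub"
action below are all invented for illustration.</p>
        <preformat>
```python
# Toy independent-cascade simulation: seed a rumour, spread it along
# edges with probability p, then measure a mitigation action (removing
# the hub, node 0). Graph and parameters are invented for illustration.
import random

def cascade(edges, seeds, p=0.9):
    rng = random.Random(0)                 # fixed seed: repeatable runs
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    infected, frontier = set(seeds), list(seeds)
    while frontier:
        node = frontier.pop()
        for nxt in neighbours.get(node, []):
            if nxt not in infected and p > rng.random():
                infected.add(nxt)
                frontier.append(nxt)
    return len(infected)

edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 5), (4, 6), (5, 6)]
before = cascade(edges, seeds=[1])
pruned = [e for e in edges if 0 not in e]  # mitigation: drop the hub
after = cascade(pruned, seeds=[1])
print(before, after)  # reach shrinks once the hub is removed
```
        </preformat>
        <p>Comparing the reached audience before and after an intervention is exactly the kind of
measurement a simulation environment makes cheap to repeat across many random graphs.</p>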
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the CHIST-ERA HAMiSoN project grant
CHIST-ERA-21-OSNEM002, by AEI PCI2022-135026-2, SNF 20CH21 209672, ANR ANR-22-CHR4-0004 and ETAg.</p>
    </sec>
    <sec id="sec-6">
      <title>References</title>
      <p>[5] S. Vasileva, P. Atanasova, L. Màrquez, A. Barrón-Cedeño, P. Nakov, It takes nine to
smell a rat: Neural multi-task learning for check-worthiness prediction, arXiv preprint
arXiv:1908.07912 (2019).
[6] S. Vosoughi, D. Roy, S. Aral, The spread of true and false news online, Science 359 (2018)
1146–1151.
[7] T. Zaman, E. B. Fox, E. T. Bradlow, A Bayesian approach for predicting the popularity of
tweets (2014).
[8] A. Zubiaga, M. Liakata, R. Procter, Learning reporting dynamics during breaking news for
rumour detection in social media, arXiv preprint arXiv:1610.07363 (2016).
[9] M. Dhawan, S. Sharma, A. Kadam, R. Sharma, P. Kumaraguru, Game-on: Graph attention
network based multimodal fusion for fake news detection, arXiv preprint arXiv:2202.12478
(2022).
[10] S. Sharma, R. Sharma, Identifying possible rumor spreaders on Twitter: A weak supervised
learning approach, in: 2021 International Joint Conference on Neural Networks (IJCNN),
IEEE, 2021, pp. 1–8.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schütz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schindler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nazemi</surname>
          </string-name>
          ,
          <article-title>Automatic fake news detection with pretrained transformer models</article-title>
          , in:
          <source>Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10-15, 2021, Proceedings, Part VII</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>627</fpage>
          -
          <lpage>641</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>dEFEND: Explainable fake news detection</article-title>
          ,
          <source>in: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery &amp; data mining</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>395</fpage>
          -
          <lpage>405</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Findings of the NLP4IF-2019 shared task on fine-grained propaganda detection</article-title>
          , in:
          <source>Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>162</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Patwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Goldwasser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bagchi</surname>
          </string-name>
          ,
          <article-title>Tathya: A multi-classifier system for detecting checkworthy statements in political debates</article-title>
          ,
          <source>in: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>2259</fpage>
          -
          <lpage>2262</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>