=Paper= {{Paper |id=Vol-3677/preface |storemode=property |title=Overview of ROMCIR 2024: The 4th Workshop on Reducing Online Misinformation through Credible Information Retrieval |pdfUrl=https://ceur-ws.org/Vol-3677/xpreface.pdf |volume=Vol-3677 |authors=Marinella Petrocchi,Marco Viviani |dblpUrl=https://dblp.org/rec/conf/ecir/X24 }} ==Overview of ROMCIR 2024: The 4th Workshop on Reducing Online Misinformation through Credible Information Retrieval== https://ceur-ws.org/Vol-3677/xpreface.pdf
Overview of ROMCIR 2024: The 4th Workshop on
Reducing Online Misinformation through Credible
Information Retrieval
Marinella Petrocchi (1,2), Marco Viviani (3,∗)
(1) IIT-CNR, Via G. Moruzzi, 1 – 56124 Pisa, Italy
(2) IMT School for Advanced Studies, Piazza San Francesco, 19 – 55100 Lucca, Italy
(3) University of Milano-Bicocca (DISCo – IKR3 Lab), Edificio U14 (ABACUS), Viale Sarca, 336 – 20126 Milan, Italy


            Abstract
            ROMCIR 2024, the fourth edition of the Workshop on Reducing Online Misinformation through
            Credible Information Retrieval, is one of the Satellite Events of the 46th European Conference on
            Information Retrieval (ECIR 2024). ROMCIR serves as a platform for discussions on accessing accurate
            information and addressing the information disorder prevalent in today’s online landscape. The
            challenge is multifaceted, encompassing various types of information sources (e.g., websites, social media
            posts) across different platforms and domains (e.g., fake news detection, health-related information
            retrieval, propaganda reduction). Additionally, there is a critical need to assess how generative
            models such as Large Language Models (LLMs) may inadvertently amplify misinformation, and to explore
            their potential role in supporting Information Retrieval Systems (IRSs). In this context, diverse
            approaches to the problem of access to truthful information are welcomed. The keynote speech and the
            articles in this year’s workshop focus on themes such as health misinformation, multimedia and
            multimodal fact-checking, and information filtering to combat misinformation.

            Keywords
            Information Retrieval, information disorder, information truthfulness, misinformation, explainability,
            Large Language Models.




ROMCIR 2024: The 4th Workshop on Reducing Online Misinformation through Credible Information Retrieval (held as
part of ECIR 2024: the 46th European Conference on Information Retrieval), March 24, 2024, Glasgow, UK
∗ Corresponding author.
Email: marinella.petrocchi@iit.cnr.it (M. Petrocchi); marco.viviani@unimib.it (M. Viviani)
Web: https://www.iit.cnr.it/en/marinella.petrocchi (M. Petrocchi); http://www.ir.disco.unimib.it/people/marco-viviani/ (M. Viviani)
ORCID: 0000-0003-0591-877X (M. Petrocchi); 0000-0002-2274-9050 (M. Viviani)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


1. Introduction
In the July 1983 issue of The New York Times, American historian Daniel J. Boorstin remarked
on the computerization of libraries as follows:

       Technology is so much fun but we can drown in our technology. The fog of information
       can drive out knowledge.

  Approximately four decades later, this remark is arguably even more relevant than when it was made.
Indeed, the advent of Web 2.0 technologies has ushered in a process of disintermediation in
the creation and dissemination of online content within the Social Web, resulting in
well-documented challenges such as information overload [1, 2] and the proliferation of
misinformation [3]. These issues hamper users’ ability to access truly valuable information for their needs
[4, 5]. Furthermore, recent advances in generative models such as Large Language Models
(LLMs) pose a new threat, as they can generate text that mimics human writing but may lack
accuracy and truthfulness [6, 7, 8].
   Therefore, the ROMCIR Workshop focuses on studying and developing Information Retrieval
(IR) solutions aimed at providing users with access to relevant and truthful information, while
also addressing the phenomenon of information disorder across various domains. Information
disorder encompasses a spectrum of issues, from unintentional misinformation rooted in
ignorance or bias to deliberate dissemination of false content, both manually and through automated
means [9, 10]. This challenge is exacerbated by filter bubbles and echo chambers prevalent in
the digital ecosystem [11, 12, 13, 14, 15].
   The resolution of the information disorder issue is inherently complex, involving various
types of content, web platforms, and user objectives. Furthermore, emerging AI-related concerns
such as the explainability of search results [16, 17], assessment of truthfulness in user-generated
content [5, 18], and the use of generative models to support IR systems require attention [19, 20].
In addition, ensuring data confidentiality, especially in unstructured data, is paramount [21, 22].
In this context, the development of appropriate experimental evaluation paradigms for IR
systems is crucial [23, 24].


2. Aim and topics of interest
The Workshop seeks to explore the challenges surrounding online information disorder within
the realm of Information Retrieval, while also delving into associated domains of Artificial
Intelligence such as Natural Language Processing (NLP), Natural Language Understanding
(NLU), Computer Vision, Machine Learning, and Deep Learning. Therefore, the focal points of
interest for ROMCIR 2024 encompass, yet are not confined to:
    • Artificial Intelligence and information truthfulness assessment
    • Bot/spam/troll detection
    • Computational fact-checking/truthfulness assessment
    • Crowdsourcing for information truthfulness assessment
    • Disinformation/misinformation and bias detection
    • Generative models and information truthfulness assessment
    • Harassment/bullying/hate speech detection
    • Information polarization in online communities, echo chambers
    • Propaganda identification/analysis
    • Retrieval and evaluation of truthful information
    • Security, privacy, and information truthfulness
    • Sentiment/emotional analysis and stance detection
    • Societal reaction to misinformation
    • Trust, reputation, and misinformation


3. Keynote Speaker
David E. Losada. He is a full professor in Computer Science and Artificial Intelligence at the
University of Santiago de Compostela (Spain). He received his BS in Computer Science in 1997 and
his PhD in Computer Science in 2001, both with honors, from the University of A Coruña (Spain).
From 2001 to 2002, he was a Lecturer at San Pablo-CEU University (Spain), and in 2003 he joined the
University of Santiago de Compostela as a senior research fellow (“Ramón y Cajal” R&D programme).
His current research interests include a wide range of Information Retrieval (IR) and related areas,
such as IR probabilistic models, summarization, novelty detection, sentence retrieval, patent search,
and opinion mining. He is an active member of the IR community and has participated in the Programme
Committees of prestigious international conferences such as SIGIR and ECIR. He has also led several
R&D projects and contracts in the area of search technologies. In 2011 he was recognized with an
ACM senior member award. Website:
https://citius.gal/team/david-enrique-losada-carril/

Health misinformation detection: search challenges, annotation issues and reliability of LLMs.
In this presentation, we will share insights from our work at CiTIUS (University of
Santiago de Compostela, Spain) on the development of technological and scientific solutions
for detecting health misinformation. We will delve into the complexities of developing a
multifaceted retrieval system for misinformation detection that integrates multiple content-based
features. The challenges of creating robust credibility benchmarks, given the subjective nature
of credibility, will also be discussed. Lastly, we will share our recent efforts to evaluate the
quality of LLMs’ responses to health-related queries.


4. Submissions
The ROMCIR 2024 workshop received 14 submissions, of which 6 were accepted, resulting in
an acceptance rate of approximately 43%. The accepted papers came from five countries:
Germany (2), Italy (1), The Netherlands (1), Poland (1), and Spain (1). This year, submissions
particularly focused on health misinformation, fact-checking (including
multimedia and multimodal approaches), and information filtering and misinformation.
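
The acceptance figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative. It also confirms that the per-country counts given above add up to the number of accepted papers.

```python
# Figures taken from the text: 14 submissions, 6 accepted papers.
submissions = 14
accepted = 6

# Acceptance rate as a fraction of all submissions.
acceptance_rate = accepted / submissions
print(f"Acceptance rate: {acceptance_rate:.1%}")  # prints "Acceptance rate: 42.9%"

# Per-country counts as listed in the text.
papers_by_country = {
    "Germany": 2,
    "Italy": 1,
    "The Netherlands": 1,
    "Poland": 1,
    "Spain": 1,
}
total_listed = sum(papers_by_country.values())
print(f"Papers listed by country: {total_listed}")  # prints "Papers listed by country: 6"
```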
   The work by Fernández Pichel, Bink, Losada, and Elsweiler focuses on assessing the credibility
of information in the context of health information seeking. The authors acknowledge the
subjectivity and bias susceptibility of this process and emphasize the importance of defining
robust guidelines for credibility assessment. Through a study involving 1,000 participants, they
demonstrate a correlation between participants’ judgments and the reference values established
following such guidelines. Further data analyses reveal concerning insights into people’s ability
to evaluate the credibility of online medical content, posing the risk of personal harm.
   Mongelli, Maiano, and Amerini’s work focuses on enhancing deepfake detection by
simultaneously analyzing audio and visual cues, proposing the Convolutional Multimodal deepfake
detection model (CMDD). This novel approach improves detection accuracy by leveraging
the power of Convolutional Neural Networks (CNNs) to extract spatial and temporal features
concurrently. Frick and Steinebach address the challenge of combating false information on
social media by proposing a method to assess the check-worthiness of tweets. Their approach
incorporates analysis of image content, captions, and text obtained from optical character
recognition to outperform existing recognition techniques. Vogel, Möhle, Meghana, and Steinebach’s
work focuses on detecting check-worthy statements to prioritize claims for fact-checking. They
propose an adapter fusion model combining task and Named Entity Recognition (NER) adapters,
achieving state-of-the-art results in check-worthiness benchmarks.
   Regarding the last theme, information filtering and misinformation, Hornig, Pera, and Scholtes
examine the propagation of misinformation in the domain of video recommendation. They evaluate a range of
top-N recommendation algorithms to assess their effectiveness in minimizing misinformation
while optimizing overall performance. Their empirical exploration highlights the potential of
certain algorithms, including neighborhood-based, neural, and advanced collaborative filtering
approaches, in combating misinformation and promoting responsible recommender systems.
Hasimi and Poniszewska-Maranda focus on the broader implications of fake news and
disinformation on human rights, particularly freedom of speech. They explore the use of Artificial
Intelligence (AI) in detecting and filtering disinformation, highlighting the risks to freedom of
expression posed by censorship and the suppression of critical thinking.
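
As a rough illustration of the kind of measurement involved when top-N recommenders are compared on misinformation exposure, the sketch below computes the share of flagged items in a top-N list. It is not the metric or the code used by Hornig, Pera, and Scholtes; the function name, the video identifiers, and the set of flagged items are all hypothetical and serve only to make the idea concrete.

```python
from typing import List, Set


def misinformation_share_at_n(ranked: List[str], flagged: Set[str], n: int) -> float:
    """Fraction of the top-n recommended items that are flagged as misinformation.

    Hypothetical helper for illustration; returns 0.0 for an empty list.
    """
    top_n = ranked[:n]
    if not top_n:
        return 0.0
    return sum(item in flagged for item in top_n) / len(top_n)


# Hypothetical data: item ids flagged as misinformation by some annotation process.
flagged_items = {"v2", "v4"}

# Hypothetical ranked outputs of two recommenders for the same user.
ranking_a = ["v1", "v2", "v3", "v4"]
ranking_b = ["v1", "v3", "v5", "v2"]

share_a = misinformation_share_at_n(ranking_a, flagged_items, n=3)
share_b = misinformation_share_at_n(ranking_b, flagged_items, n=3)

print(f"Recommender A: {share_a:.2f}")  # prints "Recommender A: 0.33"
print(f"Recommender B: {share_b:.2f}")  # prints "Recommender B: 0.00"
```

In practice such a share would be reported alongside a standard accuracy measure, since a recommender that avoids misinformation but returns irrelevant items is of little use.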


5. Past Editions
The first three editions of the ROMCIR Workshop, all co-located with the ECIR conference, fostered
lively discussion and the presentation of innovative work on a variety of open issues
related to information disorder and IR. The first edition took place online on April
1, 2021. The second edition took place both in person in Stavanger, Norway, and online, on
April 10, 2022. The third edition took place in person in Dublin, Ireland, on April 2, 2023.
The papers accepted for the first three editions of ROMCIR are collected in CEUR Proceedings
[25, 26, 27], which are freely accessible. Updated information on past and current ROMCIR
editions can be found on the official website: https://romcir.disco.unimib.it/


6. Workshop Organization
Marinella Petrocchi. She is a Senior Researcher at the Institute of Informatics and Telematics
of the National Research Council (IIT-CNR) in Pisa, Italy, under the Trust, Security, and Privacy
research unit. She also collaborates with the Sysma unit at IMT School for Advanced Studies, in
Lucca, Italy. Her field of research lies between Cybersecurity, Artificial Intelligence, and
Data Science. Specifically, she studies novel techniques for online fake news/fake accounts
detection and automated methods to rank the reputability of online news media. She is the author
of several international publications on these themes and regularly gives talks and lectures on
the topic. She is CNR lead and WP leader of Humane: Holistic sUpports to inforMAtioN disordErs,
a sub-project of SERICS (PE00000014), https://serics.eu/en/, NRRP MUR program funded by the EU - NGEU.
Website: https://www.iit.cnr.it/en/marinella.petrocchi/

Marco Viviani. He is an Associate Professor at the Department of Informatics, Systems, and
Communication of the University of Milano-Bicocca, Italy. He received his M.Sc. and Ph.D. in
Computer Science from the University of Milan (La Statale), Italy. He was later a postdoctoral
fellow at both Italian (University of Insubria) and foreign institutions (University of Burgundy
and INSA Lyon, France). He is involved in organizing several research initiatives at the
international level. He was the General Co-chair of MDAI 2019 and organized several Workshops and
Special Tracks at International Conferences. He is an Associate Editor of “Social Network Analysis
and Mining” (SNAM) and “Frontiers in Artificial Intelligence - Natural Language Processing”, an
Area Editor (Web Intelligence and E-Services) of the “International Journal of Computational
Intelligence Systems” (IJCIS), and an Editorial Board Member of “Online Social Networks and Media”.
His main research interests include Social Computing, Information Retrieval, Natural Language
Processing, Privacy, and Trust. On these topics, he has published more than 90 research works
in International Journals, at International Conferences, as Monographs and Book Chapters.
Website: https://ikr3.disco.unimib.it/people/marco-viviani/

6.1. Program Committee Members
    • John Bianchi, IMT Scuola Alti Studi Lucca, Italy
    • Edoardo Di Paolo, Università degli Studi di Roma “La Sapienza”, Italy
    • Tiziano Fagni, Istituto di Informatica e Telematica – CNR, Italy
    • Carlos A. Iglesias, Universidad Politécnica de Madrid, Spain
    • Udo Kruschwitz, Universität Regensburg, Germany
    • David Losada, Universidad de Santiago de Compostela, Spain
    • Lorenzo Mannocci, Università degli Studi di Pisa, Italy
    • Gabriella Pasi, Università degli Studi di Milano-Bicocca, Italy
    • Marinella Petrocchi, Istituto di Informatica e Telematica – CNR, Italy
    • Manuel Pratelli, IMT Scuola Alti Studi Lucca, Italy
    • Daisy Romanini, Istituto di Informatica e Telematica – CNR, Italy
    • Paolo Rosso, Universitat Politècnica de València, Spain
    • Irene Sánchez Rodríguez, IMT Scuola Alti Studi Lucca, Italy
    • Fabio Saracco, Centro Ricerche Enrico Fermi, Italy
    • Serena Tardelli, Istituto di Informatica e Telematica – CNR, Italy
    • Marco Viviani, Università degli Studi di Milano-Bicocca, Italy


Acknowledgments
Partially supported by re-DESIRE: DissEmination of ScIentific REsults 2.0, funded by IIT–CNR;
by SERICS (PE00000014) under the NRRP MUR program funded by the EU - NGEU; by the
PNRR ICSC National Research Centre for High Performance Computing, Big Data and Quantum
Computing (CN00000013), under the NRRP MUR program funded by the NextGenerationEU; by
KURAMi: Knowledge-based, explainable User empowerment in Releasing private data and
Assessing Misinformation in online environments, under the PRIN MUR 2022 program (20225WTRFN),
https://kurami.disco.unimib.it/.


References
 [1] D. Bawden, C. Holtham, N. Courtney, Perspectives on information overload, in: Aslib
     proceedings, volume 51, MCB UP Ltd, 1999, pp. 249–255.
 [2] I. Khaleel, B. C. Wimmer, G. M. Peterson, S. T. R. Zaidi, E. Roehrer, E. Cummings, K. Lee,
     Health information overload among health consumers: A scoping review, Patient education
     and counseling 103 (2020) 15–32.
 [3] S. Chen, L. Xiao, A. Kumar, Spread of misinformation on social media: What contributes
     to it and how to combat it, Computers in Human Behavior 141 (2023) 107643.
 [4] G. Pasi, M. Viviani, Information credibility in the social web: Contexts, approaches, and
     open issues, arXiv preprint arXiv:2001.09473 (2020).
 [5] M. Viviani, G. Pasi, Credibility in social media: opinions, news, and health information—a
     survey, Wiley interdisciplinary reviews: Data mining and knowledge discovery 7 (2017)
     e1209.
 [6] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, Y. Iwasawa, Large language models are zero-shot
     reasoners, Advances in neural information processing systems 35 (2022) 22199–22213.
 [7] S. Monteith, T. Glenn, J. R. Geddes, P. C. Whybrow, E. Achtyes, M. Bauer, Artificial
     intelligence and increasing misinformation, The British Journal of Psychiatry (2023) 1–3.
 [8] D. Xu, S. Fan, M. Kankanhalli, Combating Misinformation in the Era of Generative AI
     Models, in: Proceedings of the 31st ACM International Conference on Multimedia, 2023,
     pp. 9291–9298.
 [9] C. Wardle, H. Derakhshan, Information disorder: Toward an interdisciplinary framework
     for research and policy making, Council of Europe 27 (2017).
[10] M. Pratelli, M. Petrocchi, F. Saracco, R. De Nicola, Online disinformation in the 2020 U.S.
     election: swing vs. safe states, EPJ Data Sci. 13 (2024) 25. URL: https://doi.org/10.1140/
     epjds/s13688-024-00461-6.
[11] E. Bozdag, J. Van Den Hoven, Breaking the filter bubble: democracy and design, Ethics
     and information technology 17 (2015) 249–265.
[12] M. Del Vicario, A. Bessi, F. Zollo, F. Petroni, A. Scala, G. Caldarelli, H. E. Stanley,
     W. Quattrociocchi, The spreading of misinformation online, Proceedings of the National Academy
     of Sciences 113 (2016) 554–559.
[13] G. Villa, G. Pasi, M. Viviani, Echo chamber detection and analysis: a topology- and
     content-based approach in the covid-19 scenario, Social Network Analysis and Mining 11 (2021)
     78.
[14] M. Pratelli, F. Saracco, M. Petrocchi, Entropy-based detection of twitter echo chambers,
     CoRR abs/2308.01750 (2023). URL: https://doi.org/10.48550/arXiv.2308.01750.
[15] M. Mattei, et al., Bow-tie structures of twitter discursive communities, Scientific Reports
     12 (2022). URL: https://doi.org/10.1038/s41598-022-16603-7.
[16] A. Anand, P. Sen, S. Saha, M. Verma, M. Mitra, Explainable information retrieval, in:
     Proceedings of the 46th International ACM SIGIR Conference on Research and Development
     in Information Retrieval, 2023, pp. 3448–3451.
[17] R. Upadhyay, P. Knoth, G. Pasi, M. Viviani, Explainable online health information
     truthfulness in consumer health search, Frontiers in Artificial Intelligence 6 (2023) 1184851.
[18] M. Soprano, K. Roitero, D. La Barbera, D. Ceolin, D. Spina, S. Mizzaro, G. Demartini,
     The many dimensions of truthfulness: Crowdsourcing misinformation assessments on a
     multidimensional scale, Information Processing & Management 58 (2021) 102710.
[19] F. Cabitza, D. Ciucci, G. Pasi, M. Viviani, Responsible ai in healthcare, arXiv preprint
     arXiv:2203.03616 (2022).
[20] M. Najork, Generative information retrieval, in: Proceedings of the 46th International
     ACM SIGIR Conference on Research and Development in Information Retrieval, 2023, pp.
     1–1.
[21] G. Livraga, M. Viviani, Data confidentiality and information credibility in on-line
     ecosystems, in: Proceedings of the 11th international conference on management of digital
     ecosystems, 2019, pp. 191–198.
[22] G. Livraga, A. Olzojevs, M. Viviani, Unveiling the privacy risk: A trade-off between user
     behavior and information propagation in social media, in: International Conference on
     Complex Networks and Their Applications, Springer, 2023, pp. 277–290.
[23] C. Lioma, J. G. Simonsen, B. Larsen, Evaluation measures for relevance and credibility in
     ranked lists, in: Proceedings of the ACM SIGIR International Conference on Theory of
     Information Retrieval, 2017, pp. 91–98.
[24] H. Suominen, L. Goeuriot, L. Kelly, L. A. Alemany, E. Bassani, N. Brew-Sam, V. Cotik,
     D. Filippo, G. González-Sáez, F. Luque, P. Mulhem, G. Pasi, R. Roller, S. Seneviratne,
     R. Upadhyay, J. Vivaldi, M. Viviani, C. Xu, Overview of the CLEF eHealth Evaluation Lab
     2021, in: K. S. Candan, B. Ionescu, L. Goeuriot, B. Larsen, H. Müller, A. Joly, M. Maistro,
     F. Piroi, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality,
     and Interaction, Springer International Publishing, Cham, 2021, pp. 308–323.
[25] M. Petrocchi, M. Viviani, Overview of ROMCIR 2022: The 2nd Workshop on Reducing
     Online Misinformation through Credible Information Retrieval, in: ROMCIR 2022 CEUR
     Workshop Proc, volume 3138, 2022, pp. i–vii.
[26] M. Petrocchi, M. Viviani, Overview of ROMCIR 2023: The 3rd Workshop on Reducing
     Online Misinformation through Credible Information Retrieval, in: ROMCIR 2023 CEUR
     Workshop Proc, volume 3406, 2023, pp. i–ix.
[27] F. Saracco, M. Viviani, Overview of ROMCIR 2021: Workshop on Reducing Online
     Misinformation through Credible Information Retrieval, in: ROMCIR 2021 CEUR Workshop
     Proc, volume 2838, 2021, pp. i–vii.