Beyond Facts: Online Discourse and Knowledge Graphs
A preface to the proceedings of the 1st International Workshop on Knowledge Graphs for Online Discourse Analysis (KnOD 2021, co-located with TheWebConf’21)

Konstantin Todorov
todorov@lirmm.fr
LIRMM, University of Montpellier, CNRS, Montpellier, France

Pavlos Fafalios
fafalios@ics.forth.gr
Information Systems Laboratory, ICS-FORTH, Heraklion, Greece

Stefan Dietze
stefan.dietze@gesis.org
GESIS & Heinrich-Heine-University Düsseldorf, Cologne, Germany

ABSTRACT
Expressing opinions and interacting with others on the Web has led to the production of an abundance of online discourse data, such as claims and viewpoints on controversial topics, their sources and contexts. This data constitutes a valuable source of insights for studies into misinformation spread, bias reinforcement, echo chambers or political agenda setting. While knowledge graphs promise to provide the key to a Web of structured information, they are mainly focused on facts without keeping track of the diversity, connection or temporal evolution of online discourse data. As opposed to facts, claims are inherently more complex. Their interpretation strongly depends on the context and a variety of intentional or unintended meanings, where terminology and conceptual understandings strongly diverge across communities from computational social science, to argumentation mining, fact-checking, or viewpoint/stance detection. The 1st International Workshop on Knowledge Graphs for Online Discourse Analysis (KnOD 2021) aims at strengthening the relations between these communities, providing a forum for shared works on the modeling, extraction and analysis of discourse on the Web. It addresses the need for a shared understanding and structured knowledge about discourse data in order to enable machine-interpretation, discoverability and reuse, in support of scientific or journalistic studies into the analysis of societal debates on the Web.

KEYWORDS
Online Discourse Analysis, Knowledge Graphs, Social Web Mining, Computational Fact-checking, Mis-/Dis-information Spread, Stance/Viewpoint Detection

BEYOND FACTS: A CROSS-DISCIPLINARY COMMUNITY
With the Web evolving into a ubiquitous platform giving everyone the opportunity to publish content, express opinions and interact with others, understanding online discourse has become an increasingly important issue. We define online discourse as any kind of narrative, debate or conversation that happens on the Web, including social networks or news media, involving claims and stances on controversial topics, their sources and contexts (such as related events or entities).

Recently, a wide range of interdisciplinary research directions has been explored, involving a variety of scientific disciplines. Such works either focus on gaining new scientific insights, for instance, by investigating the spreading pattern of false claims on Twitter [11], or they aim at computational methods, for instance, pipelines for detecting the stance of claim-relevant Web documents [12], understanding/quantifying hidden biases [7], approaches for classifying sources of news, such as Web pages, pay-level domains, users or posts [8], or research into fake news detection [9] and automatic fact-checking [1].

One crucial requirement to facilitate the aforementioned research areas is the availability of reliable structured knowledge about key notions such as claims, truth ratings, evidence, sources, arguments and their relations. On the one hand, initiatives such as the schema.org ClaimReview vocabulary¹ aim at encouraging website providers to offer such data through embedded Web markup. On the other hand, initial knowledge graphs (KGs) such as MultiFC [2], ClaimsKG [10]², TweetsCOV19 [5]³ or TweetsKB [6]⁴ have been proposed to consolidate Web-mined data about the aforementioned notions. While KGs promise to provide the key to a Web of structured information, they are mainly focused on facts without keeping track of the diversity, connection or temporal evolution of online discourse. As opposed to facts, claims are inherently more complex. Their interpretation strongly depends on the context and a variety of intentional or unintended meanings, where terminology and conceptual understandings strongly diverge across communities from computational social science, to argumentation mining, fact-checking, or viewpoint/stance detection [3, 4].

Initial efforts have been made to gather communities working in those areas, for instance through dedicated challenges, such as the Fake News Challenge,⁵ or sessions at major conferences, such as the Journalism, Misinformation and Fact Checking track at The Web Conf 2018.⁶ The KnOD Workshop brings together the various disciplines involved in or benefitting from (a) approaches for representing online discourse and involved notions, (b) methods for mining such notions (for instance, claims, stances, sources, etc.) and their relations from the Web, and (c) inter-disciplinary research investigating online discourse.

Beyond research into information and knowledge extraction, and data modeling and consolidation for KG building, the workshop targets communities focusing on the analysis of online discourse, relying on methods from Machine Learning (ML), Natural Language Processing (NLP) and Data Mining (DM). These include communities on:
• discourse analysis
• social web mining
• argumentation mining
• computational fact-checking
• mis- and dis-information spread
• bias and controversy detection and analysis
• stance/viewpoint detection and representation
• opinion mining
• rumour, propaganda and hate-speech detection
• computational social science

Hence, KnOD provides a meeting point for these related but distinct communities that address similar or closely related questions from different perspectives and in different fields, using different models and definitions of the main notions of interest. The workshop aims at strengthening the relations between these communities, providing a forum for shared works on the modeling, extraction and analysis of discourse on the Web. It addresses the need for a shared understanding and structured knowledge about discourse data in order to enable machine-interpretation, discoverability and reuse, in support of scientific or journalistic studies into the analysis of societal debates on the Web. Often the aforementioned communities apply their research in particular domains, such as scientific publishing, medicine, journalism or social science. Therefore, the workshop is particularly interested in works that apply an interdisciplinary approach, such as works on computational social sciences or computational journalism.

WORKSHOP OVERVIEW
The KnOD 2021 workshop⁷ took place as a virtual event (due to the COVID-19 outbreak) jointly with the 30th edition of The Web Conference (WWW 2021)⁸, as it closely relates to the topics of the venue in terms of the nature of the analysed data and the targeted communities. In particular, it complements and bridges a number of research tracks of the conference, such as "Semantics and Knowledge", "Web and Society", "Web Mining and Content Analysis" and in part "Social Network Analysis and Graph Algorithms". KnOD also fits into and continues a line of former WebConf forums such as the Fact Checking track in 2018 or the workshops (and a special track in 2019) on Data Science for Social Good.

This first edition of the KnOD workshop brought together a diverse community of researchers from different fields such as argument mining, knowledge graphs and neural language models or databases, but also social and political science. Seven papers were accepted for publication (2 long papers and 5 short ones) after a peer-review process,⁹ spanning a palette of topics such as claim detection, relation extraction for online discourse modeling, interpretable graph embeddings for misinformation detection, disinformation on social networks, fact-checking in relation to argumentation schemes and false narratives, and political and social scientific perspectives on propaganda chains and discourse mapping. The current volume contains the seven accepted papers.

In addition, we were very happy to host three excellent keynotes: Preslav Nakov talked about detecting ‘Fake News’ before it was even written, media literacy and flattening the curve of the COVID-19 infodemic; Daniel Schwabe proposed his take on trust and information disorders seen as disputes of narratives; and Ioana Manolescu gave an overview of, and lessons learned from, the ANR ContentCheck project, focusing on content management approaches for assisting journalists in their day-to-day fact-checking efforts.

We consider the first edition of the workshop a successful first step towards fostering a community on discourse analysis via structured knowledge in the context of the Web. We would like to warmly thank all authors and keynote speakers for their contributions, participation and exciting discussions during the workshop day. We also thank the members of the programme committee (see Appendix) for their constructive reviews, as well as the WebConf 2021 organizers and workshop chairs for their cooperation and support.

¹ http://schema.org/ClaimReview
² https://data.gesis.org/claimskg/site
³ https://data.gesis.org/tweetscov19/
⁴ https://data.gesis.org/tweetskb/
⁵ http://www.fakenewschallenge.org/
⁶ https://www2018.thewebconf.org/call-for-papers/misinformation-cfp/
⁷ https://knod2021.wordpress.com/
⁸ https://www2021.thewebconf.org/
⁹ Each submitted paper was reviewed by 3 programme committee members.

KnOD'21 Workshop - April 14, 2021. Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

REFERENCES
[1] Pepa Atanasova, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Georgi Karadzhov, Tsvetomila Mihaylova, Mitra Mohtarami, and James Glass. 2019. Automatic fact-checking using context and discourse information. Journal of Data and Information Quality (JDIQ) 11, 3 (2019), 1–27.
[2] Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen, and Jakob Grue Simonsen. 2019. MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims. In Proceedings of the 2019 EMNLP and the 9th IJCNLP. 4677–4691.
[3] Katarina Boland, Pavlos Fafalios, Andon Tchechmedjiev, Stefan Dietze, and Konstantin Todorov. 2021. Beyond Facts–A Survey and Conceptualisation of Claims in Online Discourse Analysis. Under review for the Semantic Web Journal (2021).
[4] Katarina Boland, Pavlos Fafalios, Andon Tchechmedjiev, Konstantin Todorov, and Stefan Dietze. 2019. Modeling and contextualizing claims. In 2nd International Workshop on Contextualized Knowledge Graphs (CKG 2019 @ISWC).
[5] Dimitar Dimitrov, Erdal Baran, Pavlos Fafalios, Ran Yu, Xiaofei Zhu, Matthäus Zloch, and Stefan Dietze. 2020. TweetsCOV19-A Knowledge Base of Semantically Annotated Tweets about the COVID-19 Pandemic. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 2991–2998.
[6] Pavlos Fafalios, Vasileios Iosifidis, Eirini Ntoutsi, and Stefan Dietze. 2018. TweetsKB: A public and large-scale RDF corpus of annotated tweets. In European Semantic Web Conference. Springer, 177–190.
[7] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. 2018. Quantifying controversy on social media. ACM Trans. on Soc. Comp. 1, 1 (2018), 3.
[8] Kashyap Popat, Subhabrata Mukherjee, Jannik Strötgen, and Gerhard Weikum. 2017. Where the truth lies: Explaining the credibility of emerging claims on the web and social media. In Proceedings of the 26th International Conference on World Wide Web Companion. 1003–1012.
[9] Kai Shu, Amy Sliva, Suhang Wang, Jiliang Tang, and Huan Liu. 2017. Fake news detection on social media: A data mining perspective. ACM SIGKDD explorations newsletter 19, 1 (2017), 22–36.
[10] Andon Tchechmedjiev, Pavlos Fafalios, Katarina Boland, Malo Gasquet, Matthäus Zloch, Benjamin Zapilko, Stefan Dietze, and Konstantin Todorov. 2019. ClaimsKG: A knowledge graph of fact-checked claims. In International Semantic Web Conference. Springer, 309–324.
[11] Soroush Vosoughi, Deb Roy, and Sinan Aral. 2018. The spread of true and false news online. Science 359, 6380 (2018), 1146–1151.
[12] Xuezhi Wang, Cong Yu, Simon Baumgartner, and Flip Korn. 2018. Relevant Document Discovery for Fact-Checking Articles. In WWW. 525–533.

APPENDIX
Programme Committee of KnOD 2021
• Harith Alani, KMI, The Open University, UK
• Katarina Boland, GESIS, Germany
• Sandra Bringay, Paul Valéry University of Montpellier, France
• Gianluca Demartini, University of Queensland, Australia
• Ronald Denaux, Expert.AI, Spain
• Vasilis Efthymiou, FORTH, Greece
• Michael Färber, Karlsruhe Institute of Technology, Germany
• Jose Manuel Gomez-Perez, Expert.AI, Spain
• Daniel Hardt, Copenhagen Business School, Denmark
• Ioana Manolescu, INRIA Saclay and LIX/Ecole Polytechnique, France
• Preslav Nakov, Qatar Computing Research Institute, Qatar
• Panagiotis Papadakos, FORTH, Greece
• Rajesh Piryani, South Asian University, New Delhi, India
• Daniel Schwabe, Pontifícia Universidade Católica do Rio de Janeiro, Brazil
• Kostas Stefanidis, Tampere University, Finland
• Pedro Szekely, University of Southern California, USA
• Andon Tchechmedjiev, Ecoles des Mines d’Alès, France
• Yannis Tzitzikas, FORTH and University of Crete, Greece
• Xiaofei Zhu, Chongqing University of Technology, China