=Paper=
{{Paper
|id=Vol-3052/paper11
|storemode=property
|title=PROVENANCE: An Intermediary-Free Solution for Digital Content Verification
|pdfUrl=https://ceur-ws.org/Vol-3052/paper11.pdf
|volume=Vol-3052
|authors=Bilal Yousuf,,M. Atif Qureshi,,Brendan Spillane,,Gary Munnelly,,Oisin Carroll,,Matthew Runswick,,Kirsty Park,,Eileen Culloty,,Owen Conlan,,Jane Suiter
|dblpUrl=https://dblp.org/rec/conf/cikm/YousufQSMCRPCS21
}}
==PROVENANCE: An Intermediary-Free Solution for Digital Content Verification==
PROVENANCE: An Intermediary-Free Solution for Digital Content Verification

Bilal Yousuf (1,2), M. Atif Qureshi (1,2), Brendan Spillane (1), Gary Munnelly (1), Oisin Carroll (1), Matthew Runswick (1), Kirsty Park (3), Eileen Culloty (3), Owen Conlan (1) and Jane Suiter (3)

(1) ADAPT Centre, Trinity College Dublin
(2) ADAPT Centre, Technological University Dublin
(3) Institute for Future Media, Democracy and Society, Dublin City University

Fourth Workshop On Knowledge-Driven Analytics And Systems Impacting Human Quality Of Life (KDAH-CIKM-2021), November 01–05, 2021, Gold Coast, Queensland, Australia. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Contact: bilal.yousuf@adaptcentre.ie (B. Yousuf); muhammad.qureshi@adaptcentre.ie (M. A. Qureshi); brendan.spillane@adaptcentre.ie (B. Spillane); gary.munnelly@adaptcentre.ie (G. Munnelly); oisin.carroll@adaptcentre.ie (O. Carroll); matthew.runswick@adaptcentre.ie (M. Runswick); kirsty.park@dcu.ie (K. Park); eileen.culloty@dcu.ie (E. Culloty); owen.conlan@scss.tcd.ie (O. Conlan); jane.suiter@dcu.ie (J. Suiter)

ORCID: 0000-0001-6024-9084 (B. Yousuf); 0000-0003-4413-4476 (M. A. Qureshi); 0000-0001-5893-1340 (B. Spillane); 0000-0002-7757-6142 (G. Munnelly); 0000-0001-9398-9388 (O. Carroll); 0000-0002-0848-931X (M. Runswick); 0000-0001-7960-8462 (E. Culloty); 0000-0002-9054-9747 (O. Conlan); 0000-0002-2747-8069 (J. Suiter)

Abstract: The threat posed by misinformation and disinformation is one of the defining challenges of the 21st century. Provenance is designed to help combat this threat by warning users when the content they are looking at may be misinformation or disinformation. It is also designed to improve media literacy among its users and ultimately reduce susceptibility to the threat among vulnerable groups within society. The Provenance browser plugin checks the content that users see on the Internet and social media and provides warnings in their browser or social media feed. Unlike similar plugins, which require human experts to provide evaluations and can only provide simple binary warnings, Provenance's state of the art technology does not require human input; it analyses seven aspects of the content users see and provides warnings where necessary.

Keywords: Misinformation, Disinformation, Fake News, Social Media, Plugin, Browser Extension

1. Introduction

Provenance is an intermediary-free solution for digital content verification to combat misinformation and disinformation on the Internet and social media. As per [1], it is designed to aid users by providing them with warning notifications in their browser or social media feed when viewing content that may be dangerous or problematic. The detailed warning notifications inform users which of the seven criteria Provenance's state of the art technology has detected an issue with, and why. It significantly improves upon all known similar solutions in two ways. Firstly, existing solutions do not analyse the content the user is viewing and are thus limited to providing users with warnings based on the news agency's historical publication record and behaviour. Secondly, existing browser plugins only provide a single broad-spectrum warning about the content users are viewing, whereas Provenance is capable of evaluating content under seven criteria and providing individual warnings for each. Provenance's warning notifications are also educational and designed to inspire users to be more cautious and critical of the information they consume. Thus, it will improve media literacy among users and make them less susceptible to the influence of misinformation and disinformation by making them more critical and reflective of the content they consume.

There are significant research challenges in the design and development of Provenance. The main challenges include the huge volume of news and other content published each day, the combination of multimedia formats in each article or story, the high churn-rate and short shelf-life of news, and the fact that news content is often republished from wire services or from other publishers. These are compounded by the fact that misinformation and disinformation are often designed to masquerade as real news. Many disinformation sources share characteristics with the Lernaean Hydra of Greek mythology and re-post problematic content through multiple easy to set up websites or social media groups, reappearing under different guises when they are identified and shut down.

There are also a range of individual challenges within components of the Provenance platform. These include deriving a system to assign accurate writing quality scores for each piece of textual content, detecting when new facts introduced in a news article are indicative of disinformation or an evolution in an unfolding story, detecting image and video manipulations, and developing a system that can differentiate between anger and fear in disinformation and anger and fear in opinion news articles. There is also some difficulty in differentiating between news articles from alternative and independent agencies and news articles from disinformation sources, due to often lower quality writing, more emotive content, and the reuse of images and videos.

This paper provides an update on the ongoing progress of developing Provenance. The remainder of this paper is organised as follows. Section 2 Motivation and Background delves into the impetus for this project and situates it within other recent EU disinformation projects. Section 3 Related Work provides a detailed overview of similar browser plugins and describes how Provenance advances the state of the art. Section 4 Architecture Overview contains system architecture diagrams and descriptions of each component in the Provenance platform. Section 5 Provenance in Action provides a detailed explanation of how the Provenance browser plugin provides warnings to the user. Section 6 Use Cases presents two use cases for the Provenance plugin to show in what scenarios we envision it being used. Section 7 Evaluation briefly describes plans to evaluate the tool. Finally, Section 8 Conclusions completes the paper with closing remarks.

2. Motivation and Background

The proliferation of misinformation and disinformation on social media has been described as a strategic threat to democracy and society in the European Union (EU) [2, 3]. A recent EU study on the issue found that the common narratives of society "are being splintered by filter bubbles, and further ruined by micro-targeting" [4]. The report points out that, like a virus, misinformation and disinformation spread throughout society through social media and other platforms in open and closed groups, to the detriment of democratic systems. This occurs when "Susceptible users become weaponized as instruments for disseminating disinformation and propaganda" [4].

The Presidents of the European Council, Commission and Parliament have all made increasingly public calls for concerted efforts to do more to combat the scourge of fake news to protect democracy. The President of the European Parliament has been the most forthright in this with a recent announcement that: "We must nurture our democracy & defend our institutions against the corrosive power of hate speech, disinformation, fake news & incitement to violence." [5]. As a result, the EU have funded a range of FP7, H2020 and other projects to combat misinformation and disinformation including WeVerify [6, 7], SocialTruth [8], PHEME [9, 10], EUNOMIA [11], Fandango [12, 13] and the European Digital Media Observatory (EDMO) [14]. Many other international organisations have also identified misinformation and disinformation as a threat and have increased efforts to combat it. These include the United Nations through its Verified platform [15] and the World Health Organisation [16]. More can be read about these initiatives in the Poynter Institute's guide to national and international efforts to combat misinformation and disinformation around the world [17].

Provenance is a H2020 project (https://cordis.europa.eu/project/id/825227); however, it differs from many of the above as it is a user orientated, intermediary-free solution to help consumers identify misinformation and disinformation as they browse the Internet and social media. It is also designed to improve media literacy skills by equipping consumers with the tools, knowledge and know-how to face this challenge now and into the future.

3. Related Work

This review of related work focuses on comparable browser plugins designed to provide users with warning notifications about disinformation or other problematic content, and which are currently active or maintained. The purpose of this review is to establish how Provenance advances the state of the art.
NewsGuard [18] provides 'nutrition' labels for news websites based on nine journalistic criteria. What differentiates it from many of the other fake news and bias detection browser plugins is that it does not use automated algorithms to assess news websites but rather relies on a team of journalists to conduct reviews. It comes as standard with Microsoft Edge, but a subscription is needed for other Internet browsers. Its notification icons appear as a browser extension in the upper right corner and within third party search engines and social media platforms. Clicking on its browser icon opens a nutrition label pane where users can quickly see whether the news website passes or fails any of the nine criteria. A link is also available for users to see a more detailed report. Visually, NewsGuard employs simple but effective iconography, a white check on a green shield and a red x, to denote when a website has passed or failed. NewsGuard's transparent methodology has resulted in their datasets being used for research [19]. While expert led analysis has its merits, it also has issues with scalability, personal biases, and response times. Aker also maintains that much of the credibility and transparency scoring provided by NewsGuard could be automated [20].

Décodex [21], created by Le Monde, originally started as an online search facility for users to check URLs against a list of known websites which spread misinformation and disinformation. They have since released a Facebook bot for users to directly chat to, and a browser plugin that provides red, orange or blue notifications to denote whether a website regularly disseminates false information, whether its reliability is doubtful, or whether it is a parody website. When installed, the Décodex icon becomes active when the website being viewed is listed in their database. It also produces a colour-coded popup with one of three standard warnings. Users cannot access detailed information about warnings, nor does it appear to be integrated with well-known search engines, social media platforms or discussion boards. Décodex's allow/deny list approach means that scalability is difficult, and the warnings it provides are based on the historical publication record of the website, not the content currently being viewed. Transparency is also limited. While still available, its development appears to be in stasis.

Media Bias Fact Check (MBFC) (https://mediabiasfactcheck.com/) [22] is an extensive media bias resource curated by a small team of journalists and lay researchers who have undertaken detailed assessments of over 4000 media outlets. A transparent assessment methodology means that their datasets have been used for several research projects [23, 20]. Their team of researchers undertake in-depth analyses of news organisations and assess them using a standardised methodology, with some subjective judgement, to calculate a left/right bias score using their published formula.
They also calculate scores for factual reporting and credibility. These reports are published on their website and updated from time to time. Each news website in their database is categorised as: left bias, left-centre bias, least biased, right-centre bias, right bias, pro-science, conspiracy-pseudoscience, fake news, or satire. While their browser extension conveys limited details, further information about each news source is available on their website. The extension draws on this dataset to inform users, when they click on the notification icon, which of these nine categories the news website they are viewing belongs to, including a brief explanation of the category. It also provides a link to the detailed MBFC report. The browser extension also provides Facebook and Twitter support by displaying a visual left/right bias scale on news articles that appear in users' feeds, with links to the MBFC detailed report and Factual Search (https://factualsearch.news) so that the user can investigate the topic further. While a valuable resource with considerable detail, MBFC's expert evaluations are based on the historical publication record of the news website and not an evaluation of the content the user is looking at. It is also a labour intensive and time consuming process.

Stopaganda Plus (https://browserextension.dev/blog/stopagandaplus-helps-understanding-media-biases/) [24] is a browser extension that adds accuracy and bias decals to Facebook, Twitter, Reddit, DuckDuckGo and Google. These visual indicators extend the functionality of MBFC (who determine the scores) to these common information portals so that users may more easily choose high-quality information resources. It should be noted that this extension is not designed to provide users with detailed warning notifications when viewing a news website and thus is not directly comparable to the other systems or Provenance. It is included here due to its use of MBFC, the fact that it conveys limited visual information/warnings before the user visits an information source, and for completeness.

3.1. No Longer Active

Many other projects and services related to this work, which have been reviewed in the literature, c.f. [25, 26, 27, 11, 28, 29, 30], now no longer appear to be active or working. This is concerning: despite the fact that misinformation and disinformation have been recognised as a threat to democracy and social cohesion, and the fact that browser plugins are one of the few citizen-orientated direct interventions which can help solve the problem at source while increasing long term media literacy, very few of the proposed solutions have been actively promoted or maintained. The main reason for this appears to be that many of these plugins were developed by individuals or small teams, or even as part of a hackathon, and thus lacked the resources to be actively maintained or updated to deal with changing technology, such as browser updates, or the rapidly evolving threats posed by misinformation and disinformation. The following presents those related projects found in the literature which now no longer appear to be actively maintained, though some are still available to install. URLs have been included for posterity where possible, as many do not have peer-reviewed publications.

B.S. Detector (https://www.producthunt.com/posts/b-s-detector) relied on matching the URLs of content in the news feed to a known allow/deny list of sources of fake news and misinformation.

AreYouFakeNews.com (https://github.com/N2ITN/are-you-fake-news) utilised Natural Language Processing (NLP) and deep learning to identify patterns of bias on websites.

Fake News Detector AI (https://www.fakenewsai.com/) claimed to use a neural network to detect similarity between submitted URLs and known fake news websites.

Fake News Detector (https://fakenewsdetector.org/) was designed to learn from webpages flagged by users to detect other similar fake news webpages.

Trusted News (https://trusted-news.com/) is a browser plugin that was designed to assess the objectivity of news articles. Its functionality was limited to 'long form' news articles and it does not work with social media content.

Fake News Guard (http://fakenewsguard.com/) claimed to combine linguistic and network analysis techniques to identify fake news; however, this can no longer be verified.

FiB (https://projectfib.azurewebsites.net/) was a browser extension built in a hackathon which was reviewed several times in the literature as a comparable system [31].

TrustedNews (https://trusted-news.com/) used AI to help users evaluate news articles by scoring their objectivity [32]. However, it does not work on social media and has issues with analysing webpages that require scrolling.

Trusty Tweet [26] was designed to help users deal with fake news tweets and to increase media literacy. Its transparent approach is designed to prevent reactance and increase trust. Early user evaluations showed promise.

Check-It [33] was designed to analyse a range of signals to identify fake news. It was focused on user privacy, with computation undertaken locally. Its approach used a combination of linguistic models, fact checking, and website and social media user allow/deny lists.
3.2. Out of Scope Approaches

Some misinformation and disinformation detection tools which have been reviewed in other papers have not been included in this literature review. This is because they are not a browser plugin or they are a paid-for b2b service (Fakebox [34]; AreYouFakeNews [35]), they are focused on an aligned but separate issue, e.g., detection of bias or detection of reused and/or manipulated images (Ground.News [36]; SurfSafe [37]), they are specifically for fact checking (BRENDA [38], CredEye [39]), they have pivoted into a B2B platform (FightHoax [40]), they are not user orientated (Credible News [41, 42]), or they are research systems and have not been made available to the public [30, 43]. While relevant to combating disinformation, these are not directly comparable to Provenance.

3.3. Advancing the State of the Art

This review demonstrates that browser plugins are a common user-orientated approach to combat misinformation and disinformation. However, Provenance adopts a significantly more advanced and granular methodology than current or previous efforts in the domain. The warnings provided by earlier plugins are often based on the news website's history of publishing misinformation and disinformation. Thus, they are limited to providing a coarse-grained retrospective analysis of the news website's publication history. In contrast, Provenance's fine-grained approach is designed to analyse the content of the news webpage or users' social media feeds and, where necessary, provide an easy to understand warning to the user when the content they are viewing may be problematic or symptomatic of disinformation. In the cases where linguistic analysis or other machine learning approaches have been utilized, the results are not presented to the user in an explainable or transparent way. Some of these methods have also proven susceptible to adversarial attacks, whereby text may be augmented slightly to fool pretrained models [44, 45].

Two factors differentiating Provenance from the plugins described above are their limited reach and scalability. Many of the above plugins do not provide any information for some heavily trafficked news websites such as the LA Times, Al Jazeera, and the Independent.co.uk. This is likely due to the limiting factors of time and labour of including humans in the disinformation judgement process. While no one doubts the benefits of highly trained expert judgement, the size and nature of the rapidly evolving media landscape, especially in regard to misinformation and disinformation, in which publishers are prone to rapid growth, failure and re-branding, means that providing human ratings is a never ending game of whack-a-mole. Current solutions are only partially succeeding in providing judgements of some news agencies. None have attempted to analyse the millions of pieces of content they publish daily. Unlike each of the plugins described above, Provenance does not require a human-in-the-loop, nor does it need to be backed by human-generated allow/deny lists. Its architecture supports fully automated and intermediary-free analysis of news content.

The ability to evaluate news articles against seven criteria and provide users with visual notifications and deeper explanations is also a significant advancement on the state of the art and a direct benefit to users in three ways. First, and most importantly, users will be made aware of individual issues with the content they are consuming and can thus decide whether they will continue viewing it or look for alternative sources. Second, it will help develop users' media literacy skills by making them aware of the different caution-worthy indicators and how to check them, making them less susceptible to misinformation and disinformation in the future. Third, the closed nature of the above systems means that they cannot be properly examined. In contrast, a full description of Provenance's system architecture is provided below. It is also currently undergoing evaluation and testing and the results will be published in time.

4. Architecture Overview

The system architecture for Provenance is shown in Figure 1. The components and services use REST APIs serving JSON for easy, reliable, and fast data exchanges across internal subsystems.

Figure 1: Provenance System Architecture. Dashed lines denote REST API calls, solid lines denote local access.

Data in the form of webpages or social media content is ingested by Provenance either through the Social Network Monitor or by a Trusted Content Analyst (e.g., a journalist or fact checker). The Social Network Monitor service discovers content using NewsWhip's (https://www.newswhip.com) social network monitoring platform. The introduced asset is enriched with social engagement data (e.g., likes and shares) and is forwarded to the Asset Workflow Handler service. The Asset Workflow Handler separates the incoming data (e.g., a news webpage) into individual assets such as images, video, text, etc. These assets are registered with the Asset Fingerprinter before being disseminated to the analytical components (Video/Image Reverse Searcher, Video/Image Manipulation Detector, Text Similarity Detector, Text Tone Detector, and Writing Quality Detector) to determine if they exhibit any features which normally characterise misleading, questionable, or unsubstantiated information. The output of each analytical service, and the initial data passed from the Social Network Monitor, are combined and sent to the Knowledge Graph where they are stored.

The Knowledge Graph may be queried by the Provenance Query Service to retrieve the results of analysis for a given webpage. The Provenance plugin, installed in the user's browser, leverages this query service to retrieve information about webpages that a user is currently viewing. If the webpage has been analysed by Provenance, and exhibits questionable features, the plugin will issue a warning to the user, indicating that they may want to further investigate the claims made in the article's content. The Personalised Companion Service is used to determine how this information should be presented for an individual user.
4.1. Key Components

4.1.1. Social Network Monitor

The Social Network Monitor communicates with NewsWhip's Social Network API to identify assets which should be ingested by Provenance. Finding assets involves querying NewsWhip's API with a parameterized search request. The call to NewsWhip's Social Network API is automatically invoked periodically to maintain an updated record of trending news articles and social media posts. Assets detected by NewsWhip are enriched through social scoring. The URL, titles, summaries, images and videos (if any), along with the enrichment data, are extracted from the article and provided to Provenance. Assets composed only of text, for example, are registered in fragments consisting of the news feed/article title, the summary, and user engagement data.

4.1.2. Asset Registration

A dedicated Asset Registration web interface also allows Trusted Content Analysts to add assets into the Asset Workflow Handler. Trusted Content Analysts are stakeholders such as journalists and other representatives of news agencies and wire services, fact checkers, debunkers, and original content creators who may want to register their multimedia content assets. In future, this facility will be made more widely available to allow the general public to send content directly to Provenance. It may also be integrated with news publication platforms and content management systems so that content is automatically added. The primary task of this component is to enable third parties to register assets that have not been discovered by the Social Network Monitor.

4.1.3. Asset Workflow Handler

The Asset Workflow Handler is the component of the Provenance Verification Layer that is responsible for orchestrating the components and data within the layer. This component's primary task is to distribute assets to different components for further processing. It invokes the service interfaces and handles the data flow between the services. By utilising the Asset Workflow Handler, components are loosely coupled, thus mitigating direct component-to-component communications. This enables Provenance to work with the variety of APIs exposed by the existing tools/components. Moreover, the APIs can be adjusted to meet Provenance's specific needs. Due to this modular design, new components can be easily added to the Provenance Verification Layer (e.g., detection of bias [46], tabloidization [47], and hate speech [48]) and connected to the Asset Workflow Handler.
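The loose coupling described for the Asset Workflow Handler can be sketched as a dispatcher that fans each asset out to whichever analyzers are registered for its media type. The class and interfaces below are illustrative assumptions, not the actual system (which exchanges JSON over REST).

```python
from typing import Callable, Dict, List

# An analyzer takes an asset (as a dict) and returns an analysis result.
AnalyzerFn = Callable[[dict], dict]

class AssetWorkflowHandler:
    """Routes each asset to the analyzers registered for its media type,
    so components never call each other directly (loose coupling)."""

    def __init__(self) -> None:
        self._analyzers: Dict[str, List[AnalyzerFn]] = {}

    def register(self, asset_type: str, analyzer: AnalyzerFn) -> None:
        # New components plug in here without touching existing ones.
        self._analyzers.setdefault(asset_type, []).append(analyzer)

    def dispatch(self, asset: dict) -> List[dict]:
        # Fan the asset out and collect results destined for the Knowledge Graph.
        return [fn(asset) for fn in self._analyzers.get(asset["type"], [])]

handler = AssetWorkflowHandler()
handler.register("text", lambda a: {"service": "TextToneDetector", "asset": a["id"]})
handler.register("text", lambda a: {"service": "WritingQualityDetector", "asset": a["id"]})
results = handler.dispatch({"id": "asset-1", "type": "text"})
print([r["service"] for r in results])  # ['TextToneDetector', 'WritingQualityDetector']
```

Adding, say, a hate-speech detector would be one further `register` call, which mirrors the modularity claim made above.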
4.1.4. Video/Image Reverse Searcher

The Video/Image Reverse Searcher is a key component for creating a large-scale annotated dataset for detecting manipulated visual content. The dataset consists of three distinct parts. The first part includes 45,000 images, each captured by a unique device (i.e., 45,000 different cameras have been used). Half of these images are real, and the other half have been digitally manipulated by applying a random image processing operation to a local area of the image. Since the sensor pattern noise present in images is unique to each sensor (i.e., camera), this dataset introduces large diversity, such as noise. The second part of the dataset uses imaging software in cameras to introduce a large diversity of artefacts in images. Commonly available camera brands and models were identified and used to collect a dataset of 50,000 images. Half of these images were digitally manipulated using an advanced image editing method based on Generative Adversarial Networks (GAN) [49]. Finally, the third part of the dataset consists of 2,000 images downloaded from the Internet representing "real-life" (uncontrolled) manipulated images created by random people. For all of the manipulated samples collected for the third part of the data, the matching unmanipulated image was also collected. This component's primary task is to enable a reverse-search operation for videos and images.

4.1.5. Video/Image Manipulation Detector

The Provenance Video/Image Manipulation Detector identifies if an image or video has been manipulated in comparison to its source. This work is based on the PIZZARO project (http://zoi.utia.cas.cz/node/180/0459504). It utilises recent developments achieved by deep learning-based methods to enable instant detection of manipulations in visual content. In addition, use of the latest technologies based on Convolutional Networks will lead to tangible enhancements in integrity verification of visual content. The Video/Image Manipulation Detector increases trust and improves governance. The solution is designed to build a web-based system to assess visual content in a real-world setting. It will further support the development of user skills in detecting false visual information themselves by providing world-class image forensic technology. It has a special focus on developing a solution that will be intuitive and easy to understand and interpret for end-users, thereby increasing its uptake by the public and its impact on the information system. This component's primary task is to detect if an image or video has been manipulated, by comparing it with previously registered images and videos in the system.

4.1.6. Asset Fingerprinter and Asset Registry

The Asset Fingerprinter and Asset Registry provide traceability of registered content. They are based on Blockchain technology, making content immutable and enabling the verification of the sources of, and alterations to, the content. Registered assets are handed to the Asset Fingerprinter via the Asset Workflow Handler. Due to the General Data Protection Regulation (GDPR) and the size of some assets, the hash of the data is stored on the Blockchain. Azure Storage is used as the Blockchain, and the assets themselves, including large files, are stored using an off-line storage service available to store multimedia files. Blockchain is used due to its innate data integrity, which is important to prove the traceability of registered content if the tool was ever targeted as part of a combined disinformation and hacking campaign. This component's primary task is the traceability of registered content via Blockchain.
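The fingerprinting idea, storing only a hash of each asset so that later alterations can be detected, can be illustrated in a few lines. The in-memory registry below is purely a sketch; the production system records the hashes on a Blockchain backed by Azure Storage.

```python
import hashlib

class AssetRegistry:
    """Minimal illustration: keep only the SHA-256 digest of each asset
    (the content itself stays in separate storage, as GDPR and asset size
    require), and verify later copies against that digest."""

    def __init__(self) -> None:
        self._hashes: dict[str, str] = {}

    def fingerprint(self, asset_id: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self._hashes[asset_id] = digest
        return digest

    def verify(self, asset_id: str, content: bytes) -> bool:
        """True only if the content matches the originally registered asset."""
        return self._hashes.get(asset_id) == hashlib.sha256(content).hexdigest()

registry = AssetRegistry()
registry.fingerprint("article-42", b"original text")
print(registry.verify("article-42", b"original text"))  # True
print(registry.verify("article-42", b"altered text"))   # False
```

Because only digests are stored, even a one-byte alteration to a registered asset is detectable without retaining the asset itself.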
4.1.7. Text Similarity Detector

News is regularly republished nationally and locally from international wire services such as Reuters, Agence France-Presse (AFP) and Associated Press (AP). In a bid to lower costs, many news agencies who are not in competition negotiate deals to republish each other's content. Similarly, less trustworthy news outlets often put 'spins' on existing articles, where correct articles are modified to contain false information.

To combat this, the Text Similarity Detector in Provenance attempts to verify the textual content of an article by comparing it to similar articles published elsewhere. A backlog of trustworthy articles is stored in an Elasticsearch database with a BM25 similarity index [50]. As BM25 under-performs with very long documents [51], only the title and first 10 sentences are used in the index. Once similar articles have been found, the component searches the similar articles for the facts given in the query article. Facts in an article are found by taking sentences with a low subjectivity from TextBlob's sentiment analysis model [52]. The similarity of two facts is the cosine similarity of the vector embeddings of both, which is provided by Google's multilingual text model [53]. If enough of the article's factual content cannot be verified, the plugin displays a warning.

4.1.8. Text Tone Detector

Intuitively, one would expect that impartial news sources would use impartial, unemotive language to convey the facts of a story.
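The fact-matching step of the Text Similarity Detector can be sketched as follows. The real system embeds sentences with Google's multilingual text model; the bag-of-words vectors and the 0.6 threshold here are stand-in assumptions so that the cosine-similarity logic is self-contained.

```python
import math
from collections import Counter

def embed(sentence: str) -> Counter:
    # Stand-in embedding: token counts. The production system would use a
    # learned multilingual sentence embedding instead.
    return Counter(sentence.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fact_verified(fact: str, candidates: list[str], threshold: float = 0.6) -> bool:
    """A fact counts as verified if some sentence from a similar trusted
    article is close enough to it in embedding space."""
    vec = embed(fact)
    return any(cosine_similarity(vec, embed(c)) >= threshold for c in candidates)

print(fact_verified("the summit begins on monday",
                    ["the summit begins on monday in geneva"]))  # True
```

In the full pipeline, the candidate sentences come from the BM25-retrieved articles, and a warning is shown when too few of the query article's low-subjectivity sentences pass this check.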
4.1.8. Text Tone Detector

Intuitively, one would expect that impartial news sources would use impartial, unemotive language to convey the facts of a story. Recent research has shown that emotions such as fear, anger, sadness, doubt, and the absence of joy and happiness are indicative of misinformation and disinformation [54, 55, 56]. Provenance’s Text Tone Detector is designed to identify emotions in text which may indicate that the news source is unreliable. Threshold values are used to determine whether caution should be shown, and the degree of caution is determined by how far the calculated value deviates from the threshold value.
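The thresholding logic described above can be sketched as below. The emotion names and threshold values are hypothetical; the paper does not publish the detector’s actual thresholds or scoring model.

```python
def tone_caution(emotion_scores, thresholds):
    """Compare per-emotion scores against their caution thresholds.
    Returns None when no threshold is exceeded; otherwise returns the
    most-exceeded emotion and its deviation, which would determine the
    degree of caution shown to the user."""
    deviations = {
        emotion: score - thresholds[emotion]
        for emotion, score in emotion_scores.items()
        if emotion in thresholds and score > thresholds[emotion]
    }
    if not deviations:
        return None  # no caution shown
    worst = max(deviations, key=deviations.get)
    return worst, deviations[worst]
```

For example, with an assumed anger threshold of 0.4, an anger score of 0.7 exceeds it by 0.3, so a stronger caution would be displayed than for a marginal deviation.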
4.1.9. Writing Quality Detector

Provenance’s Writing Quality Detector computes a writing quality score (WQS) for the textual content the user is viewing and provides a warning when it falls below a threshold value. Writing quality is closely related to cohesion and coherence [57]. Within the context of news, high quality writing is indicative of paid professional journalism from mainstream, independent and, to a lesser degree, alternative news agencies, whereas low quality writing is indicative of amateur or unprofessional news production processes [58]. This high/low quality differentiation is also apparent in other domains such as academia, publishing, commerce, and blogs and information websites. While NLP techniques exist to derive writing quality [59], and others have called for it to be used to identify misinformation and disinformation [60, 61], only two examples of systems could be found in the literature which actually calculate writing quality [62, 63].

To calculate WQSs for Provenance, a dataset of news articles, blog posts, and other website content, much of which had characteristics symptomatic of disinformation, was annotated in a crowdsourced study to identify terms and phrases indicative of low quality writing. A WQS for each piece of content was then derived using a standard formula. This was subject to testing and expert evaluation to ensure the WQS the formula produced accurately reflected each piece of content. Models were then trained on the dataset, which showed that the WQS could be automatically generated with a high degree of accuracy. These models and the overall process are currently undergoing formal evaluation.
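Since the project’s actual WQS formula is not published, the following is only an illustrative sketch of a threshold-based score of this kind; the feature names and weights are invented for the example.

```python
def writing_quality_score(features, weights):
    """Illustrative WQS: a weighted sum of normalised writing-quality
    features (e.g. cohesion, coherence, density of low-quality phrases).
    The real Provenance formula is not published; this only shows the
    threshold-based shape of such a detector."""
    return sum(weights[name] * features.get(name, 0.0) for name in weights)

def needs_warning(score, threshold=0.5):
    """Warn when the computed WQS falls below the warning threshold."""
    return score < threshold
```

A piece of content scoring below the threshold would trigger the Writing Quality warning in the plugin.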
4.1.10. Knowledge Graph and Knowledge Graph Builder

The Provenance Knowledge Graph stores a record of all the articles introduced to Provenance via the Social Network Monitor service or via Asset Registration from a Trusted Content Analyst. It is also a record of all analysis performed on said assets.

The content is organised according to concepts, categories and topics. For example, a news article discussing politics can be categorised according to the left/right political spectrum followed by the topics discussed, as shown in Figure 2. Each node at the article level is split according to text, image and video.

Figure 2: Knowledge Graph categorisations of assets.

The output of the Video/Image Reverse Searcher includes the N most similar images/videos, distance measures and geometric validation results. The data from the Video/Image Manipulation Detector includes the probability of manipulations and the area of polygons. These are sent as JSON objects to the Knowledge Graph, where they are stored as entities in a triplestore.

Modelling of Provenance data is achieved using a combination of the RDF Data Cube vocabulary [64] to store statistical information such as the outputs from the various analytical components, and the Dublin Core/BIBO vocabularies [65] to model bibliographic information about the assets themselves. Some use is also made of the FOAF15 vocabulary to model information such as content publishers, which are naturally represented as foaf:Agent entities.

The Knowledge Graph Builder is responsible for exposing a REST API which the Asset Workflow Handler may use to upload assets as JSON, and then transforming the JSON into triples which are stored in a triplestore. In Provenance, this is achieved using JOPA [66], a Java library which can be used to map POJOs to triples. Using Spring Boot16, a REST API accepting JSON is exposed. The uploaded JSON is serialized into POJOs using Spring Boot’s built-in version of Jackson. JOPA is then used to serialize the triples out to an RDF4J17 instance.

The same serialization process works in reverse, allowing the Provenance Query Service to expose both a JSON REST endpoint which can produce JSON objects from the results of a canned SPARQL query exposed via a Spring Boot REST endpoint, and a much lower level raw SPARQL endpoint from the triplestore, for those who want a high level of control over their queries.

15 http://xmlns.com/foaf/spec/
16 https://spring.io/projects/spring-boot
17 https://rdf4j.org/
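The JSON exchange described above might look roughly like the sketch below. The field names are hypothetical: the paper specifies the kinds of values exchanged (the N most similar assets, distance measures, manipulation probability, polygon areas) but not the exact schema the Asset Workflow Handler uploads.

```python
import json

def build_asset_payload(asset_id, similar, manipulation):
    """Assemble a hypothetical JSON payload of the kind an analytical
    component might send to the Knowledge Graph Builder's REST API,
    where it would be mapped to triples and stored."""
    payload = {
        "asset": asset_id,
        "reverseSearch": {
            "mostSimilar": similar["ids"],       # N most similar images/videos
            "distances": similar["distances"],   # distance measures
            "geometricValidation": similar["valid"],
        },
        "manipulationDetector": {
            "probability": manipulation["probability"],  # probability of manipulation
            "polygonAreas": manipulation["areas"],       # area of flagged polygons
        },
    }
    return json.dumps(payload)
```

In the real system this JSON is deserialised into POJOs by Jackson and persisted as triples via JOPA; the sketch only shows the shape of the data crossing the API boundary.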
4.1.11. Provenance Query Service

The Provenance Query Service is the interface to the Verification Layer and offers external trusted services the means to request verification information about a webpage or article. It will also allow trusted services a means to identify the relatedness of content (through similarity and the Knowledge Graph) and determine if content has been modified. As the results of all analysis are stored in the Knowledge Graph, the Provenance Query Service is effectively a proxy between the user-facing front-end and the query interface to whatever storage medium is used to implement the Knowledge Graph.

As mentioned in Section 4.1.10, the Provenance Query Service exposes both a raw SPARQL endpoint and a REST API which provides endpoints for a number of canned SPARQL queries which return JSON objects. It is envisioned that the vast majority of use cases will be covered by the REST API, making it easier for developers to access data that is helpful to users. However, it is worthwhile to allow lower level access to the KG’s contents in the event of unforeseen requirements being placed on the KG.
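A canned SPARQL query of the kind the REST API might wrap can be sketched as below. The predicate used (dcterms:publisher) is assumed purely for illustration; the paper states that Dublin Core vocabularies are used but does not give the graph's exact bindings.

```python
# Hypothetical canned queries: the REST layer would run the rendered
# query against the triplestore and serialise the bindings to JSON.
CANNED_QUERIES = {
    "articles_by_publisher": """
        PREFIX dcterms: <http://purl.org/dc/terms/>
        SELECT ?article WHERE {{
            ?article dcterms:publisher ?publisher .
            FILTER (STR(?publisher) = "{publisher}")
        }}
    """,
}

def render_query(name, **params):
    """Fill a canned SPARQL template with caller-supplied parameters."""
    return CANNED_QUERIES[name].format(**params)
```

Developers needing more than the canned set would fall back to the raw SPARQL endpoint, as the section above notes.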
4.1.12. Personalised Companion Service

The Personalised Companion Service manages the Provenance verification indicator, the minimal user model, and user scrutability and control. The verification indicator is implemented as a Chrome Extension and works on the Facebook and Twitter platforms and with articles published by news agencies. The Personalised Companion Service uses the user’s interests, domain knowledge, digital literacy, and the warning preferences stored in the Minimal User Model to determine whether to highlight caution or show the verification indicator without caution. It uses the data provided by the Asset Fingerprinter, the Video/Image Reverse Searcher and Video/Image Manipulation Detector, and the Text Similarity, Tone and Writing Quality Detector components to create the set of icons that are presented to users, who can explore the levels of verification presented through the visual iconography.

5. Provenance in Action

The Provenance browser plugin is designed to provide users with easy to understand, granular and cautionary warnings about the content they are consuming. These warnings are provided via an in-browser icon beside the address bar when the user is browsing the Internet, or within their Facebook and Twitter social media feeds beside the content they are viewing. Figures 3-6 show how Provenance and its visual warnings appear to a user who has the Provenance plugin installed within their Facebook social media feed. The Provenance icon appears as a small blue square with a white P above each content item that it has checked. When the icon background turns red (with a small exclamation mark), it indicates to the user that the content item is worthy of a cautionary warning. The following presents the four main states of Provenance which a user will see.

Figure 3 shows the Facebook feed of a user who has the Provenance browser plugin installed. The Provenance icon is visible at the top of each news article in the user’s feed. In this image, the icon is blue, which indicates that there are no warnings for this particular news item.

In Figure 4, the background of the Provenance icon within the user’s news feed has turned red to indicate that this news item is worthy of one or more cautionary warnings. A small black exclamation mark has been added to the top right of the icon for colour blind users.

In Figure 5, the user has clicked on the red Provenance icon. A window has appeared beneath the Provenance icon to show the user which of the seven criteria Provenance has detected an issue with. In this example, the red background and exclamation mark beneath the Writing Quality icon indicate that this aspect of the news article is worthy of caution. The user may click on the downward arrow beneath each icon for further information. In this example, the Tone icon is greyed out, indicating that it could not be assessed by Provenance in this instance.

Figure 6 shows a detailed explanation of the Writing Quality warning after the user clicked on the option to expand it. It contains further information about how the Writing Quality score is calculated and why low quality writing is indicative of misinformation and disinformation.

Figure 3: A user’s Facebook feed showing the Provenance icon in blue indicating that there are no warnings.
Figure 4: The Provenance icon in red (with exclamation mark) indicating that this article has one or more issues which are worthy of caution.
Figure 5: An initial explanation pane appears when the user clicks on the Provenance icon in their social media feed.
Figure 6: A detailed explanation pane appears when the user clicks on any of the seven categories Provenance analyses the news item under.
6. Use Cases: Provenance Plugin

6.1. Social Media Timeline

On the recommendation of a friend, Mary installed the Provenance browser plugin due to increased concerns about the spread of misinformation and disinformation. The instructional video on the Provenance Chrome Extension webpage explained that Provenance uses seven criteria to verify digital content on the Internet and social media feeds. After installing the Provenance plugin, she notices that the news items in her Facebook timeline now display the Provenance icon beside the publisher’s name.

For most of the news stories, the Provenance icon shows a white P inside a white circle on a blue background. When she clicks on the blue Provenance icon, it opens a notification pane showing the seven verification criteria, all of which display a green background with a white ✓. She is able to click on each of the seven verification icons to read a detailed explanation of each criterion, why failing the criterion is an indication that the webpage or social media post may be misinformation or disinformation, and how the warning is derived. As all of the icons are green, she is reassured about the origin, veracity and overall quality of the news article. For some news items displayed on her timeline, she notices that the blue background of the Provenance icon has turned red. When she clicks on it, the same information pane displaying the same verification criteria appears, except one or more of the seven verification criteria now display a red background with an exclamation mark beneath. When she clicks on these, an additional detailed explanation pane appears underneath them to explain why it has failed. Reading through each warning, including their detailed descriptions, she gains a better understanding of how to identify misinformation and disinformation. In both instances, Mary has become more aware of the need to critically check the news she consumes and more aware of good media literacy habits in general.

6.2. News Websites

Mary regularly visits news websites to inform herself of current affairs. Usually, the Provenance icon, which is visible to the right of her browser’s address bar, displays a white P inside a white circle on a blue background. However, recently when she was visiting news websites to read more about a story relating to Covid-19 vaccination, she noticed that the background of the Provenance icon would sometimes turn red. When she clicked on the icon, the verification criteria information pane showed that Provenance had detected a problem with the image used in the news article she was reading. Clicking on the arrow to open the drop-down explanation pane, she reads that Provenance has detected that the image has been used before in another article. The image in question shows a picture taken at a conference of the World Health Organisation. Looking closely, she sees a credit to the Associated Press (AP). She knows that AP is an international news wire service, and that local and national news agencies republish their articles, including the images. As this is just an image of a press conference, she is confident that its use by multiple news agencies is not an issue.

7. Evaluation

Provenance is under development and will shortly be undergoing human evaluation. Currently, five of the seven news analysis functions have been implemented and integrated with the platform. These are undergoing technical evaluation while the final two analysis tools are being completed. When the tool is fully completed, a series of technical tests and human evaluation tests will be undertaken to evaluate basic functionality and to ensure that it is providing the right warnings at the appropriate time. Following this, a series of experiments will be undertaken to evaluate its effect on user behaviour. This will include the likelihood of reading and sharing news articles that have cautionary warnings beside them. We will also be analysing unintended effects of the tool. Finally, a series of long term studies are planned to evaluate its effect on users’ media literacy.

8. Conclusions

Misinformation and disinformation are significant issues that have negatively affected public discourse, politics and social cohesion. The Internet and especially social media are the primary conduits for their growth and spread. Existing user-orientated browser plugins have limited capabilities and only provide users with an historical rating of a website’s propensity to publish misinformation and disinformation. They are also not capable of detailed analysis of the content of news webpages or social media feeds. The Provenance browser plugin significantly improves upon existing user-orientated solutions by providing intermediary-free analysis of webpage and social media content using seven criteria, and where necessary providing cautionary warnings to users. The user can then check the detailed explanatory warning notifications to make their own judgement. This will improve users’ media literacy and reduce susceptibility to misinformation and disinformation in the long term.

9. Acknowledgements

The work has been supported by the PROVENANCE project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 825227, and with the financial support of Science Foundation Ireland under Grant Agreement No. 13/RC/2106_P2 at the ADAPT SFI Research Centre.

References

[1] G. Rehm, An infrastructure for empowering internet users to handle fake news and other online media phenomena, in: G. Rehm, T. Declerck (Eds.), Language Technologies for the Challenges of the Digital Age, Lecture Notes in Computer Science, Springer International Publishing, 2018, p. 216–231. doi:10.1007/978-3-319-73706-5_19.
[2] E. Commission, Action plan against disinformation (2018). URL: https://ec.europa.eu/digital-single-market/en/news/action-plan-against-disinformation.
[3] E. Commission, Tackling online disinformation, 2017. URL: https://ec.europa.eu/digital-single-market/en/tackling-online-disinformation.
[4] J. Bayer, N. Bitiukova, P. Bard, J. Szakács, A. Alemanno, E. Uszkiewicz, Disinformation and Propaganda – Impact on the Functioning of the Rule of Law in the EU and its Member States, 2019. URL: https://papers.ssrn.com/abstract=3409279.
[5] 2021. URL: https://twitter.com/vonderleyen/status/1354030170789834755.
[6] A. Aker, A. Sliwa, F. Dalvi, K. Bontcheva, Rumour verification through recurring information and an inner-attention mechanism, Online Social Networks and Media 13 (2019) 100045. doi:10.1016/j.osnem.2019.07.001.
[7] Z. Marinova, J. Spangenberg, D. Teyssou, S. Papadopoulos, N. Sarris, A. Alaphilippe, K. Bontcheva, Weverify: Wider and enhanced verification for you project overview and tools, in: 2020 IEEE International Conference on Multimedia Expo Workshops (ICMEW), 2020, p. 1–4. doi:10.1109/ICMEW46912.2020.9106056.
[8] M. Choraś, M. Pawlicki, R. Kozik, K. Demestichas, P. Kosmides, M. Gupta, Socialtruth project approach to online disinformation (fake news) detection and mitigation, in: Proceedings of the 14th International Conference on Availability, Reliability and Security, ARES ’19, Association for Computing Machinery, 2019, p. 1–10. doi:10.1145/3339252.3341497.
[9] L. Derczynski, K. Bontcheva, Pheme: Veracity in digital social networks (2014) 4.
[10] P. K. Srijith, M. Hepple, K. Bontcheva, D. Preotiuc-Pietro, Sub-story detection in twitter with hierarchical dirichlet processes, Information Processing & Management 53 (2017) 989–1003. doi:10.1016/j.ipm.2016.10.004.
[11] L. Toumanidis, R. Heartfield, P. Kasnesis, G. Loukas, C. Patrikakis, A prototype framework for assessing information provenance in decentralised social media: The eunomia concept, in: S. Katsikas, V. Zorkadis (Eds.), E-Democracy – Safeguarding Democracy and Human Rights in the Digital Age, Communications in Computer and Information Science, Springer International Publishing, 2020, p. 196–208. doi:10.1007/978-3-030-37545-4_13.
[12] D. Martín-Gutiérrez, G. Hernández-Peñaloza, J. M. Menéndez, F. Álvarez, A multi-modal approach for fake news discovery and propagation from big data analysis and artificial intelligence operations (2020) 3.
[13] D. Martín-Gutiérrez, G. Hernández-Peñaloza, A. B. Hernández, A. Lozano-Diez, F. Álvarez, A deep learning approach for robust detection of bots in twitter using transformers, IEEE Access 9 (2021) 54591–54601. doi:10.1109/ACCESS.2021.3068659.
[14] L. Ginsborg, P. Gori, Report on a survey for fact checkers on COVID-19 vaccines and disinformation, 2021. URL: https://cadmus.eui.eu//handle/1814/70917.
[15] U. N. S. Verified, Shareverified, 2021. URL: https://shareverified.com/en.
[16] W. H. Organisation, 1st who infodemiology conference, who infodemic management, 2020. URL: https://www.who.int/teams/risk-communication/infodemic-management/1st-who-infodemiology-conference.
[17] T. P. Institute, A guide to anti-misinformation actions around the world, 2021. URL: https://www.poynter.org/ifcn/anti-misinformation-actions/.
[18] 2021. URL: https://www.newsguardtech.com/.
[19] J. Nørregaard, B. D. Horne, S. Adalı, Nela-gt-2018: A large multi-labelled news dataset for the study of misinformation in news articles, Proceedings of the International AAAI Conference on Web and Social Media 13 (2019) 630–638.
[20] A. Aker, V. Kevin, K. Bontcheva, Credibility and transparency of news sources: Data collection and feature analysis (2019) 6.
[21] Le Monde.fr (2017). URL: https://www.lemonde.fr/les-decodeurs/article/2017/01/23/le-decodex-un-premier-pas-vers-la-verification-de-masse-de-l-information_5067709_4355770.html.
[22] 2021. URL: https://mediabiasfactcheck.com/.
[23] V. Kevin, B. Högden, C. Schwenger, A. Şahan, N. Madan, P. Aggarwal, A. Bangaru, F. Muradov, A. Aker, Information nutrition labels: A plugin for online news evaluation, ACL, 2018. doi:10.18653/v1/W18-5505.
[24] 2020. URL: https://browserextension.dev/blog/stopagandaplus-helps-understanding-media-biases/.
[25] P. Nordberg, J. Kävrestad, M. Nohlberg, Automatic detection of fake news, in: Proceedings of the 6th International Workshop on Socio-Technical Perspective in IS Development (STPIS 2020), CEUR-WS, 2020, p. 168–179. URL: http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-19356.
[26] K. Hartwig, C. Reuter, Trustytweet: An indicator-based browser-plugin to assist users in dealing with fake news on twitter (2019).
[27] A. Giełczyk, R. Wawrzyniak, M. Choraś, Evaluation of the existing tools for fake news detection, in: K. Saeed, R. Chaki, V. Janev (Eds.), Computer Information Systems and Industrial Management, Lecture Notes in Computer Science, Springer International Publishing, 2019, p. 144–151. doi:10.1007/978-3-030-28957-7_13.
[28] A. Školkay, J. Filin, A comparison of fake news detecting and fact-checking ai based solutions, Studia Medioznawcze 20 (2019) 365–383.
[29] K. Shu, A. Sliva, S. Wang, J. Tang, H. Liu, Fake news detection on social media: A data mining perspective, ACM SIGKDD Explorations Newsletter 19 (2017) 22–36. doi:10.1145/3137597.3137600.
[30] A. Hanselowski, A. PVS, B. Schiller, F. Caspelherr, D. Chaudhuri, C. M. Meyer, I. Gurevych, A retrospective analysis of the fake news challenge stance-detection task, in: Proceedings of the 27th International Conference on Computational Linguistics, Association for Computational Linguistics, 2018, p. 1859–1874. URL: https://www.aclweb.org/anthology/C18-1158.
[31] A. Goel, ProjectFib - GitHub Repo, 2016. URL: https://github.com/anantdgoel/ProjectFib.
[32] Eyeo, 2020. URL: https://chrome.google.com/webstore/detail/trusted-news/nkkghpncidknplmlkgemdoekpckjmlok?hl=en.
[33] D. Paschalides, C. Christodoulou, R. Andreou, G. Pallis, M. D. Dikaiakos, A. Kornilakis, E. Markatos, Check-it: A plugin for detecting and reducing the spread of fake news and misinformation on the web, in: 2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI), 2019, p. 298–302.
[34] V. Inc, Fakebox, 2021. URL: https://machinebox.io/.
[35] Z. A. Estela, N2ITN/are-you-fake-news, 2021. URL: https://github.com/N2ITN/are-you-fake-news.
[36] 2021. URL: https://ground.news/.
[37] A. Bhat, SurfSafe, 2021. URL: https://chrome.google.com/webstore/detail/surfsafe-join-the-fight-a/hbpagabeiphkfhbboacggckhkkipgdmh?hl=en.
[38] B. Botnevik, E. Sakariassen, V. Setty, Brenda: Browser extension for fake news detection, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, 2020, p. 2117–2120. doi:10.1145/3397271.3401396.
[39] K. Popat, S. Mukherjee, J. Strötgen, G. Weikum, Credeye: A credibility lens for analyzing and explaining misinformation, in: Companion Proceedings of The Web Conference 2018, WWW ’18, International World Wide Web Conferences Steering Committee, 2018, p. 155–158. doi:10.1145/3184558.3186967.
[40] FightHoax, Fighthoax - unlock your programmatic advertising, 2021. URL: http://34.253.212.69/.
[41] M. Hardalov, I. Koychev, P. Nakov, In search of credible news, in: C. Dichev, G. Agre (Eds.), Artificial Intelligence: Methodology, Systems, and Applications, Lecture Notes in Computer Science, 2016. doi:10.1007/978-3-319-44748-3_17.
[42] M. Hardalov, mhardalov/news-credibility, 2019. URL: https://github.com/mhardalov/news-credibility.
[43] X. Zhou, A. Jain, V. V. Phoha, R. Zafarani, Fake news early detection: A theory-driven model, Digital Threats: Research and Practice 1 (2020) 12:1–12:25. doi:10.1145/3377478.
[44] W. E. Zhang, Q. Z. Sheng, A. Alhazmi, C. Li, Adversarial attacks on deep learning models in natural language processing: A survey, arXiv:1901.06796 [cs] (2019). URL: http://arxiv.org/abs/1901.06796.
[45] Z. Zhou, H. Guan, M. M. Bhat, J. Hsu, Fake news detection via nlp is vulnerable to adversarial attacks, Proceedings of the 11th International Conference on Agents and Artificial Intelligence (2019) 794–800. doi:10.5220/0007566307940800. arXiv:1901.09657.
[46] B. Spillane, S. Lawless, V. Wade, The impact of increasing and decreasing the professionalism of news webpage aesthetics on the perception of bias in news articles, in: Proceedings of the 22nd International Conference On Human-Computer Interaction, Lecture Notes in Computer Science, Springer, 2020. doi:10.1007/978-3-030-49059-1_50.
[47] B. Spillane, I. Hoe, M. Brady, V. Wade, S. Lawless, Tabloidization versus credibility: Short term gain for long term pain, in: CHI ’20: The ACM Conference on Human Factors in Computing Systems, ACM, 2020. doi:10.1145/3313831.3376388.
[48] A. Schmidt, M. Wiegand, A survey on hate speech detection using natural language processing, in: Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, Association for Computational Linguistics, 2017, p. 1–10. doi:10.18653/v1/W17-1101.
[49] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in Neural Information Processing Systems, volume 27, Curran Associates, Inc., 2014. URL: https://proceedings.neurips.cc/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html.
[50] S. E. Robertson, S. Walker, Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval, in: SIGIR ’94, Springer, 1994, pp. 232–241.
[51] Y. Lv, C. Zhai, When documents are very long, bm25 fails!, in: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’11, Association for Computing Machinery, 2011, p. 1103–1104. doi:10.1145/2009916.2010070.
[52] S. Loria, textblob documentation, release 0.16.0 (2020). URL: https://buildmedia.readthedocs.org/media/pdf/textblob/latest/textblob.pdf.
[53] Y. Yang, D. Cer, A. Ahmad, M. Guo, J. Law, N. Constant, G. H. Abrego, S. Yuan, C. Tar, Y.-H. Sung, B. Strope, R. Kurzweil, Multilingual universal sentence encoder for semantic retrieval, 2019. arXiv:1907.04307.
[54] S. B. Parikh, V. Patil, P. K. Atrey, On the origin, proliferation and tone of fake news, in: 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, 2019, p. 135–140. doi:10.1109/MIPR.2019.00031.
[55] J. Paschen, Investigating the emotional appeal of fake news using artificial intelligence and human contributions, Journal of Product & Brand Management 29 (2019) 223–233. doi:10.1108/JPBM-12-2018-2179.
[56] X. Zhang, J. Cao, X. Li, Q. Sheng, L. Zhong, K. Shu, Mining dual emotion for fake news detection, Proceedings of the Web Conference 2021 (2021) 3465–3476. doi:10.1145/3442381.3450004. arXiv:1903.01728.
[57] I. Singh, D. P., A. K., On the coherence of fake news articles, in: I. Koprinska, M. Kamp, A. Appice, C. Loglisci, L. Antonie, A. Zimmermann, R. Guidotti, O. Özgöbek, R. P. Ribeiro, R. Gavaldà, et al. (Eds.), ECML PKDD 2020 Workshops, Communications in Computer and Information Science, Springer International Publishing, 2020, p. 591–607. doi:10.1007/978-3-030-65965-3_42.
[58] M. Chung, N. Kim, When i learn the news is false: How fact-checking information stems the spread of fake news via third-person perception, Human Communication Research 47 (2021) 1–24. doi:10.1093/hcr/hqaa010.
[59] V. Klyuev, Fake news filtering: Semantic approaches, in: 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2018, p. 9–15. doi:10.1109/ICRITO.2018.8748506.
[60] M. Spradling, J. Straub, J. Strong, Protection from ‘fake news’: The need for descriptive factual labeling for online content, Future Internet 13 (2021) 142. doi:10.3390/fi13060142.
[61] N. Fuhr, A. Giachanou, G. Grefenstette, I. Gurevych, A. Hanselowski, K. Jarvelin, R. Jones, Y. Liu, J. Mothe, W. Nejdl, et al., An information nutritional label for online documents, ACM SIGIR Forum 51 (2018) 46–66. doi:10.1145/3190580.3190588.
[62] C. Fan, Classifying fake news, 2017. URL: https://www.conniefan.com/wp-content/uploads/2017/03/classifying-fake-news.pdf.
[63] E. S. Jo, A. Muhamed, S. Nuthakki, A. Singhania, DeepNews: Detecting Quality in News, 2018.
[64] W. W. W. Consortium, et al., The rdf data cube vocabulary (2014).
[65] D. C. M. Initiative, et al., Dublin core metadata element set, version 1.1 (2012).
[66] M. Ledvinka, P. Kremen, Jopa: Accessing ontologies in an object-oriented way, in: ICEIS (2), 2015, pp. 212–221.