<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">On the Readability of Misinformation in Comparison to the Truth</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mohammadali</forename><surname>Tavakoli</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Knowledge Media Institute</orgName>
								<orgName type="institution">The Open University</orgName>
								<address>
									<addrLine>Walton Hall, Milton Keynes</addrLine>
									<postCode>MK7 6AA</postCode>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Harith</forename><surname>Alani</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Knowledge Media Institute</orgName>
								<orgName type="institution">The Open University</orgName>
								<address>
									<addrLine>Walton Hall, Milton Keynes</addrLine>
									<postCode>MK7 6AA</postCode>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Grégoire</forename><surname>Burel</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Knowledge Media Institute</orgName>
								<orgName type="institution">The Open University</orgName>
								<address>
									<addrLine>Walton Hall, Milton Keynes</addrLine>
									<postCode>MK7 6AA</postCode>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">On the Readability of Misinformation in Comparison to the Truth</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">B7C79AF6F6ECCC43EF23BDD6EBD03ED4</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-04-29T06:30+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Ease of processing</term>
					<term>Readability</term>
					<term>Misinformation</term>
					<term>False claims</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Psychological studies have demonstrated that much misinformation circulating on the Web tends to be more believable and memorable due to its ease of processing. The readability of a passage is a crucial factor in the ease of processing, as it indicates how easy or difficult it is to read and understand. According to some qualitative research, if online misinformation is easier to read, it becomes stickier and more memorable. In contrast, other studies showed that people are more likely to trust and believe misinformation when it appears to be more complex. As a result of such conflicting findings, it remains unclear how readability is associated with true or false content on the Web in general. This paper aims to gain a deeper understanding of readability through quantitative analysis by applying six readability formulas to four datasets containing both true and false content, as well as across multiple datasets. Our research shows that false claims are generally harder to read than true claims.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Studies in psychology have demonstrated through a range of qualitative studies that misinformation tends to be easier to process in general, and thus easier to believe and remember <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. Ease of processing, also called processing fluency, refers to the ease with which a piece of information can be processed by its readers. Understanding what makes misinformation easier to process is key to producing more effective methods to curb its spread.</p><p>In textual content, one of the features that influence its ease of processing is readability <ref type="bibr" target="#b2">[3]</ref>. Currently, research is conflicting with respect to how readability is associated with online misinformation. On the one hand, easy-to-read misinformation is found to stick more firmly in readers' minds <ref type="bibr" target="#b0">[1]</ref>; on the other hand, people are found to be more likely to trust and believe more complex information <ref type="bibr" target="#b3">[4]</ref>. This raises the need for analysing information that is known to be false and comparing its readability with that of information that is true, to help determine how high or low readability is associated with true or false information online.</p><p>To understand how readability relates to these categories, we analysed the readability of true and false information collected from the Web. The research question addressed in this paper is: How does the readability of misinformation compare to that of true information? To address this question, we collect news articles and claims containing false and true content and analyse them in terms of readability. 
The main contributions of this paper are: (1) analysing four datasets of true and false information from the Web; (2) measuring and comparing the readability of the datasets using six different readability measures; and (3) demonstrating that misinformation appears to be harder to read than true information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>The mechanism of assessing the truth by humans often consists of two phases; intuitive and analytic assessments. Through the intuitive phase, we make a decision on whether to accept the received information or to begin the analytic assessment process <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>. The simpler and more intuitive the information is to us, the less likely we are to kick-start the analytical process <ref type="bibr" target="#b6">[7]</ref>. Ease of processing of (mis)information is, therefore, an influential factor of how quickly and intuitively we are prone to accepting such information without proper scrutiny <ref type="bibr" target="#b0">[1]</ref>.</p><p>Various parameters have been found to be associated with increasing ease of processing, such as familiarity <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10]</ref>, compatibility with prior beliefs <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>, perceived credibility of source <ref type="bibr" target="#b12">[13]</ref>, and social consensus <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>. Readability is another key feature for assessing the ease of processing textual contents and reflects the level of difficulty in which text information can be read and understood <ref type="bibr" target="#b16">[17]</ref>. Some readability studies focused on cosmetic features such as colour contrast <ref type="bibr" target="#b17">[18]</ref> and font type and size <ref type="bibr" target="#b18">[19]</ref>. In <ref type="bibr" target="#b18">[19]</ref>, authors found that 35% more participants were misled by information when using easier-to-read fonts. 
In a study with over 92K false and true news articles, it was found that misinformation was 3% easier to read than true information <ref type="bibr" target="#b19">[20]</ref>, where readability was measured using the Flesch-Kincaid method (FK) <ref type="bibr" target="#b20">[21]</ref>, which takes into account the number of words, sentences, and syllables to calculate the level of readability of a given text.</p><p>In some scenarios, readability was found to play a rather surprising role. For example, in <ref type="bibr" target="#b3">[4]</ref>, authors found that when presented with text containing either false or true information, participants trusted the harder-to-read text regardless of its veracity. The authors concluded that reading difficulty gave a stronger perception of truthfulness <ref type="bibr" target="#b21">[22]</ref>. Other researchers found that readers tend to invest less cognitive effort in judging the truthfulness of news when they experience a higher level of reading difficulty, i.e., they accept the information at face value <ref type="bibr" target="#b22">[23]</ref>. Some of the readability measures have been used as classification attributes to distinguish between true and false information. FK and GFI (see section 3.3), for example, have been used along with several other lexical, stylistic, and grammatical features by Horne and Adalı <ref type="bibr" target="#b23">[24]</ref> in an SVM-based model to classify news articles into true, false, and satire. The authors concluded that the style and complexity of fake content are significantly different from those of real content, yet fake content is more closely related to satire than to real content. They found that the readability-related features improve the classification of news articles into the target classes. A similar model was built in <ref type="bibr" target="#b24">[25]</ref> to classify Portuguese news articles into true and false. 
The authors used 165 textual features, including some readability measures adapted to the language of the data. Although it is not yet known whether their findings on the Portuguese data generalise to English and other languages, they show that classifiers with readability-related features, such as DCI and GFI (see section 3.3), achieve higher accuracy. These studies, however, lack a proper analysis to investigate how each of these features is associated with true and false information and to what extent these associations differ from each other.</p><p>From the above, it is clear that readability can be measured in different ways and can have different impacts on misinformation. Our work in this paper differs from the state of the art in that we apply multiple computational methods for calculating readability, and we perform this analysis on several datasets of true and false information. Expanding the analysis to more readability methods and datasets increases the chances of establishing more concrete and representative evidence on how readability differs between true and false information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Readability of True and False Information</head><p>The aim of this paper is to measure and compare the readability of online misinformation and true information to gain a better understanding of how readability differs between the two categories of content. To achieve this in a systematic manner, the readability score of content items is calculated using six different readability measures (Section 3.3). Apart from three datasets of short claims, a dataset of full news articles is also processed in our experiment. The workflow of our experiments is as follows: (1) Collect datasets consisting of true as well as false claims found on the Web, written in varying lengths (full news articles, short messages); (2) Pre-process the datasets; (3) Calculate the readability of each content item and aggregate their values in our four datasets using six readability measures; (4) Evaluate the readability difference for each of the datasets depending on their true/false labels.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Datasets</head><p>In our experiments, two different types of data are used for readability measurement and comparison: a dataset of full news articles and three datasets of short text. Each dataset consists of true and false claims. The first dataset used in this study is a collection of 5K full news articles named Fake News Detection Challenge Dataset<ref type="foot" target="#foot_0">1</ref> (KDD2020), gathered from a variety of news websites in 2020. The veracity of each article is manually labelled with 0 or 1, indicating true and false respectively. The average length of the articles is 27.84 sentences.</p><p>The second dataset is a manufactured collection of 67,366 claims named FEVEROUS<ref type="foot" target="#foot_1">2</ref> (Fact Extraction and VERification Over Unstructured and Structured information) <ref type="bibr" target="#b25">[26]</ref>. This dataset was manually generated in 2021. Each claim is verified against relevant Wikipedia pages by trained annotators and labelled as SUPPORTED, REFUTED, or NOT ENOUGH EVIDENCE. For our experiments, we only consider the claims that were either SUPPORTED or REFUTED.</p><p>PubHealth<ref type="foot" target="#foot_2">3</ref> <ref type="bibr" target="#b26">[27]</ref> is another dataset of claims. The dataset was constructed in 2020 and consists of 11k claims collected from fact-checking websites (i.e., Politifact, FactCheck, Snopes, TruthorFiction, and FullFact) and online news sources (i.e., Associated Press, Reuters News, and Health News Review). In this experiment, an equal number of claims from each source is selected to avoid bias. The veracity labels provided with the dataset are true, false, mixture, and unproven. To meet the needs of our experiments, only the true and false labels are used. The last dataset of claims is LIAR<ref type="foot" target="#foot_3">4</ref> <ref type="bibr" target="#b27">[28]</ref> with 12.8k claims. 
The data is collected from Politifact.com. The labels used in coding the data are pants-fire, false, barely-true, half-true, mostly-true, and true. Our focus is on claims that are untrue (pants-fire and false labels) and true.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Pre-processing</head><p>The pre-processing tasks aim to clean and prepare the data for our experiment. The pre-processing phase consists of the following tasks: discarding duplicates, non-English content items, and short items consisting of fewer than 3 words; removing punctuation marks apart from full stops, which indicate sentence boundaries; and discarding irrelevant or excessively repeated symbols and characters such as emoji, asterisks, and hashes.</p><p>The numbers of true and false items in each dataset are not balanced. Therefore, to avoid bias, we selected the same number of items from each set (false, true) after cleaning the data and removing noise. Apart from the full articles, for which no information about their sources is available, we balance the number of claims with regard to the source (e.g., BBC, CBS) for all other datasets, to minimize bias that could emerge from a particular source (e.g., a specific writing style or more complex text). The final size of the datasets used in our study, along with some statistics about the pre-processing steps, is shown in Table <ref type="table" target="#tab_0">1</ref>.</p></div>
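<div xmlns="http://www.tei-c.org/ns/1.0"><p>The cleaning and balancing steps described above can be sketched roughly as follows. This is a minimal illustration rather than the pipeline used in the experiments: the function names clean_items and balance are hypothetical, language filtering is omitted because it would need an external language detector, balancing is done by simple random down-sampling (per-source balancing is not shown), and the regular expression is only one possible reading of "punctuation apart from full stops".</p>

```python
import random
import re


def clean_items(items, min_words=3):
    """Strip symbols other than full stops, then drop short items and duplicates."""
    seen = set()
    cleaned = []
    for text in items:
        # Remove every character that is not a word character, whitespace or a full stop
        # (this also removes emoji, asterisks and hashes).
        text = re.sub(r"[^\w\s.]", "", text)
        # Collapse runs of whitespace left behind by the removals.
        text = re.sub(r"\s+", " ", text).strip()
        if len(text.split()) >= min_words and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


def balance(true_items, false_items, seed=0):
    """Down-sample the larger set so both sets have the same size (assumed strategy)."""
    n = min(len(true_items), len(false_items))
    rng = random.Random(seed)
    return rng.sample(true_items, n), rng.sample(false_items, n)
```

<p>For instance, clean_items applied to a list containing the same sentence twice and a two-word item would keep a single cleaned copy of the sentence, and balance would then trim the larger of the two label sets to the size of the smaller one.</p></div>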
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Readability Measures</head><p>The readability tests that are used in this work for measuring the readability of false and true content items are listed in Table <ref type="table" target="#tab_1">2</ref>. For each readability metric, we apply the min-max normalisation method, so that the scores from each readability measure are normalised between 0 (very easy to read) and 100 (very hard to read) for comparative purposes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Readability Comparison Results</head><p>In this section, we describe various comparisons of readability between the true and false sets in our four datasets, to reach a better understanding of the similarities and differences in the overall results as well as the results between the different datasets.   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Statistical Comparison of Readability Scores</head><p>To investigate if false and true content items differ in terms of readability scores, we first compare the means of these scores in all four datasets. Figure <ref type="figure" target="#fig_2">1</ref> shows the distribution of these readability means across the datasets for both true and false sets. These results suggest that although readability differs across the datasets, it is more comparable between the true and false sets in each individual dataset. Overall, we observe that the KDD2020 dataset has a lower readability score compared to the other datasets. This may be due to the item length difference between this dataset and the other analysed datasets.</p><p>To get an understanding of these readability values and the significance of the similarities or differences between false and true content items, we obtain the scores from the readability measures and apply the Mann-Whitney U (MWU) test. For this experiment, the significance level (α) is set to 0.05, indicating that any calculated p-value ≤ α shows that a significant difference exists between readability scores. Table <ref type="table" target="#tab_2">3</ref> presents the results of the MWU test, showing that the content items in the false set are generally harder to read than the ones in the true set and that these distribution differences are statistically significant. The only exception is the FEVEROUS dataset, which shows a different pattern. However, as mentioned earlier, this dataset is lab-manufactured and hence is more likely to differ from the other three more naturally-generated datasets.</p><p>What we can conclude from the statistical analysis above is that false content is generally harder to read than true content in all our datasets except the manufactured one. 
This provides computational evidence in support of the common view and most qualitative studies from psychologists, which argue that falsified information tends to be written in a more complex fashion to give the perception of depth and truthfulness (see Section 2).</p><p>What remains unknown is how the individual readability parameters differ from one set to another, which is the focus of the next part of the experiment.</p></div>
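<div xmlns="http://www.tei-c.org/ns/1.0"><p>For readers who want to reproduce this kind of comparison, the following is a minimal pure-Python sketch of a Mann-Whitney U test. It makes several assumptions that are not stated in the text: the test is one-sided (alternative: scores in the first sample tend to be larger, e.g. false items harder to read than true ones), the p-value uses the normal approximation with tie correction and no continuity correction, and the function name mann_whitney_u is hypothetical. Statistical libraries provide exact and more configurable implementations.</p>

```python
import math
from itertools import groupby


def mann_whitney_u(x, y):
    """One-sided Mann-Whitney U test (alternative: x tends to exceed y).

    Returns the U statistic of x and a normal-approximation p-value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    ranks_used = 0
    for _, group in groupby(pooled, key=lambda pair: pair[0]):
        labels = [label for _, label in group]
        t = len(labels)
        # Tied values share the average of the ranks they occupy.
        avg_rank = ranks_used + (t + 1) / 2
        rank_sum_x += avg_rank * labels.count(0)
        tie_term += t ** 3 - t
        ranks_used += t
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # Every pooled value is identical, so there is no evidence of a difference.
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return u, p_value
```

<p>Under these assumptions, a p-value at or below 0.05 would suggest that the first sample tends to have higher (harder) readability scores, while a p-value close to 1 would suggest the opposite direction.</p></div>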
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Comparison of Readability Parameters</head><p>As discussed in Section 3.3, each readability formula has several influencing parameters for calculating readability. To compare the influence of the different readability parameters between the datasets, we use the Pearson Correlation Coefficient (PCC). Correlations between each parameter and the readability of true and false content items across the datasets are shown in Figure <ref type="figure" target="#fig_5">2</ref>. It can be seen that the correlation between the parameters and the readability scores of the formulas is positive in almost all cases. In general, there is a strong correlation between ASL and the mean value of the readability scores. The figures also show that Char-Wrds has a slightly stronger than moderate correlation with the mean value. Such findings enhance our understanding of why readability differs between true and false content in our datasets (more on this in Section 5).</p></div>
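<div xmlns="http://www.tei-c.org/ns/1.0"><p>The correlation analysis can be illustrated with a small, self-contained computation of the Pearson Correlation Coefficient. The function name pearson and the example numbers are hypothetical and only show the mechanics of correlating one parameter (here, made-up ASL values) with the mean readability score of the same items; they are not results from the datasets.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Made-up example: average sentence length vs. mean normalised readability score.
asl_values = [12.0, 18.0, 25.0, 29.0]
mean_scores = [20.0, 35.0, 55.0, 60.0]
r = pearson(asl_values, mean_scores)
```

<p>A value of r close to 1 would indicate that the parameter rises and falls almost linearly with the mean readability score, which is the kind of relationship reported for ASL above.</p></div>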
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Discussion</head><p>The results of the analysis are illustrated in Figure <ref type="figure" target="#fig_2">1</ref> and reveal that false content items are in general slightly more difficult to read than true ones. This finding contradicts <ref type="bibr" target="#b19">[20]</ref> (see Section 2). However, only one dataset was used in <ref type="bibr" target="#b19">[20]</ref>. This indicates the need for further quantitative research to better understand the reasons behind such variation in results. The analysis of the datasets showed an inconsistency between the FEVEROUS dataset and the other datasets in the difference between the readability of false and true content items. Analysing the FEVEROUS content shows that true claims are more difficult to read than false ones, which contradicts our results from the other datasets (Figure <ref type="figure" target="#fig_2">1</ref>). Looking into the collection/creation process of these datasets, we can infer that the synthetic FEVEROUS dataset is not representative of the real-world true/false content distributions that are observed in the other datasets, since the claims created in FEVEROUS are written artificially by a limited number of experts from the misinformation domain rather than naturally authored and published on the Web.</p><p>Regarding the parameters used in the readability formulas, Figure <ref type="figure" target="#fig_5">2</ref> shows that, excluding FEVEROUS because of the deviation discussed above, Char-Wrds and ASW have a slightly higher than moderate correlation with the mean value of readability scores in the remaining claims datasets (i.e., PubHealth and Liar). However, this is not the case in the dataset of full articles (i.e., KDD2020), which suggests that these parameters could have more impact when experimenting with short texts. 
Their impact, however, would be minor when using the GFI measure, which might be because the measure relies on complex words (i.e., words with three or more syllables), which diminishes the correlation of these parameters with the measure. ASL, on the other hand, shows the opposite pattern and appears to be most influential with long documents, where it has a strong relationship with the mean value. Lengthier sentences are used in false news articles, with an average of 29 words per sentence, whereas the average sentence length in true content items is 25 words. This indicates that these parameters should be considered when building models for identifying misinformation on the Web. The disparity in content length between true and false content suggests that brevity and conciseness may be a key differentiating factor between misinformation and true information, with misinforming content being more convoluted than true content. Such variety in the correlation between parameters and measures across different types of content items (i.e., claims and full news articles) allows future research to choose features more wisely when classifying content items of different types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Limitations and Future Work</head><p>In this experiment, we looked into readability and its association with misinformation. Apart from readability, the concept of ease of processing has other aspects, such as social consensus and source credibility (see section 2). Analytically investigating their association with misinformation and discovering relevant features correlated with them would be an interesting angle to investigate in future work.</p><p>In this experiment, the only language considered was English. Although the readability measures might need modifications to work properly with different languages, experimenting with other languages might result in different findings that may highlight the cultural and structural differences between languages when dealing with true and false information.</p><p>As discussed in section 3.1, our focus was only on the content items with true and false labels, while some datasets have additional fine-grained annotations, such as not enough evidence, unproven, and mixture. Although including such fine-grained labels in the analysis would make the experiment more comprehensive, matching labels across various datasets annotated with different guidelines is not straightforward and may lead to inconsistent results.</p><p>It is also of great importance to investigate how the association of readability with misinformation differs across topics. Discovering topic-specific readability patterns and considering them when building models for detecting misinformation is another research direction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>Our analysis of four distinct datasets showed that readability, in general, is higher (i.e. more difficult) for false information compared to true information. We found a strong difference in the average length of sentences and the number of characters in words in the false and true content, which could be used in misinformation detection models. We also found that when measuring the readability of long documents, the average length of sentences is the most indicative parameter, while the average number of syllables per word and the average number of characters per word work best with short documents. Our analysis also showed that the lab-manufactured FEVEROUS dataset produced readability patterns that were inconsistent with the real-world Web data present in the other datasets. This shows the importance of using real-world datasets when studying misinformation.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>In: R. Campos, A. Jorge, A. Jatowt, S. Bhatia, M. Litvak (eds.): Proceedings of the Text2Story'23 Workshop, Dublin (Republic of Ireland), 2-April-2023. Email: ali.tavakoli@open.ac.uk (M. Tavakoli); harith.alani@open.ac.uk (H. Alani); gregoire.burel@open.ac.uk (G. Burel). ORCID: 0000-0003-3005-4539 (M. Tavakoli); 0000-0003-2784-349X (H. Alani); 0000-0003-0029-5219 (G. Burel)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Distribution of content items by mean of readability scores (false vs. true).</figDesc><graphic coords="5,208.33,296.51,75.00,150.78" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>(a) KDD2020 (Top: True / Bottom: False). (b) FEVEROUS (Top: True / Bottom: False).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>(c) PubHealth (Top: True / Bottom: False). (d) Liar (Top: True / Bottom: False).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Correlation between readability measures and readability parameters, including measure-measure, parameter-parameter, and measure-parameter correlations for true and false items (Char-Wrds: number of characters/number of words; CmpWrds-Wrds: number of complex words/number of words).</figDesc><graphic coords="7,89.29,225.08,187.51,121.97" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Distribution of content items in datasets and pre-processing statistics.</figDesc><table><row><cell>Dataset</cell><cell>KDD2020</cell><cell>FEVEROUS</cell><cell>PubHealth</cell><cell>Liar</cell></row><row><cell>Number of Content Items (NCI)</cell><cell>4,280</cell><cell>69,058</cell><cell>7,496</cell><cell>4,516</cell></row><row><cell>NCI after removing non-English samples</cell><cell>4,280</cell><cell>68,728</cell><cell>7,432</cell><cell>4,483</cell></row><row><cell>NCI after removing samples with ≤ 2 words</cell><cell>4,141</cell><cell>68,661</cell><cell>7,421</cell><cell>4,459</cell></row><row><cell>NCI after balancing true &amp; false samples</cell><cell>3,300</cell><cell>54,000</cell><cell>5,422</cell><cell>3,334</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>The Readability measures (Parameters: ASL: Avg sentence length, ASW: Avg word length in syllables, Complex words: words with ≥ 3 syllables, DW: words with ≥ 7 characters).</figDesc><table><row><cell>Name</cell><cell>Formula</cell><cell>Source</cell></row><row><cell>Flesch Reading Ease Score (FRES)</cell><cell>206.835 − (1.015 × ASL) − (84.6 × ASW)</cell><cell>[29]</cell></row><row><cell>Flesch-Kincaid Grade Level (FKGL)</cell><cell>0.39 × ASL + 11.8 × ASW − 15.59</cell><cell>[21]</cell></row><row><cell>Gunning's Fog Index (GFI)</cell><cell>0.4 × [ASL + 100 × (ComplexWords / Words)]</cell><cell>[30]</cell></row><row><cell>Automated Readability Index (ARI)</cell><cell>4.71 × (Characters / Words) + 0.5 × ASL − 21.43</cell><cell>[31]</cell></row><row><cell>Dale-Chall readability formula (DCRF)</cell><cell>0.1579 × ((DWs / Words) × 100) + 0.0496 × ASL</cell><cell>[32]</cell></row><row><cell>Spache Readability Formula (SRF)</cell><cell>0.121 × ASL + 0.082 × PDW + 0.659</cell><cell>[33]</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Comparison of the avg readability of false and true content items (𝛼 = 0.05).</figDesc><table><row><cell></cell><cell cols="4">P-values</cell></row><row><cell>Measure</cell><cell>KDD2020</cell><cell>FEVEROUS</cell><cell>PubHealth</cell><cell>Liar</cell></row><row><cell>FRES</cell><cell>2.3E-4</cell><cell>1.00</cell><cell>0.46</cell><cell>6.66E-26</cell></row><row><cell>FKGL</cell><cell>1.63E-10</cell><cell>1.00</cell><cell>1.4E-2</cell><cell>1.6E-6</cell></row><row><cell>GFI</cell><cell>5.57E-17</cell><cell>1.00</cell><cell>1.77E-11</cell><cell>0.95</cell></row><row><cell>ARI</cell><cell>1.55E-14</cell><cell>1.00</cell><cell>3.06E-19</cell><cell>0.20</cell></row><row><cell>DCRF</cell><cell>0.40</cell><cell>1.00</cell><cell>1.00</cell><cell>2.3E-9</cell></row><row><cell>SRF</cell><cell>0.40</cell><cell>1.00</cell><cell>1.17E-20</cell><cell>4.7E-4</cell></row><row><cell>All</cell><cell>2.46E-10</cell><cell>0.99</cell><cell>6.56E-11</cell><cell>5.2E-4</cell></row><row><cell cols="5">Note: the significance level (α) is set to 0.05; any calculated p-value ≤ α indicates a significant difference between readability scores.</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Fake News Detection Challenge, https://www.kaggle.com/c/fakenewskdd2020/data.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">FEVEROUS, https://fever.ai/dataset/feverous.html.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">PubHealth, https://github.com/neemakot/Health-Fact-Checking.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">LIAR, https://www.kaggle.com/code/hendrixwilsonj/liar-data-analysis.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work has been partially supported by the European CHIST-ERA program via the UK Engineering and Physical Sciences Research Council (UKRI -EP/V062662/1) within the CIMPLE project (grant agreement CHIST-ERA-19-XAI-003).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">When (fake) news feels true: Intuitions of truth and the acceptance and correction of misinformation</title>
		<author>
			<persName><forename type="first">N</forename><surname>Schwarz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jalbert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Psychology of Fake News</title>
				<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="73" to="89" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Processing fluency in education: How metacognitive feelings shape learning, belief formation, and affect</title>
		<author>
			<persName><forename type="first">R</forename><surname>Reber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Greifeneder</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Educational psychologist</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="page" from="84" to="103" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Processing fluency and investors&apos; reactions to disclosure readability</title>
		<author>
			<persName><forename type="first">K</forename><surname>Rennekamp</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Accounting Research</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="1319" to="1354" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The impact of readability on trust in information</title>
		<author>
			<persName><forename type="first">A</forename><surname>Withall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sagi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Annual Meeting of the Cognitive Science Society</title>
				<meeting>the Annual Meeting of the Cognitive Science Society</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">43</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Who is rational?: Studies of individual differences in reasoning</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Stanovich</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>Psychology Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The elaboration likelihood model of persuasion</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Petty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Cacioppo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Communication and persuasion</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="1986">1986</date>
			<biblScope unit="page" from="1" to="24" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Thinking, fast and slow</title>
		<author>
			<persName><forename type="first">D</forename><surname>Kahneman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
			<publisher>Farrar, Straus and Giroux</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The validity effect: A search for mediating variables</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">E</forename><surname>Boehm</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Personality and Social Psychology Bulletin</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="285" to="293" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">E-commerce: the role of familiarity and trust</title>
		<author>
			<persName><forename type="first">D</forename><surname>Gefen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Omega</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="725" to="737" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">People with easier to pronounce names promote truthiness of claims</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">J</forename><surname>Newman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sanson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">K</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Quigley-McBride</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Foster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Bernstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Garry</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PLoS ONE</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page">e88671</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Comprehension: A paradigm for cognition</title>
		<author>
			<persName><forename type="first">W</forename><surname>Kintsch</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<publisher>Cambridge University Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">The theory of cognitive dissonance: A current perspective</title>
		<author>
			<persName><forename type="first">E</forename><surname>Aronson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Experimental Social Psychology</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="1" to="34" />
			<date type="published" when="1969">1969</date>
			<publisher>Elsevier</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">H</forename><surname>Eagly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chaiken</surname></persName>
		</author>
		<title level="m">The psychology of attitudes</title>
				<imprint>
			<publisher>Harcourt Brace Jovanovich College Publishers</publisher>
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">B</forename><surname>Cialdini</surname></persName>
		</author>
		<title level="m">Influence: Science and practice</title>
		<imprint>
			<publisher>Pearson Education</publisher>
			<pubPlace>Boston</pubPlace>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">4</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A theory of social comparison processes</title>
		<author>
			<persName><forename type="first">L</forename><surname>Festinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Human Relations</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="117" to="140" />
			<date type="published" when="1954">1954</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Attitudes in the social context: the impact of social network composition on individual-level attitude strength</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Visser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Mirabile</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Personality and Social Psychology</title>
		<imprint>
			<biblScope unit="volume">87</biblScope>
			<biblScope unit="page">779</biblScope>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Readability formulas: An overview</title>
		<author>
			<persName><forename type="first">C</forename><surname>Tekfi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Documentation</title>
		<imprint>
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Forming judgments of attitude certainty, importance, and intensity: The role of subjective experiences</title>
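		<author>
			<persName><forename type="first">G</forename><surname>Haddock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Rothman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Reber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Schwarz</surname></persName>
		</author>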
	</analytic>
	<monogr>
		<title level="j">Personality and Social Psychology Bulletin</title>
		<imprint>
			<biblScope unit="page" from="771" to="782" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Fluency and the detection of misleading questions: Low processing fluency attenuates the Moses illusion</title>
		<author>
			<persName><forename type="first">H</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Schwarz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Social Cognition</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page">791</biblScope>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">The fingerprints of misinformation: how deceptive content differs from reliable sources in terms of cognitive effort and appeal to emotions</title>
		<author>
			<persName><forename type="first">C</forename><surname>Carrasco-Farré</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Humanities and Social Sciences Communications</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Derivation of new readability formulas (automated readability index, fog count and Flesch reading ease formula) for Navy enlisted personnel</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Kincaid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">P</forename><surname>Fishburne</surname><genName>Jr</genName></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L</forename><surname>Rogers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">S</forename><surname>Chissom</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1975">1975</date>
		</imprint>
		<respStmt>
			<orgName>Naval Technical Training Command Millington TN Research Branch</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Identifying linguistic cues of fake news associated with cognitive and affective processing: Evidence from NeuroIS</title>
		<author>
			<persName><forename type="first">B</forename><surname>Lutz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Adam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feuerriegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Pröllochs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Neumann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NeuroIS Retreat</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="16" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Motivational and emotional controls of cognition</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Simon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological Review</title>
		<imprint>
			<biblScope unit="volume">74</biblScope>
			<biblScope unit="page">29</biblScope>
			<date type="published" when="1967">1967</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news</title>
		<author>
			<persName><forename type="first">B</forename><surname>Horne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Adali</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="759" to="766" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Measuring the impact of readability features in fake news detection</title>
		<author>
			<persName><forename type="first">R</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Pedro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Leal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Vale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pardo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bontcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Scarton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 12th Language Resources and Evaluation Conference</title>
				<meeting>the 12th Language Resources and Evaluation Conference</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Aly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schlichtkrull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Thorne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vlachos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Christodoulopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Cocarascu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mittal</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2106.05707</idno>
		<title level="m">FEVEROUS: Fact extraction and verification over unstructured and structured information</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Kotonya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Toni</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2010.09926</idno>
		<title level="m">Explainable automated fact-checking for public health claims</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1705.00648</idno>
		<title level="m">&quot;Liar, liar pants on fire&quot;: A new benchmark dataset for fake news detection</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F</forename><surname>Flesch</surname></persName>
		</author>
		<title level="m">The art of readable writing</title>
				<imprint>
			<date type="published" when="1949">1949</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">The fog index after twenty years</title>
		<author>
			<persName><forename type="first">R</forename><surname>Gunning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Business Communication</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="3" to="13" />
			<date type="published" when="1969">1969</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Automated readability index</title>
		<author>
			<persName><forename type="first">R</forename><surname>Senter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Smith</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1967">1967</date>
			<pubPlace>Cincinnati Univ OH</pubPlace>
		</imprint>
	</monogr>
	<note type="report_type">Technical Report</note>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Readability revisited: The new Dale-Chall readability formula</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Chall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Dale</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1995">1995</date>
			<publisher>Brookline Books</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">A new readability formula for primary-grade reading materials</title>
		<author>
			<persName><forename type="first">G</forename><surname>Spache</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Elementary School Journal</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="page" from="410" to="413" />
			<date type="published" when="1953">1953</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
