Credibility and Transparency of News Sources: Data Collection and Feature Analysis

Ahmet Aker
University of Duisburg-Essen, Duisburg, Germany and University of Sheffield, Sheffield, England
aker@is.inf.uni-due.de

Vincentius Kevin
University of Duisburg-Essen, Duisburg, Germany
vincentius.kevin@stud.uni-due.de

Kalina Bontcheva
University of Sheffield, Sheffield, England
k.bontcheva@sheffield.ac.uk

Copyright © 2019 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. In: A. Aker, D. Albakour, A. Barrón-Cedeño, S. Dori-Hacohen, M. Martinez, J. Stray, S. Tippmann (eds.): Proceedings of the NewsIR'19 Workshop at SIGIR, Paris, France, 25-July-2019, published at http://ceur-ws.org

Abstract

The ability to discern news sources based on their credibility and transparency is useful for users in making decisions about news consumption. In this paper, we release a dataset of 673 sources with manually assigned credibility and transparency scores. Upon acceptance we will make this dataset publicly available. Furthermore, we compare features which can be computed automatically and measure their correlation with the credibility and transparency scores annotated by human experts. Our correlation analysis shows that there are indeed features which correlate highly with the manual judgments.

1 Introduction

The Web has never been as big as it is now. It contains a tremendous amount of information represented in the form of articles, videos, images, blog and social media posts, and many other entries. One of the reasons for this massive growth is that the Web is no longer shaped only by a few experts, professionals, or institutions, but by everyone who has access. Although this new style of contribution to web content has led to immense information richness and diverse views, it has also brought new challenges. It has stripped traditional information providers, such as news media, of their gate-keeping role [1] and has left the public in a jungle of web content of varying quality, ranging from reliable, true information to misinformation, i.e. claims that are not true.

Misinformation is often used interchangeably with the term fake news. Douglas et al. refer to fake news as the "deliberate publication of fictitious information, hoaxes and propaganda" [7], and it is defined similarly by others [11]. Furthermore, it has been reported that the veracity of information is highly connected to the publisher, i.e. the source of the information [6, 4]. Thus, instead of judging individual articles, as done in [12, 8, 14], there are services that assess the sources publishing online news. NewsGuard (www.newsguardtech.com) is one such service. NewsGuard manually analyses each news-publishing source in terms of credibility and transparency and provides detailed information such as references and reasoning, as well as the persons accountable for each analysis. The results are made available to the public via a browser plugin.

In this paper we use NewsGuard to manually collect analysis results for 673 news sources. For each news source we record not only the overall credibility and transparency scores but also the detailed information that led to those overall decisions. We plan to make this dataset freely available (https://github.com/ahmetaker/sourceCredibility). Next, we collect a rich set of well-known metrics/features used by, e.g., search engines to assess the popularity of a website and run a correlation analysis between these features and the manually assigned NewsGuard scores. Our analysis shows that there are features which correlate highly with the NewsGuard scores. This suggests that the manual process performed by NewsGuard could be automated.
2 Data Collection

2.1 NewsGuard: Credibility and Transparency Scores

NewsGuard's team has manually reviewed thousands of news agencies, mostly based in the US, and labelled them against nine criteria. A news agency is awarded credibility and transparency points for each criterion it fulfils. The criteria are listed below.

Credibility criteria:

• Does not repeatedly publish false content (22 points)
• Gathers and presents information responsibly (18 points)
• Regularly corrects or clarifies errors (12.5 points)
• Handles the difference between news and opinion responsibly (12.5 points)
• Avoids deceptive headlines (10 points)

Transparency criteria:

• Website discloses ownership and financing (7.5 points)
• Clearly labels advertising (7.5 points)
• Reveals who's in charge (5 points)
• The site provides the names of content creators, along with either contact information or biographical information (5 points)

The credibility and transparency points sum to 100 at maximum, and a news website is considered "safe" if it has at least 60 points.
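To make the weighting concrete, the following is a minimal sketch of how the nine criterion weights above combine into credibility, transparency and overall scores, and how the 60-point "safe" threshold applies. The criterion keys, function name and example are our own illustration, not NewsGuard's implementation.

```python
# Illustrative sketch of NewsGuard-style scoring; only the weights and the
# 60-point "safe" threshold are taken from the criteria listed above.

CREDIBILITY_WEIGHTS = {
    "no_false_content": 22.0,
    "responsible_reporting": 18.0,
    "corrects_errors": 12.5,
    "news_vs_opinion": 12.5,
    "no_deceptive_headlines": 10.0,
}  # sums to 75

TRANSPARENCY_WEIGHTS = {
    "discloses_ownership": 7.5,
    "labels_advertising": 7.5,
    "reveals_whos_in_charge": 5.0,
    "names_content_creators": 5.0,
}  # sums to 25


def score_source(fulfilled_criteria: set) -> dict:
    """Sum the weights of the fulfilled criteria and report the breakdown."""
    credibility = sum(w for c, w in CREDIBILITY_WEIGHTS.items() if c in fulfilled_criteria)
    transparency = sum(w for c, w in TRANSPARENCY_WEIGHTS.items() if c in fulfilled_criteria)
    total = credibility + transparency
    return {"credibility": credibility, "transparency": transparency,
            "total": total, "safe": total >= 60}


# Example: a source that fulfils everything except error correction and ad labelling.
all_criteria = set(CREDIBILITY_WEIGHTS) | set(TRANSPARENCY_WEIGHTS)
print(score_source(all_criteria - {"corrects_errors", "labels_advertising"}))
# {'credibility': 62.5, 'transparency': 17.5, 'total': 80.0, 'safe': True}
```

A source fulfilling every criterion scores 100 (75 credibility + 25 transparency); the example above drops error correction and advertising labelling and still stays above the 60-point "safe" threshold.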
2.2 News Sources

The list of news sources we used was taken from Media Bias Fact Check (MBFC). MBFC aims to categorize sources by political bias. The categories are as follows, with some descriptions (partially) quoted from their website (mediabiasfactcheck.com).

• Left/Right: "moderately to strongly biased toward" liberal/conservative causes, may be untrustworthy.
• Left-Center/Right-Center: slight to moderate bias toward liberal/conservative causes.
• Center (Least Biased): minimal bias, most credible media sources.
• Pro-Science: "These sources consist of legitimate science or are evidence based through the use of credible scientific sourcing. ..."
• Conspiracy-Pseudoscience: "Sources in the Conspiracy-Pseudoscience category may publish unverifiable information that is not always supported by evidence. ..."
• Questionable Sources: "extreme bias, consistent promotion of propaganda/conspiracies, poor or no sourcing to credible information, a complete lack of transparency and/or is fake news."
• Satire: "... humor, irony, exaggeration, or ridicule to expose and criticize people's stupidity or vices, ... these sources are clear that they are satire and do not attempt to deceive"
• Re-Evaluated Sources: sources which have been updated by MBFC. They are duplicates, so this category is removed from our analysis.

We used the sources from MBFC (2714 in total) as input to NewsGuard (see the next section).

2.3 Collection Procedure

To collect NewsGuard judgments on the sources taken from MBFC we performed a manual process. We installed the NewsGuard browser plugin and visited each MBFC source. The results shown by the plugin were recorded. For instance, for BBC.com, NewsGuard lists the results shown in Figure 1. For this source we recorded the values of the individual labels as well as the overall NewsGuard score (in this case 95). If the results were unavailable because NewsGuard had not analysed the source, the news source was discarded.

[Figure 1: NewsGuard on bbc.com]

We performed this procedure for all 2714 news sources available in the nine categories at the time. NewsGuard scores were available for only 673 of them. Most of the sources in the "Satire" category were unavailable.

The scores were found to agree with MBFC's description of each category: in general, least biased and pro-science sources are the most credible ones, while extremely biased and conspiracy/pseudoscience sources can be unreliable. Table 1 shows the average score and standard deviation per category. The counts show how many sources are available on NewsGuard out of all that were listed in MBFC.

category      | count     | µ(score) | σ(score) | cred. | tran.
Left          | 85 / 316  | 77.16    | 22.25    | 57.81 | 19.35
Left Center   | 185 / 466 | 94.32    | 8.11     | 72.58 | 21.74
Center        | 122 / 404 | 94.29    | 8.29     | 72.20 | 22.09
Right Center  | 76 / 224  | 92.01    | 15.00    | 70.03 | 21.97
Right         | 60 / 269  | 61.27    | 26.82    | 46.02 | 15.25
Pro Science   | 27 / 139  | 93.89    | 7.51     | 72.22 | 21.67
Conspiracy    | 39 / 287  | 30.09    | 27.76    | 16.88 | 13.21
Fake News     | 76 / 478  | 23.55    | 17.33    | 12.93 | 10.46
Satire*       | 3 / 131   | 5.00     | 4.33     | 0.00  | 5.00

Table 1: NewsGuard score per source category and the breakdown into credibility (max. 75) and transparency (max. 25). The count shows how many news sources are available in NewsGuard out of all sources listed in MBFC. *The satire category is not representative as it has only 3 NewsGuard scores.
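Assuming one record per analysed source, holding its MBFC category, overall score and the credibility/transparency breakdown, the per-category statistics in Table 1 can be reproduced with a simple aggregation. The column names and the placeholder rows in the sketch below are ours, not part of the released data.

```python
import pandas as pd

# Placeholder rows standing in for the recorded judgments: one row per source
# that NewsGuard had analysed, with the MBFC category it was listed under.
df = pd.DataFrame({
    "source":       ["example-a.com", "example-b.com", "example-c.com", "example-d.com"],
    "category":     ["Center", "Center", "Right", "Conspiracy"],
    "score":        [95.0, 92.5, 60.0, 20.0],
    "credibility":  [75.0, 70.0, 47.5, 12.5],
    "transparency": [20.0, 22.5, 12.5, 7.5],
})

# Per-category count, mean and standard deviation of the overall score, plus the
# average credibility and transparency, i.e. the quantities reported in Table 1.
table1 = (
    df.groupby("category")
      .agg(count=("score", "size"),
           mean_score=("score", "mean"),
           std_score=("score", "std"),
           credibility=("credibility", "mean"),
           transparency=("transparency", "mean"))
      .round(2)
)
print(table1)
```

Under this layout, running the aggregation over the full set of 673 judged sources yields the rows of Table 1.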
3 Correlation Analysis

In the correlation analysis the automatic features are compared to the manually annotated credibility and transparency scores in order to analyze the correlation and predictive power of the features. Specifically, we calculated the correlation between each automatic feature and the combined score (3 × credibility + transparency) derived from NewsGuard (https://www.newsguardtech.com/ratings/rating-process-criteria/). In the following we outline the features we selected as well as the metrics used to perform the correlation analysis.

3.1 Automatic Features

3.1.1 CheckPageRank

CheckPageRank (checkpagerank.net), abbreviated cPR, provides a free online tool which reports a page rank score, the Alexa rank, and a few other domain analysis results for any given website. The tool does not provide any exact definition of, or information on, how the scores are calculated. However, cPR provides scores which appear to be taken from non-free services such as the Moz SEO and Majestic SEO tools. While those tools limit free usage to ten queries per month and a few queries per day, respectively (as of 2019), cPR allows one query every thirty seconds, although it does not provide the full information available in the other tools.

Below is the most likely explanation we found for each feature provided by cPR, either because the feature name is self-explanatory or because the presumed underlying services give exact or very close scores compared to what is displayed by cPR.

• Google Page Rank: a score from 0 to 10 which estimates the importance of the website based on the quantity and quality of links to it from other websites.

• cPR Score: this is shown visually as one of the most important scores on checkpagerank.net, albeit without any given definition. We presume that 'cPR' simply stands for 'checkPageRank' and that the cPR score is calculated with a proprietary formula or algorithm.

• Citation Flow and Trust Flow: these two scores most probably come from Majestic (majestic.com), an SEO (Search Engine Optimization) tool. According to Majestic's glossary (https://majestic.com/help/glossary), citation flow focuses on the quantity and influential power of links to the website, while trust flow focuses on links from manually reviewed trusted sites. Majestic appears to have crawled over 600 billion URLs by 2014 [13].

• Topic Value: this score also most likely comes from Majestic. Majestic provides a "Topical Trust Flow" score which, according to their glossary, "shows the relative influence [...] in any given topic or category." A likely explanation is that cPR shows only the topic for which the website has the best Topical Trust Flow, since the topic names and value range are exactly the same in cPR and Majestic.

• Backlinks: external backlinks are links from other websites to the subject website. This excludes internal links, which usually exist to let users navigate within the same website.

• Referring domains: the number of domains which contain backlink(s) to the subject website.

• EDU and GOV backlinks and domains: Majestic also provides the counts of educational and governmental backlinks and domains.

• Domain Authority and Page Authority: the Moz (moz.com) SEO tool describes these scores as "the ranking potential in search engines based on an algorithmic combination of all link metrics". While MozRank is not used directly by search engines, it is similar and correlated to the rankings of major search engines [16]. We tested a few websites and confirmed that cPR shows exactly the same scores as Moz.

• Spam Score: this most likely represents the Moz SEO spam flags explained on their website (https://moz.com/blog/spam-score-mozs-new-metric-to-measure-penalization-risk). The flags represent internal and external features of websites that are indicative of 'spam websites' and that have been found to be penalized or banned by Google.

• Alexa Rank: Alexa Rank is described as a popularity measure which "is calculated using a proprietary methodology that combines a site's estimated traffic and visitor engagement over the past three months" (blog.alexa.com).

• Alexa Reach Rank: this score is based specifically on the estimated number of people each website is able to reach.

• Indexed URLs: this may be the number of URLs indexed by Google, as is commonly provided in SEO tools, but since no information is provided, this is only a guess.

3.1.2 Twitter

• Number of followers: the number of users on twitter.com who "subscribe" to the news source's Twitter account. Posts made on Twitter will appear on the followers' home screens.

• Listed count: a Twitter user can make lists of users to personally categorize other users. They can keep a list private or publicly visible. The listed count represents the number of public lists in which the Twitter user appears.

3.1.3 Facebook

• Page Likes: the number of Facebook users who like the Facebook page of the news source, by simply clicking on the like button. Like information is publicly available.

• Page Followers: the number of Facebook users who follow the page, which means any posts by the page will be shown in the users' home feeds. By default, when someone likes a page, they automatically follow the page as well. The user can then "unfollow" while still keeping the "like". It is also possible to follow a page without liking it.
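To make the feature set concrete, the sketch below shows one possible per-source record covering a subset of the features above, together with the combined target score (3 × credibility + transparency) introduced at the start of this section. The field names and example values are illustrative placeholders, not part of the released data.

```python
from dataclasses import dataclass


@dataclass
class SourceFeatures:
    """A subset of the automatically collected features for one news source
    (field names are our own shorthand, values below are placeholders)."""
    ext_backlinks: int        # cPR / Majestic link counts
    ref_domains: int
    gov_backlinks: int
    edu_backlinks: int
    trust_flow: float
    citation_flow: float
    domain_authority: float
    alexa_rank: int
    twitter_followers: int    # Twitter
    twitter_listed: int
    facebook_likes: int       # Facebook
    facebook_follows: int


def combined_target(credibility: float, transparency: float) -> float:
    """Combined NewsGuard-based score used as the correlation target."""
    return 3 * credibility + transparency


example = SourceFeatures(ext_backlinks=1_200_000, ref_domains=45_000,
                         gov_backlinks=3_000, edu_backlinks=8_000,
                         trust_flow=62.0, citation_flow=55.0,
                         domain_authority=78.0, alexa_rank=1_500,
                         twitter_followers=250_000, twitter_listed=4_200,
                         facebook_likes=300_000, facebook_follows=310_000)
print(combined_target(credibility=70.0, transparency=22.5))  # -> 232.5
```

In the analysis, each of these fields is correlated individually against the combined target.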
3.2 Pearson Correlation with Logarithmic Transformation

First, we measured the Pearson correlation [3]. Pearson only measures linear relationships; if there is no such relationship, Pearson is not a good choice for computing the correlation. One way of overcoming this limitation is to convert the data to a logarithmic scale. Therefore, we also applied a logarithm (base 10) to the features before calculating the Pearson correlation (adding one to avoid the undefined logarithm of zero), in order to capture correlations which follow a power law rather than a linear relationship.

We expected features such as backlink counts and the number of likes in social media to follow a power law, under the assumption that website links and user networks in social media follow the pattern of a scale-free network (preferential attachment) [2].

We also expected the behavior of ranking features (e.g. Alexa Rank) to be non-linear. Although it is not necessarily logarithmic, the ratio between ranks is a better measure than the rank difference. By applying a logarithm kernel, only the ratio is considered, i.e. the difference between ranks 10 and 20 is treated as being as significant as the difference between ranks 1,000 and 2,000.

3.3 Spearman and Kendall Tau Correlations

Since the Pearson correlation only measures linear correlation, we also computed the Spearman and Kendall Tau correlation scores. These may give better insight into which variables are more predictive of news source quality.

Both Spearman [15] and Kendall Tau [9] are rank-based correlation measures, thus they work well on monotonic relationships. Spearman does not handle tied ranks, which occur very often in our dataset due to NewsGuard's scoring method. Therefore, Kendall Tau seems to be the better measure and has been used to sort the rows in Table 2. We used the tau-b implementation available in scipy (https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.kendalltau.html).
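The correlation computation can be sketched as follows, assuming each feature is available as a numeric array aligned with the target scores. Only the log10(x + 1) transform and the choice of scipy's tau-b implementation come from the description above; the function, variable names and the toy data are illustrative.

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr


def correlations(feature: np.ndarray, target: np.ndarray) -> dict:
    """Pearson on the raw and log-transformed feature, plus Spearman and
    Kendall tau-b (scipy's default variant) against the target score."""
    log_feature = np.log10(feature + 1)  # "add one" avoids log10(0)
    return {
        "pearson_linear": pearsonr(feature, target),
        "pearson_log": pearsonr(log_feature, target),
        "spearman": spearmanr(feature, target),
        "kendall_tau_b": kendalltau(feature, target),
    }


# Toy example: a heavy-tailed count feature and a synthetic target score.
rng = np.random.default_rng(0)
backlinks = rng.pareto(a=2.0, size=200) * 1_000
target = 3 * np.log10(backlinks + 1) + rng.normal(0, 1, size=200)
print(correlations(backlinks, target))
```

Each call returns both the coefficient and the corresponding p-value; the p-values are what the significance threshold in the next section is applied to.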
4 Correlation Results

Table 2 shows the correlation scores (Pearson, Spearman, Kendall tau) between each feature and the total score from NewsGuard. Correlations with a p-value ≥ 0.00069 (using Bonferroni correction and counting both Pearson tests as one) are considered statistically non-significant.

Feature           | Pearson (linear) | Pearson (log) | spear. | kend.
GOV Backlinks     |  0.031 |  0.698 |  0.656 |  0.499
GOV Domains       |  0.201 |  0.698 |  0.627 |  0.473
EDU Backlinks     |  0.029 |  0.723 |  0.612 |  0.454
EDU Domains       |  0.305 |  0.723 |  0.556 |  0.408
Trust Metric*     |  0.614 |  0.662 |  0.542 |  0.399
Trust Flow*       |  0.614 |  0.662 |  0.542 |  0.399
Indexed URLs      |  0.019 |  0.584 |  0.537 |  0.396
Topic Value*      |  0.589 |  0.641 |  0.528 |  0.387
Ref. Domains      |  0.227 |  0.622 |  0.508 |  0.367
Google PageRank   |  0.581 |  0.575 |  0.448 |  0.354
Citation Flow*    |  0.523 |  0.538 |  0.449 |  0.327
Domain Authority  |  0.603 |  0.588 |  0.445 |  0.325
cPR Score         |  0.589 |  0.584 |  0.445 |  0.323
Ext. Backlinks    |  0.073 |  0.567 |  0.449 |  0.322
Page Authority*   |  0.521 |  0.524 |  0.397 |  0.284
Global Rank       | -0.338 | -0.427 | -0.323 | -0.232
Alexa Reach       | -0.327 | -0.414 | -0.313 | -0.224
Alexa USA*        | -0.379 | -0.360 | -0.276 | -0.197
Facebook Likes    | -0.076 | -0.149 | -0.229 | -0.163
Twitter Listed    |  0.131 |  0.388 |  0.231 |  0.162
Twitter Followers |  0.098 |  0.327 |  0.228 |  0.161
Facebook Follows  | -0.073 | -0.147 | -0.225 | -0.160
Spam Score        | -0.051 |  0.025 |  0.038 |  0.032

Table 2: Feature correlation with the NewsGuard score: Pearson (on raw and log-transformed features), Spearman and Kendall tau-b coefficients.

As expected, applying the logarithmic transformation yields large improvements in the Pearson correlation scores. Six features (marked with a star) did not meet our expectation as to whether the logarithm kernel would improve the linear correlation, although the differences in these cases are relatively small (< 0.05).

Many of the automatically retrievable features have a significant correlation with the NewsGuard scores. Notably, backlinks and referring domains, especially from government and educational websites, are very good indicators of trustworthy sources. Trust Metric and Trust Flow also work very well, confirming that seeded network graphs can be useful in practice.

One unexpected result is the negative correlation between Facebook likes/follows and the NewsGuard scores. This may be caused by the availability of paid "like farms" that sell fake likes on the platform, such as BoostLikes and SocialFormula. Even legitimate Facebook ad campaigns can result in significant amounts of such fake likes [5]. Confirming this, however, requires further analysis of the corresponding Facebook pages.

One should note that, since the dataset comes from NewsGuard, it is possible that unpopular news sources are under-represented.
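For completeness, the sketch below shows how a Bonferroni-corrected per-test threshold is obtained. The test count of 72 is illustrative (0.05 / 72 ≈ 0.00069, matching the cut-off used above) rather than a statement of the exact number of tests run.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Illustrative: 72 tests at alpha = 0.05 give roughly the 0.00069 cut-off
# applied to the correlations in Table 2.
threshold = bonferroni_threshold(alpha=0.05, n_tests=72)
print(round(threshold, 5))  # 0.00069

# A correlation is then treated as significant only if its p-value falls below it.
print(0.0001 < threshold)  # True
```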
5 Conclusion

In this paper, we release a dataset of 673 sources with manually assigned credibility and transparency scores. The scores come from NewsGuard's plugin. We manually accessed the plugin for 2714 news sources listed by Media Bias Fact Check and recorded, for the 673 covered sources, the detailed credibility and transparency scores NewsGuard provides. For the remaining 2041 sources NewsGuard did not have judgments.

We also extracted a rich set of features and performed a correlation analysis. Our results show that there are strong correlations between the NewsGuard scores and the features analysed in this work. This indicates that the credibility and transparency scoring could be automated.

In future work we aim to perform this step and create a regression model to automatically predict the credibility and transparency scores. This will allow us to obtain credibility scores for any source that has so far not been judged by NewsGuard. Note that since our features are language independent, this will allow us to obtain credibility scores for sources reporting in any language. We also plan to use the output of our regression models as an information nutrition label within NewsScan (www.news-scan.com) [10].

Acknowledgements

This work was partially supported by the European Union under grant agreement No. 825297 WeVerify (http://weverify.eu) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - GRK 2167, Research Training Group "User-Centred Social Media".

References

[1] Baly, R., Karadzhov, G., Alexandrov, D., Glass, J., and Nakov, P. Predicting factuality of reporting and bias of news media sources. arXiv preprint arXiv:1810.01765 (2018).

[2] Barabási, A.-L., and Pósfai, M. Network science. Cambridge University Press, Cambridge, 2016.

[3] Benesty, J., Chen, J., Huang, Y., and Cohen, I. Pearson correlation coefficient. In Noise reduction in speech processing. Springer, 2009, pp. 1–4.

[4] Burgoon, J. K., and Hale, J. L. The fundamental topoi of relational communication. Communication Monographs 51, 3 (1984), 193–214.

[5] De Cristofaro, E., Friedman, A., Jourjon, G., Kaafar, M. A., and Shafiq, M. Z. Paying for likes?: Understanding Facebook like fraud using honeypots. In Proceedings of the 2014 Conference on Internet Measurement Conference (New York, NY, USA, 2014), IMC '14, ACM, pp. 129–136.

[6] Demchenko, Y., Grosso, P., De Laat, C., and Membrey, P. Addressing big data issues in scientific data infrastructure. In Collaboration Technologies and Systems (CTS), 2013 International Conference on (2013), IEEE, pp. 48–55.

[7] Douglas, K., Ang, C. S., and Deravi, F. Farewell to truth? Conspiracy theories and fake news on social media. The Psychologist (2017).

[8] Hardalov, M., Koychev, I., and Nakov, P. In search of credible news. In International Conference on Artificial Intelligence: Methodology, Systems, and Applications (2016), Springer, pp. 172–180.

[9] Kendall, M. G. The treatment of ties in ranking problems. Biometrika 33, 3 (1945), 239–251.

[10] Kevin, V., Högden, B., Schwenger, C., Sahan, A., Madan, N., Aggarwal, P., Bangaru, A., Muradov, F., and Aker, A. Information nutrition labels: A plugin for online news evaluation. In Proceedings of the First Workshop on Fact Extraction and VERification (FEVER) (Brussels, Belgium, Nov. 2018), Association for Computational Linguistics, pp. 28–33.

[11] Klein, D. O., and Wueller, J. R. Fake news: A legal perspective. Journal of Internet Law 20, 10 (2017), 6–13.

[12] Markowitz, D. M., and Hancock, J. T. Linguistic traces of a scientific fraud: The case of Diederik Stapel. PLoS ONE 9, 8 (2014), e105937.

[13] Sud, P., and Thelwall, M. Linked title mentions: A new automated link search candidate. Scientometrics 101 (2014), 1831–1849.

[14] Rashkin, H., Choi, E., Jang, J. Y., Volkova, S., and Choi, Y. Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (2017), pp. 2931–2937.

[15] Spearman, C. The proof and measurement of association between two things. The American Journal of Psychology 15, 1 (1904), 72–101.

[16] Mavridis, T., and Symeonidis, A. L. Identifying valid search engine ranking factors in a Web 2.0 and Web 3.0 context for building efficient SEO mechanisms. Engineering Applications of Artificial Intelligence 41 (2015), 75–91.