             Multidimensional News Quality:
      A Comparison of Crowdsourcing and Nichesourcing

        Eddy Maddalena                                    Davide Ceolin                       Stefano Mizzaro
    University of Southampton                    Centrum Wiskunde & Informatica              University of Udine
          Southampton                                      Amsterdam                               Udine
        United Kingdom                                   The Netherlands                            Italy
    E.Maddalena@soton.ac.uk                            davide.ceolin@cwi.nl                   mizzaro@uniud.it

                          Abstract

In the age of fake news and of filter bubbles, assessing the quality of information is a compelling issue: it is important for users to understand the quality of the information they consume online. We report on our experiment aimed at understanding if workers from the crowd can be a suitable alternative to experts for information quality assessment. Results show that the data collected by crowdsourcing seem reliable. The agreement with the experts is not full, but in a task that is so complex and related to the assessor's background, this is expected and, to some extent, positive.

1    Introduction and Background

Online information is used by a variety of stakeholders as a basis for decision making, knowledge discovery, studies, and many more activities. However, as a consequence of the democratic nature of the Web, such information shows an extremely diverse level of quality. Making this level of quality explicit for each information item is crucial to allow the stakeholders an adequate overall perusal of the information. Given its pervasiveness and influence on public opinion, online news is a kind of information whose quality assessment is particularly critical for countering the spread of misinformation and disinformation.

Assessing the quality of online news, and of information in general, is a challenging task because of its intrinsic complexity. Information quality can be assessed by considering diverse points of view; how they can be assessed, and how the assessment results should be combined, depends on the assessors and on their requirements. This calls for a combined approach, where automated computation is required to handle the huge amount of information available on the Web, while human computation is required to understand how the quality dimensions are assessed and combined. An important aspect of human computation in this context is its regularity: when human assessments are consistent enough, automated computation can leverage them to scale the computation up.

In a previous work by Ceolin, Noordegraaf, and Aroyo [CNA16], two user studies were performed to collect quality assessments regarding Web documents on the vaccination debate. Assessments were collected by means of a Web application, in a scenario similar to crowdsourcing, with the only difference that the assessments were expressed by a few experts (media scholars and journalism students) rather than a large crowd of anonymous workers. This approach has been named nichesourcing [Boe+12]. Ceolin, Noordegraaf, and Aroyo noted that, when the task at hand is constrained, experts who show a similar background tend to significantly agree with each other. However, they also noted that the task of deeply assessing online information is rather demanding, and expert availability is limited. Crowdsourcing could be a solution to the limited availability of human assessors.

In this paper, we repeat that study [CNA16] through crowdsourcing to analyse similarities and differences between the two ways of collecting human assessments.
Section 5 concludes the paper.

2     Related Work

In the age of fake news [Laz+18; VRA18] and of the filter bubble [Par11], assessing the quality of information is a compelling issue: it is important for users to understand the quality of the information they consume online. Two important initiatives worth mentioning in this field are the W3C Credible Web Community Group (https://credweb.org/) and the Credibility Coalition (http://credibilitycoalition.org). While the first is meant to establish standards to model and share data about the credibility of information online, the second aims at identifying markers and strategies for establishing the credibility of the same information. In this respect, the work we present in this paper is complementary to these initiatives, as it aims at providing gold standards to reason on the credibility (and, more broadly, quality) of online information.

3     Experimental Setup

3.1   Dataset Description

We ran our experiment on a sample from the vaccination debate dataset provided by the QuPiD project (http://qupid-project.net) and used by Ceolin, Noordegraaf, and Aroyo [CNA16]. In 2015, a measles outbreak took place at Disneyland, California. The outbreak triggered a fierce debate that fuelled the already heated discussions regarding vaccinations, where pro- and anti-vaccination individuals blamed each other for the event. The vaccination debate dataset collects a number of documents regarding that specific debate. While the dataset is limited in size (about 50 documents), it is rather diverse in terms of types of documents represented (newspaper articles, activist blog posts, etc.) and stances (pro, anti, neutral).

3.2   The Crowdsourcing Task

The crowdsourcing task we ran aimed at collecting laymen judgments concerning the quality of a subset of 20 articles assessed by the experts (media scholars and journalism students). We asked each worker to assess one document along eight different quality dimensions derived from Ceolin, Noordegraaf, and Aroyo [CNA16] (we slightly reformulated some of them to have a shorter description, more adequate for crowd workers):
1. Accuracy - How accurate is the information in this article?
2. Neutrality - Is the document neutral with respect to the topic addressed, or does it take a clear stance (e.g., pro, against)?
3. Readability - Does the document read well?
4. Precision - How precise is the information in this document (as opposed to vague)?
5. Completeness - How complete is the information in this document?
6. Trustworthiness - How trustworthy is the source? Is the source trustworthy or does it exhibit malicious intentions?
7. Relevance - How relevant is the article to the task?
8. Overall quality - What is your general opinion about the quality of the article?
We also asked two further questions requiring the workers' personal opinion, to understand how personal belief affects quality judgment:
9. Your personal opinion - Do you agree with the document content?
10. Your confidence - How knowledgeable/expert are you about the topic?
All 10 assessments were collected on a 5-star Likert scale, as in the original experiment [CNA16]. For each quality dimension, we also asked the users to motivate their judgment with some free text.

The task ran on the Figure Eight (https://www.figure-eight.com/) crowdsourcing platform by selecting level-three workers, who are the highest-accuracy contributors. Each worker was paid 0.2 USD and could not judge more than three articles. Besides redundancy (each article was judged by 10 workers), we also adopted some standard quality checks: each worker was shown a pair of articles of clearly low and high quality, and the work was rejected if the collected values were ranked in the wrong way; there was also a time threshold (the worker needed to spend at least 120 seconds on the task), and some syntactic checks on the free text motivations.

3.3   Research Questions

This experiment allows us to address three research questions:
Q1. Relationships between quality dimensions: what are the correlations between the quality dimensions? Do some of the quality dimensions correlate in a way that makes one derivable from another? What is the difference between experts and workers?
Q2. Internal agreement (between individual workers): can different workers agree to a reasonable extent when assessing quality dimensions? Are there differences among the dimensions?
Q3. External agreement (between individual workers and experts): what is the individual external agreement, i.e., the agreement between the individual workers and the experts, on all dimensions? What is the aggregate external agreement,
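
The quality checks described in Section 3.2 (correct ranking of the low/high-quality article pair, a minimum of 120 seconds on the task, and syntactic checks on the free-text motivations) can be illustrated with a minimal sketch. The `Submission` structure, the field names, and the specific syntactic rule are hypothetical, since the text does not specify them; only the 120-second threshold and the ranking criterion come from the description above. How ties on the control pair were handled is not stated, so the sketch assumes they are tolerated.

```python
from dataclasses import dataclass
from typing import List

MIN_SECONDS = 120  # time threshold stated in the task description


@dataclass
class Submission:
    """Hypothetical record of one worker's work (field names are assumptions)."""
    low_quality_score: int    # 1-5 rating given to the clearly low-quality article
    high_quality_score: int   # 1-5 rating given to the clearly high-quality article
    seconds_spent: float      # time the worker spent on the task
    motivations: List[str]    # free-text motivations, one per quality dimension


def motivation_looks_valid(text: str, min_words: int = 3) -> bool:
    """Assumed syntactic check: a few words, not one token repeated."""
    words = text.lower().split()
    return len(words) >= min_words and len(set(words)) > 1


def accept(sub: Submission) -> bool:
    """Return True if the submission passes all three checks."""
    # Reject only when the control pair is ranked the wrong way round
    # (assumption: equal scores are tolerated).
    ranked_ok = sub.low_quality_score <= sub.high_quality_score
    time_ok = sub.seconds_spent >= MIN_SECONDS
    text_ok = all(motivation_looks_valid(m) for m in sub.motivations)
    return ranked_ok and time_ok and text_ok


# Example with made-up values.
example = Submission(2, 5, 150.0, ["The article cites official health sources."])
print(accept(example))
```

Keeping each check as a separate boolean makes it easy to log which criterion caused a rejection, which may help when tuning thresholds for a different task.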
[Figure: two panels of pairwise correlation coefficients, each reported with its p-value and significance stars; caption, panel titles, and axis labels not recoverable.]


                                                         4                                                                                                                                                                                                                                 4
                                                                                                                                    r=.51, p=7.4e-15                                                                                                                                       3
                                                                                                                                                                                                                                                                                                                                                                        r=.77, p=1.2e-07
                                                         2                                                                                      *** =.43, p=2.8e-12
                                                                                                                                     =.47, p=1.2e-14             *** =.46, p=6.3e-14
                                                                                                                                                                                  *** =.56, p=4.6e-20
                                                                                                                                                                                                   ***                                                                                     2                                                                                        *** =.74, p=1.1e-07
                                                                                                                                                                                                                                                                                                                                                                         =.66, p=2.5e-06             *** =.55, p=7.8e-05
                                                                                                                                                                                                                                                                                                                                                                                                                      *** =.69, p=5.1e-07
                                                                                                                                                *** =.5, p=5.1e-14
                                                                                                                                                                 *** =.37, p=1.1e-07
                                                                                                                                                                                  *** =.49, p=1.3e-13
                                                                                                                                                                                                   ***                                                                                     1
                                                                                                                                                                                                                                                                                                                                                                                    *** =.72, p=1.5e-06
                                                                                                                                                                                                                                                                                                                                                                                                     *** =.67, p=1.2e-05
                                                                                                                                                                                                                                                                                                                                                                                                                      *** =.7, p=3.5e-06

                                                                                                                                                                                                                                  Overall Quality Relevance Trustworthiness Completeness
Overall Quality Relevance Trustworthiness Completeness

                                                                                                                                                                 *** r=.43, p=1.2e-10
                                                                                                                                                     r=.5, p=4.6e-14              *** r=.51, p=8.2e-15
                                                                                                                                                                                                   ***                                                                                     4
                                                                                                                                                                                                                                                                                                                                                                                                     *** r=.61, p=1.2e-04
                                                                                                                                                                                                                                                                                                                                                                                         r=.73, p=1.0e-06             *** r=.72, p=2.0e-06
                                                         2                                                                                                       *** =.38, p=3.2e-10
                                                                                                                                                      =.45, p=2.5e-13             *** =.46, p=2.5e-14
                                                                                                                                                                                                   ***                                                                                     2                                                                                                         *** =.52, p=2.6e-04
                                                                                                                                                                                                                                                                                                                                                                                          =.64, p=6.6e-06             *** =.63, p=7.9e-06
                                                                                                                                                                 *** =.27, p=8.5e-05
                                                                                                                                                                                  *** =.46, p=1.2e-11
                                                                                                                                                                                                   ***                                                                                     1
                                                                                                                                                                                                                                                                                                                                                                                                     *** =.6, p=1.8e-04
                                                                                                                                                                                                                                                                                                                                                                                                                      *** =.82, p=3.7e-09
                                                                                                                                                                                  *** r=.49, p=1.1e-13
                                                                                                                                                                      r=.34, p=1.0e-06             ***                                                                                     4
                                                                                                                                                                                                                                                                                                                                                                                                                      *** r=.83, p=1.8e-09
                                                                                                                                                                                                                                                                                                                                                                                                          r=.55, p=7.5e-04             ***
                                                         2                                                                                                                        *** =.45, p=4.1e-13
                                                                                                                                                                       =.3, p=8.5e-07              ***                                                                                     2                                                                                                                          *** =.76, p=5.4e-08
                                                                                                                                                                                                                                                                                                                                                                                                           =.47, p=8.3e-04             ***
                                                                                                                                                                                  *** =.52, p=4.2e-15
                                                                                                                                                                                                   ***                                                                                     1
                                                                                                                                                                                                                                                                                                                                                                                                                      *** =.67, p=1.4e-05
                                                                                                                                                                                       r=.5, p=2.9e-14
                                                                                                                                                                                                                                                                                                                                                                                                                           r=.64, p=4.7e-05
                                                         2                                                                                                                                         ***
                                                                                                                                                                                        =.45, p=1.8e-13                                                                                    2                                                                                                                                           ***
                                                                                                                                                                                                                                                                                                                                                                                                                            =.54, p=1.0e-04
                                                                                                                                                                                                   ***                                                                                     1
                                                         4                                                                                                                                                                                                                                 4
                                                         3                                                                                                                                                                                                                                 3
                                                         2                                                                                                                                                                                                                                 2
                                                         1                                                                                                                                                                                                                                 1
                                                             1   2   3   4   5    1    2   3   4   5   1    2   3   4    5   1    2   3   4   5   1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   1   2   3   4   5                                                                1   2   3   4   5    1    2   3   4   5   1    2   3   4    5   1    2   3   4   5   1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
                                                                 Accuracy             Neutrality           Readability           Precision        Completeness Trustworthiness Relevance Overall Quality                                                                                           Accuracy             Neutrality           Readability           Precision        Completeness Trustworthiness Relevance Overall Quality

Figure 1: Scatterplots and correlations between the                                                                                                                                                                               Figure 2: Scatterplots and correlations between the
dimensions pairs, for raw worker values                                                                                                                                                                                           dimensions pairs, for the experts
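
The correlation values reported in these scatterplot matrices relate pairs of quality dimensions. As a minimal sketch, and not the analysis code used for the paper, the Python below computes Pearson's r for one pair of dimensions, first on raw worker judgments and then on per-article means. The toy data, the field names, and the helpers `pearson_r` and `aggregate_by_article` are hypothetical, and p-values are omitted.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def aggregate_by_article(judgments, dimension):
    """Arithmetic mean of one dimension's scores, grouped by article."""
    by_article = defaultdict(list)
    for j in judgments:
        by_article[j["article"]].append(j[dimension])
    return {a: mean(v) for a, v in sorted(by_article.items())}


# Hypothetical toy data: three articles, two workers each, 1-5 scale.
judgments = [
    {"article": "a1", "Accuracy": 4, "Precision": 5},
    {"article": "a1", "Accuracy": 2, "Precision": 3},
    {"article": "a2", "Accuracy": 1, "Precision": 2},
    {"article": "a2", "Accuracy": 3, "Precision": 2},
    {"article": "a3", "Accuracy": 5, "Precision": 4},
    {"article": "a3", "Accuracy": 5, "Precision": 5},
]

# Correlation on raw values: one point per worker/article pair.
raw_r = pearson_r(
    [j["Accuracy"] for j in judgments],
    [j["Precision"] for j in judgments],
)

# Correlation on aggregated values: one point per article.
acc = aggregate_by_article(judgments, "Accuracy")
pre = aggregate_by_article(judgments, "Precision")
agg_r = pearson_r(list(acc.values()), list(pre.values()))

print(f"raw r = {raw_r:.2f}, aggregated r = {agg_r:.2f}")
```

Aggregating reduces the number of points from one per worker/article pair to one per article, so a higher correlation on aggregated values can still come with a weaker significance level.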
i.e., the agreement between the aggregated assessments by the workers and
the experts, on all dimensions?

4   Results

The main results are grouped on the basis of the research questions.

4.1   Q1: Quality Dimensions Relationships

A first result is presented in Figure 1, which shows a scatterplot matrix.
For each pair of dimensions (indicated on the diagonal), a scatterplot is
shown in the lower triangular part, with some random jitter to reduce
overlap. Each dot in a scatterplot represents one individual worker/article
pair, and its coordinates are the values expressed by the worker on the
corresponding two dimensions. In the upper triangular part, the correlation
values are shown together with their p-values, which measure statistical
significance.
   Figure 2 allows a comparison with the experts' data. Comparing correlation
values, it is clear that experts are more consistent across dimensions;
p-values are roughly similar in the two cases.
   As is common practice in crowdsourcing, in place of using raw values by
individual workers, we compute aggregated values. We select a simple (if not
the simplest) aggregation function: the arithmetic mean. Figure 3 shows the
correlations obtained when aggregating with the mean the 10 values expressed
by 10 workers on the same article. When comparing to Figure 1, one can see
that correlations increase, although they are less statistically significant.
When comparing to Figure 2, one can see that the correlations between
dimensions are usually higher for the experts than for the aggregate workers,
but the values are definitely more comparable than the individual raw values,
and indeed the aggregate workers have higher correlations than the experts in
three cases (the correlation between Accuracy and Relevance, and those
between Overall Quality and both Neutrality and Precision). We also tried
aggregating with the median, obtaining worse results.
   Another remark that can be made by observing the histograms on the
diagonals of Figures 1 and 2 is that the values provided by the experts tend
to follow a more bimodal distribution (they use the extremes of the scale
more) than those of the workers. This is even clearer when looking at the
aggregated values, since the mean of the values pulls them even more towards
the middle of the scale, as can be seen in Figure 3. The distributions also
show that the workers tend to express higher values than the experts.

4.2   Q2: Internal Agreement among Workers

Table 1 shows the agreement among the workers, overall and on each quality
dimension, measured by both Krippendorff’s α [Kri07] and Φ [Che+17]. Both
measures assume values in [−1, +1] (with −1 corresponding to complete
disagreement, 0 to random agreement, and +1 to complete agreement). For Φ the
table also shows, besides the most likely Φ value, the Highest Posterior
Density (HPD) interval, i.e., the interval that contains the actual Φ value
with a 95% probability: these are quite small intervals, so we can be confi-

ferent worker groups, and/or decrease the granularity and ask to evaluate
passages of an article instead of a full article. In this light, we observe
a low correlation (between 0 and 0.20) between the workers' confidence, i.e.,
question number 10, and all the quality dimensions, and a moderate
correlation (about 0.6) between the workers' agreement with the article
assessed, i.e., question 9, and the Precision, Accuracy, and Overall Quality
scores. While this correlation is not complete, it still hints at the
possibility that a subgroup of the workers shows a confirmation bias, meaning
that these tend to
                                                         2                                                                                                                       ** =.35, p=6.1e-02 =.72, p=7.4e-05
                                                                                                                                                                     =.47, p=9.0e-03                               ***
                                                                                                                                                                                 ** =.57, p=1.4e-02 =.75, p=3.5e-04***              judge positively the articles they agree with, and vice-
                                                                                                                                                                                     r=.57, p=1.4e-02* r=.65, p=3.5e-03
                                                                                                                                                                                                                   ***              versa. In this short paper we do not have the space
                                                         2                                                                                                                                          * =.54, p=2.5e-03
                                                                                                                                                                                      =.43, p=2.0e-02               **              to discuss these issues in full, and we leave them for
                                                                                                                                                                                                    * =.49, p=4.0e-02
                                                                                                                                                                                                                    **              future work.
                                                                                                                                                                                                       r=.43, p=7.8e-02*
                                                         2                                                                                                                                                      =.36, p=5.2e-02
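As a minimal sketch of how a correlation of this kind can be computed, the following Python fragment evaluates Pearson's r and Spearman's rank correlation between two lists of scores. The data are hypothetical toy values, not the assessments collected in the experiment, and the function names (pearson, average_ranks, spearman) are introduced here for illustration only; the plain-Python implementation is an assumption made to keep the example self-contained.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson's r between two equally long score lists (nan if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # A constant list has no variance, so the coefficient is undefined.
        return float("nan")
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one entry per assessment, on a 1-5 scale.
agreement = [1, 2, 2, 3, 4, 4, 5, 5]  # answer to the agreement question
overall = [2, 2, 3, 3, 3, 4, 4, 5]    # Overall Quality score

print(f"Pearson r    = {pearson(agreement, overall):.2f}")
print(f"Spearman rho = {spearman(agreement, overall):.2f}")
```

Ties are frequent on a five-point scale, which is why the ranks are averaged before computing the rank correlation. When one of the two lists is constant the coefficient is undefined, and the sketch returns nan rather than raising an error.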
Figure 3: Scatterplots and correlations between the dimension pairs, for aggregated (mean) worker values.

Dimension          α       Φ       HPD interval [2.5, 97.5]
All                0.132   0.084   [0.014, 0.146]
Accuracy           0.057   0.800   [0.747, 0.836]
Neutrality         0.016   0.703   [0.609, 0.778]
Readability        0.012   0.687   [0.500, 0.831]
Precision          0.026   0.807   [0.773, 0.868]
Completeness       0.065   0.876   [0.816, 0.903]
Trustworthiness    0.108   0.904   [0.827, 0.954]
Relevance          0.022   0.739   [0.716, 0.783]
Overall Quality    0.011   0.833   [0.805, 0.852]

Table 1: Agreement among the workers

dent that the most likely Φ value is correct. The α values are quite low, but the Φ values are much higher. Most likely, as we have discussed above, assessment values have quite low variability. In such a case, α exhibits a pathological behavior, which is one of the issues with α that is solved by Φ, as discussed by Checco et al. [Che+17]. The much higher Φ values, together with the narrow HPD intervals, show that the agreement among the workers is consistent even if not complete.

The results presented so far hint that the data collected by our crowdsourcing experiment are reliable. It is also important to remark that, although the workers in some cases fail to exactly replicate the assessments by the experts (as we discuss shortly), the task is quite complex and the assessors' background might play a critical role. In this respect, full agreement might even be a problem rather than a feature. If this is the case, it might be necessary to treat in a different way dif-

4.3   Q3: External Agreement with the Experts

Turning to the agreement between workers and experts, the scatterplots and correlation values in Figure 4 (top row) show that the agreement of the individual workers with the experts is rather low, as correlation values are positive but quite small, and often not significant. Figure 4 (center row) shows the agreement with the experts that is obtained when aggregating the worker values with the mean. Correlation values are systematically higher than for individual workers, although almost never greater than 0.5 and often not statistically significant. As previously observed, the aggregation reduces the range of the values: whereas the experts usually use the full spectrum, the aggregated worker scores cover a narrower range. In all these plots, the eight dimensions show quite similar correlation values, with the exception of Neutrality: workers particularly disagree with the experts about it.

Figure 4 (bottom row) supports the earlier claim that, in general, the median is a worse aggregation function: lower correlation values are obtained for Completeness, Trustworthiness, Relevance, and, especially, Overall Quality (which has no correlation with the experts when using the median). However, Readability and Precision are similar, and Neutrality and, especially, Accuracy are higher. This suggests that different and more sophisticated aggregation functions might lead to a higher agreement with the experts, an issue that, for reasons of space, we leave for future work.

Figure 4: Scatterplots and correlations between experts and: (i) individual workers (top row); (ii) aggregated workers, with mean as aggregation function (center row); and (iii) aggregated workers, with median as aggregation function (bottom row).
Panels in each row (left to right): Accuracy, Neutrality, Readability, Precision, Completeness, Trustworthiness, Relevance, Overall Quality; x-axis: Workers score, y-axis: Expert score, both on a 1 to 5 scale.

5     Conclusions and Future Work

In this paper we present an experiment that aims at comparing crowdsourcing and nichesourcing as methods for assessing the quality of online information from a multidimensional standpoint. We collect 10 assessments about 20 articles from a dataset on the vaccination debate, and we analyze them internally and in comparison to previously published expert assessments. We
observe that workers tend to use higher values than experts, and that aggregated worker values show a higher correlation in three cases (between Accuracy and Relevance, and between Overall Quality and both Neutrality and Precision). When looking at the internal agreement among workers, we note that this is high, but not complete. This might be due to the fact that at least some workers show a confirmation bias, i.e., they tend to rate documents they agree with higher, and vice versa. Lastly, when looking at the agreement between workers and experts, we can see that correlations are positive but low for individual workers and higher, although rarely above 0.5, for aggregated workers, with Neutrality as the dimension on which workers and experts disagree most.

In the future, we plan to extend our dataset to increase the number of assessments, of articles analysed, and of topics covered, to help us generalise our findings. We also plan to extend the depth of our analyses, for example to identify an assessability measure for documents (hinting at how easy it is to assess them), and to identify similar groups of workers with higher internal agreement.

Acknowledgements This study was partially supported by the H2020 project QROWD (grant agreement ID: 732194).

[Che+17] Alessandro Checco, Kevin Roitero, Eddy Maddalena, Stefano Mizzaro, and Gianluca Demartini. “Let’s Agree to Disagree: Fixing Agreement Measures for Crowdsourcing”. In: The 5th AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2017). 2017.

[CNA16] Davide Ceolin, Julia Noordegraaf, and Lora Aroyo. “Capturing the Ineffable: Collecting, Analysing, and Automating Web Document Quality Assessments”. In: Knowledge Engineering and Knowledge Management. Springer International Publishing, 2016, pp. 83–97.

[Kri07] Klaus Krippendorff. “Computing Krippendorff’s alpha reliability”. In: Departmental papers (ASC) (2007), p. 43.

[Laz+18] David M. J. Lazer, Matthew A. Baum, Yochai Benkler, Adam J. Berinsky, Kelly M. Greenhill, Filippo Menczer, Miriam J. Metzger, Brendan Nyhan, Gordon Pennycook, David Rothschild, Michael Schudson, Steven A. Sloman, Cass R. Sunstein, Emily A. Thorson, Duncan J. Watts,
