<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>What Can Crowd Computing Do for the Next Generation of AI Systems?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ujwal Gadiraju</string-name>
          <xref ref-type="aff" rid="aff0"/>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jie Yang</string-name>
          <xref ref-type="aff" rid="aff0"/>
        </contrib>
        <aff id="aff0">
          <institution>Web Information Systems, Delft University of Technology</institution>
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The unprecedented rise in the adoption of artificial intelligence techniques and automation in many contexts is concomitant with shortcomings of such technology with respect to robustness, interpretability, usability, and trustworthiness. Crowd computing offers a viable means to leverage human intelligence at scale for data creation, enrichment, and interpretation, demonstrating great potential to improve the performance of AI systems and increase the adoption of AI in general. Existing research and practice have mainly focused on leveraging crowd computing for training data creation. However, this perspective is rather limiting in terms of how AI can fully benefit from crowd computing. In this vision paper, we identify opportunities in crowd computing to propel better AI technology, and argue that to make such progress, fundamental problems need to be tackled from both computation and interaction standpoints. We discuss important research questions in both these themes, with the aim of shedding light on the research needed to pave a future where humans and AI can work together seamlessly, while benefiting from each other.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Artificial intelligence techniques and machine learning in particular, are drastically changing our
lives through technological revolutions across several domains such as transportation, health, finance,
education, and manufacturing. AI systems at the forefront of such innovations have garnered a
growing barrage of concerns, not only due to issues pertaining to performance – such systems have
been observed to easily fail in situations slightly different from those encountered in the training
instances [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] – but also due to the ethical and societal implications that arise as a result of using these
systems [
        <xref ref-type="bibr" rid="ref3 ref35 ref6 ref7 ref9">3, 6, 7, 9, 35</xref>
        ].
      </p>
      <p>
        Problems exist and manifest both in AI systems and in the interaction between end users and
such systems. On the one hand, machine learning models have been criticized for their lack of
robustness, fairness, and transparency [
        <xref ref-type="bibr" rid="ref14 ref20 ref26">14, 20, 26</xref>
        ]. Such model-related problems can be attributed
to data problems to a large extent: for models to learn comprehensive, fine-grained, and unbiased
patterns, they have to be trained on a large number of high-quality data instances with the right
distribution that is representative of real application scenarios. Creating such data is not only a
long, laborious, and expensive process, but sometimes even impossible when the data is extremely
imbalanced or the distribution constantly evolves over time. On the other hand, AI systems often
demonstrate inconsistent and unpredictable behavior that can confuse users, erode their confidence,
and may eventually lead to the abandonment of the systems [
        <xref ref-type="bibr" rid="ref10 ref2">2, 10</xref>
        ]. Systems with such behavior
violate established usability guidelines of traditional user interface design (e.g., minimizing
unexpected changes), posing an even bigger challenge for the design of intuitive and effective user
interfaces. The problem is further complicated by the variability of interfaces for AI systems, ranging
from the conventional Web-based interfaces to the emerging Voice-based ones. There is a limited
understanding of how users perceive automated decisions and how their behavior is mediated or
influenced by the interfaces.
      </p>
      <p>
        The two classes of challenges pertaining to AI systems, characterised as computational and
interactional, are in fact highly related to each other. From the computational perspective, a better
understanding of user interactions can help identify the focal point of system development and
potentially spark new research directions. A prominent example is machine learning interpretability,
inspired by the observation that explainable results are more in demand by users than highly accurate
ones. From the interaction perspective, more robust and interpretable systems can help build trust
and increase system uptake [
        <xref ref-type="bibr" rid="ref19 ref40">19, 40</xref>
        ]. As AI systems become more commonplace, people must be
able to make sense of their encounters and interpret their interactions with such systems.
A promising approach to address both computational and interactional challenges while building AI
systems is the use of crowd computing, which offers a viable means to engage a large number of
human participants in data related tasks and in user studies.
      </p>
      <p>
        Crowd computing has been conceptualised in various ways – as being related to crowdsourcing,
human computation, social computing, cloud computing and mobile computing [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. Over the last
decade there has been a steady rise in the adoption of crowd computing solutions across a variety of
domains [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In the context of overcoming the computational and interactional challenges facing the
current generation of AI systems, recent work has shown how crowd computing can be leveraged to
debug noisy training data in machine learning systems [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ], understand which machine learning
models are more congruent to human understanding in particular tasks [
        <xref ref-type="bibr" rid="ref22 ref47">22, 47</xref>
        ], or to advance our
understanding of how AI systems can influence human behavior [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>Based on the existing evidence that crowd computing can play an important role in tackling
computational and interactional challenges in developing new-age AI systems, in this vision paper,
we highlight research themes that need to be pursued to ensure that AI systems can create a future
where we are better off than we currently are – both as individuals and as a society.
</p>
    </sec>
    <sec id="sec-2">
      <title>Crowd Computing and Human-Centered AI</title>
      <p>In this section, we discuss important challenges that need to be addressed to make advances in the next
generation of AI systems from two main standpoints – (1) Human-in-the-loop AI, and (2) Human-AI
interaction. The former concerns the computational role of humans for AI, i.e., AI by humans, while
the latter concerns the interactional role of humans with AI systems, i.e., AI for humans.
</p>
      <sec id="sec-2-1">
        <title>Human-in-the-Loop AI</title>
        <p>In what follows, we analyze the fundamental computational challenges in the quest for robust,
interpretable, and hence trustworthy AI systems. We argue that to tackle such fundamental challenges,
research should explore a novel crowd computing paradigm, which we refer to as “crowd conceptual
computing”. In this form of crowd computing, crowd workers contribute knowledge at the
conceptual level; this is in contrast to the current paradigm where crowd intelligence is utilised
on a per-datum basis, e.g., labelling and debugging individual data instances.</p>
        <p>
          Robust AI by Crowds. Machine (deep) learning models have proven to be “shallow” – they often
learn spurious correlations in the data – and “brittle” – they are unable to make sense of situations
slightly different from the training data. Consequently, current AI systems often fail when required to
make predictions on data beyond the training distribution, which is crucially needed in practice. These
issues constitute what is now referred to as the robustness or reliability issue, generally viewed as a
main obstacle for wide deployment of AI systems [
          <xref ref-type="bibr" rid="ref13 ref26">13, 26</xref>
          ].
        </p>
        <p>Robust AI requires models to be equipped with causal knowledge and better generalisation ability,
which are the main advantageous characteristics of conventional symbolic AI methods focused on
knowledge representation and reasoning. Recent discussions in the AI community have therefore converged
on the idea of developing neurosymbolic methods that benefit from both the robustness of symbolic
methods and the flexibility of deep learning. Few discussions have, however, touched upon the
questions of what knowledge is required, and where and how to obtain such knowledge. Historical
research in expert systems has shown that the amount of knowledge required for a specific task can be
so large that it easily goes beyond readily available knowledge bases and what individuals can provide.
Building on top of the Web, crowd computing systems can reach an unprecedented number of people,
thus offering a feasible approach to leveraging human intelligence at scale for knowledge creation.
Classical per-datum crowd computing techniques, however, are ill-suited to this problem, as
the contributed outcomes are data instances rather than the knowledge that is needed. Take, for example,
the unknown unknowns of machine learning, a major class of errors produced by unreliable AI
systems. Such errors are caused by concepts that are missing or underrepresented in the model; each of these
concepts can be instantiated as various data instances. Crowd computing has been used to detect
unknown unknowns and fix them by contributing instances for training data augmentation. Such an
approach, however, is limited not only in efficiency but also in effectiveness, due to the intrinsic
shallowness and brittleness of machine learning models.</p>
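The auditing loop described above can be sketched in a few lines. This is a minimal illustration, not the method of any cited work: the instance ids, labels, and confidence threshold are hypothetical, and in a real audit the ground-truth labels would come from crowd workers.

```python
# Hypothetical sketch: unknown unknowns are inputs on which the model is
# confident AND wrong. A crowd audit compares model predictions against
# crowd-provided labels and flags high-confidence disagreements.

def find_unknown_unknowns(predictions, crowd_labels, threshold=0.9):
    """predictions: list of (instance_id, predicted_label, confidence)."""
    flagged = []
    for (instance_id, label, confidence), truth in zip(predictions, crowd_labels):
        if confidence >= threshold and label != truth:
            flagged.append(instance_id)  # confident but wrong
    return flagged

preds = [("x1", "cat", 0.97), ("x2", "dog", 0.55), ("x3", "cat", 0.95)]
truth = ["dog", "dog", "cat"]
print(find_unknown_unknowns(preds, truth))  # → ['x1']
```

Note that per-datum crowd computing stops here, at flagged instances; the conceptual paradigm advocated above would instead ask workers to name the missing concept that the flagged instances share.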
        <p>
          Interpretable AI by Crowds. Interpretability in AI refers to “the ability to explain or to present in
understandable terms to a human” [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] how the system makes predictions for individual instances
(i.e., local interpretability) or how the system works with respect to a specific class of instances (i.e.,
global interpretability). The problem is closely related to the robustness problem: being able to
inspect what an AI system has learned is useful to identify what it has not.
        </p>
        <p>
          The centrality of humans in the definition of AI interpretability implies the following key requirements for
the design of interpretability methods: i) the presentation of interpretations needs to match humans’ mental
representations of concepts, as humans understand the world through concepts that are associated
with observable properties; ii) interpretability methods also need to take into account the flexible needs of humans
as explanation consumers, allowing humans to gain insights into system behavior with
multi-concept queries that involve the (non-)presence of multiple concepts flexibly named by humans.
Existing interpretability methods, however, fail to meet these requirements. Local methods
generally generate explanations by highlighting relevant input units – e.g., words in a sentence
or pixels in an image [
          <xref ref-type="bibr" rid="ref38 ref39">38, 39</xref>
          ], which require effort from human users to make sense of; global
methods generate interpretations representing relevant concepts with a set of examples – e.g., pieces
of text or image patches [
          <xref ref-type="bibr" rid="ref18 ref23">18, 23</xref>
          ], which do not support multi-concept questions for in-depth model
understanding.
        </p>
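As a minimal sketch of the multi-concept queries described above, suppose crowd workers have annotated model predictions with named concepts; a query can then combine the presence and absence of concepts. All concept names and records below are hypothetical illustrations, not data from any cited study.

```python
# Hypothetical sketch: crowd workers annotate instances with named concepts;
# a multi-concept query then asks how the model behaves on instances where
# some concepts are present and others absent.

def query_error_rate(records, present=(), absent=()):
    """records: list of dicts with 'concepts' (a set) and 'correct' (a bool)."""
    matched = [r for r in records
               if set(present) <= r["concepts"] and not r["concepts"] & set(absent)]
    if not matched:
        return None  # no instance satisfies the query
    return sum(not r["correct"] for r in matched) / len(matched)

records = [
    {"concepts": {"stripes", "animal"}, "correct": False},
    {"concepts": {"stripes", "fabric"}, "correct": True},
    {"concepts": {"animal"}, "correct": True},
]
# Error rate on instances with 'stripes' but without 'fabric':
print(query_error_rate(records, present=["stripes"], absent=["fabric"]))  # → 1.0
```

The flexibility lies in the query: the same concept annotations, once created at the conceptual level, answer arbitrarily many such questions without new per-datum labelling.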
        <p>A natural approach to bridge this semantic gap is to involve humans in the interpretation process. Similar
to crowd computing for robust AI, where the goal is to characterise what a model has not learned,
crowd computing for interpretable AI seeks to explain what a model has learned. The latter again
requires crowd computing on the conceptual level for human interpretability and query flexibility.
</p>
      </sec>
      <sec id="sec-2-2">
        <title>Human-AI Interaction</title>
        <p>
          Principles for human-AI interaction have been discussed in the HCI community for several years [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
However, in light of recent advances in AI and the growing role of AI technologies in
human-centered applications, a deeper exploration is the need of the hour. As different research communities
aim to progress in this direction, we need to explore and develop fundamental methods and techniques
to harness the virtues of AI in a manner that is beneficial and useful to the society at large. Crowd
computing methods can allow us to carry out large-scale behavioral experiments and randomized
controlled trials [
          <xref ref-type="bibr" rid="ref16 ref5">5, 16</xref>
          ] that are necessary to representatively study, and advance our
understanding of, human-AI interaction. We foresee crowd computing playing a pivotal role in
addressing important challenges in the following themes.
        </p>
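As an illustrative sketch of such a crowd-based randomized controlled trial (not a procedure from the paper), each participant can be assigned to a condition by a seeded random draw, so that behavioral differences can be attributed to the condition; the participant ids and condition names below are hypothetical.

```python
import random

# Hypothetical sketch: a crowd-based randomized controlled trial assigns each
# participant to a condition (e.g., seeing AI explanations vs. a control
# interface) by a seeded random draw, making the assignment reproducible.

def assign_conditions(participant_ids, conditions, seed=42):
    rng = random.Random(seed)  # local RNG so the assignment is reproducible
    return {pid: rng.choice(conditions) for pid in participant_ids}

groups = assign_conditions(["p1", "p2", "p3", "p4"], ["explanation", "control"])
print(groups)
```

Seeding a local `random.Random` rather than the global generator keeps the assignment reproducible across analysis runs, which matters when the experiment is later audited or replicated.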
        <p>
          Congruence of machine learning models with human understanding. Complex machine learning
models are nowadays deployed in several critical domains, including healthcare and autonomous
vehicles, albeit as functional black boxes. Models which correspond to human interpretation of a
task are more desirable in certain contexts and can help attribute liability, build trust, expose biases
and in turn build better models [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ]. It is therefore of paramount importance to understand how and
which models conform to human understanding of various tasks. What is the relationship between
expectations and trust when humans interact with AI systems? How can effective machine learning
models be built, while conforming to human expectations?
        </p>
        <p>
          Explaining AI systems to humans and supporting decision-making. AI systems offer
computational powers that vastly transcend human capabilities. In conjunction with the ability to autonomously
detect data patterns and derive superior predictions, AI systems are projected to complement,
transform, and in several cases even substitute human decision-makers. This process broadly revolutionizes
all the relevant stages of economic, political, and societal decision-making. Despite these dynamics,
the impact of AI systems on human behavior remains largely unexplored. We need to address this
crucial gap by carrying out interdisciplinary research to advance the current understanding of impact
of AI systems on human decision-making. Despite the recent surge in interpreting decisions of
complex machine learning models to explain their actions to humans [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], little is known about
what constitutes a sufficient explanation from a user’s vantage point and in different contextual settings.
Moreover, how such criteria vary across the landscape of different stakeholders interacting with
AI systems needs to be better understood. Different individuals and user groups alike can have
varying attitudes towards the same technology due to a range of factors including their familiarity
with the technology [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], individual traits [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ], cultural differences [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], or contexts [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ]. How can
explanations be adapted and personalized across diverse stakeholders with an aim to improve the
effectiveness of their interaction with AI systems?
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>A Vision for the Future</title>
      <p>
        Open-ended Crowd Knowledge Creation. For the purposes of both robust and interpretable AI,
knowledge creation in real-world machine learning tasks is a complex, open-ended endeavour. Research on
this problem needs to investigate not only the extraction of knowledge from the training data and
model, but also the creation of any knowledge crowds deem relevant, which can easily go beyond
knowledge encoded in existing knowledge sources. Such a problem is related to multiple ongoing
research lines, such as crowd knowledge creation [
        <xref ref-type="bibr" rid="ref43">43</xref>
        ], complex task design [
        <xref ref-type="bibr" rid="ref17 ref45">17, 45</xref>
        ], open-ended
crowdsourcing [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], machine intelligence for human work [
        <xref ref-type="bibr" rid="ref30 ref44">30, 44</xref>
        ], etc. In the crowd computing
community specifically, it has been widely recognized that the future of research in this field should
enable crowd work that is complex, collaborative, and sustainable, such that human workers can both
earn and learn from their work in an enjoyable manner [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Aligned with this goal, we advocate a
novel crowd computation paradigm that aims to bring human computation to the conceptual level
for knowledge creation. The open-endedness of this new kind of knowledge creation task further
calls for research on leveraging the cognitive abilities, and creativity in particular, of human workers.
Crowd knowledge creation for tackling problems in AI systems further contributes to the vision of a
human-AI collaborative future: by acquainting human workers with the strengths and weaknesses of
AI algorithms through knowledge creation tasks, we envision a future where human workers and AI
can work together seamlessly while benefiting from each other.
      </p>
      <p>
        Conversational Human-AI Interaction. Conversational interfaces have been argued to have
advantages over traditional GUIs owing to their more human-like interaction [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Recent work in
crowd computing has shown that conversational interfaces can lead to increased satisfaction and
engagement in online work settings when compared to conventional web interfaces [
        <xref ref-type="bibr" rid="ref27 ref32 ref33">27, 32, 33</xref>
        ].
Conversational interfaces have also been found to be conducive for memorable interactions with
information retrieval systems [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. Messaging applications such as Telegram, Facebook
Messenger, and WhatsApp are regularly used by an increasing number of people, mainly for interpersonal
communication and coordination purposes [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Users across cultures, demographics, and
technological platforms are now familiar with their minimalist interfaces and functionality. By building
interactions with AI systems around conversational interfaces that people are generally more familiar
with, we can potentially lower the barrier to the adoption of such systems.
      </p>
      <p>
        Trust plays a central role in human-AI interaction – the adoption and successful utilization of AI
systems is mediated by trust. Therefore, it is important to investigate whether novel conversational
interfaces can be built to facilitate trust in AI systems. Several factors have been identified as being
capable of increasing trust toward conversational agents, including appearance, voice features, and
communication styles [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. These findings suggest that human interaction with AI systems can
potentially be enhanced by leveraging conversational interfaces to improve engagement, and build
trust. By facilitating a more natural type of interaction, conversational interfaces can also lower the
barrier for crowd computing to address the robustness and interpretability issues of AI systems – in
particular, conversational systems themselves, a representative type of AI-complete system [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Crowd computing offers promising means to overcome fundamental challenges in computation and
interaction, and herald a new generation of human-centered AI systems.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>Death of Elaine Herzberg</article-title>
          . Wikipedia [Accessed: 2020-04-12].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Saleema</given-names>
            <surname>Amershi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dan</given-names>
            <surname>Weld</surname>
          </string-name>
          , Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N Bennett,
          <string-name>
            <given-names>Kori</given-names>
            <surname>Inkpen</surname>
          </string-name>
          , et al.
          <article-title>Guidelines for human-ai interaction</article-title>
          .
          <source>In Proceedings of the 2019 chi conference on human factors in computing systems</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Julia</given-names>
            <surname>Angwin</surname>
          </string-name>
          , Jeff Larson, Surya Mattu, and
          <string-name>
            <given-names>Lauren</given-names>
            <surname>Kirchner</surname>
          </string-name>
          .
          <article-title>Machine bias</article-title>
          . https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. ProPublica [Online; posted: 23-May-2016].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Ines</given-names>
            <surname>Arous</surname>
          </string-name>
          , Jie Yang,
          <string-name>
            <given-names>Mourad</given-names>
            <surname>Khayati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Philippe</given-names>
            <surname>Cudré-Mauroux</surname>
          </string-name>
          .
          <article-title>Opencrowd: A human-ai collaborative approach for finding social influencers via open-ended answers aggregation</article-title>
          .
          <source>In Proceedings of The Web Conference</source>
          <year>2020</year>
          , pages
          <fpage>1851</fpage>
          -
          <lpage>1862</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Edmond</given-names>
            <surname>Awad</surname>
          </string-name>
          , Sohan Dsouza, Richard Kim, Jonathan Schulz, Joseph Henrich, Azim Shariff,
          <string-name>
            <given-names>Jean-François</given-names>
            <surname>Bonnefon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Iyad</given-names>
            <surname>Rahwan</surname>
          </string-name>
          .
          <article-title>The moral machine experiment</article-title>
          .
          <source>Nature</source>
          ,
          <volume>563</volume>
          (
          <issue>7729</issue>
          ):
          <fpage>59</fpage>
          -
          <lpage>64</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Beardsworth</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nishant</given-names>
            <surname>Kumar</surname>
          </string-name>
          .
          <article-title>Who to sue when a robot loses your fortune</article-title>
          . https://www.bloomberg.com/news/articles/2019-05-06/who-to-sue-when-a-robot-loses-your-fortune. Bloomberg [Online; posted: 05-May-2019].
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Reuben</given-names>
            <surname>Binns</surname>
          </string-name>
          , Max Van Kleek,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Veale</surname>
          </string-name>
          , Ulrik Lyngs,
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Nigel</given-names>
            <surname>Shadbolt</surname>
          </string-name>
          .
          <article-title>'it's reducing a human being to a percentage' perceptions of justice in algorithmic decisions</article-title>
          .
          <source>In Proceedings of the 2018 Chi conference on human factors in computing systems</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Maxime</given-names>
            <surname>Chevalier-Boisvert</surname>
          </string-name>
          , Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <article-title>Babyai: First steps towards grounded language learning with a human in the loop</article-title>
          .
          <source>arXiv preprint arXiv:1810.08272</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dastin</surname>
          </string-name>
          .
          <article-title>Amazon scraps secret ai recruiting tool that showed bias against women</article-title>
          . https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G. Reuters [Online; posted: 09-October-2018].
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Maartje</given-names>
            <surname>De Graaf</surname>
          </string-name>
          , Somaya Ben Allouch, and Jan Van Dijk.
          <article-title>Why do they refuse to use my robot?: Reasons for non-use derived from a long-term home study</article-title>
          .
          <source>In 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI)</source>
          , pages
          <fpage>224</fpage>
          -
          <lpage>233</lpage>
          . IEEE,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Ewart J</given-names>
            <surname>De Visser</surname>
          </string-name>
          , Samuel S Monfort,
          <string-name>
            <given-names>Ryan</given-names>
            <surname>McKendrick</surname>
          </string-name>
          ,
          Melissa AB Smith,
          <string-name>
            <given-names>Patrick E</given-names>
            <surname>McKnight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Frank</given-names>
            <surname>Krueger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Raja</given-names>
            <surname>Parasuraman</surname>
          </string-name>
          .
          <article-title>Almost human: Anthropomorphism increases trust resilience in cognitive agents</article-title>
          .
          <source>Journal of Experimental Psychology: Applied</source>
          ,
          <volume>22</volume>
          (
          <issue>3</issue>
          ):
          <fpage>331</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Gianluca</given-names>
            <surname>Demartini</surname>
          </string-name>
          , Djellel Eddine Difallah, Ujwal Gadiraju, and
          <string-name>
            <given-names>Michele</given-names>
            <surname>Catasta</surname>
          </string-name>
          .
          <article-title>An introduction to hybrid human-machine information systems</article-title>
          .
          <source>Foundations and Trends in Web Science</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>87</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Thomas G</given-names>
            <surname>Dietterich</surname>
          </string-name>
          .
          <article-title>Steps toward robust artificial intelligence</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>38</volume>
          (
          <issue>3</issue>
          ):
          <fpage>3</fpage>
          -
          <lpage>24</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Finale</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          and
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Towards a rigorous science of interpretable machine learning</article-title>
          .
          <source>arXiv preprint arXiv:1702.08608</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Erlei</surname>
          </string-name>
          , Franck Awounang Nekdem, Lukas Meub, Avishek Anand, and
          <string-name>
            <given-names>Ujwal</given-names>
            <surname>Gadiraju</surname>
          </string-name>
          .
          <article-title>Impact of algorithmic decision making on human behavior: Evidence from ultimatum bargaining</article-title>
          .
          <source>In Proceedings of the 8th AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2020)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Ujwal</given-names>
            <surname>Gadiraju</surname>
          </string-name>
          , Sebastian Möller, Martin Nöllenburg, Dietmar Saupe, Sebastian Egger-Lampl, Daniel Archambault, and Brian Fisher.
          <article-title>Crowdsourcing versus the laboratory: towards human-centered experiments using the crowd</article-title>
          .
          <source>In Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments</source>
          , pages
          <fpage>6</fpage>
          -
          <lpage>26</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Ujwal</given-names>
            <surname>Gadiraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Clarity is a worthwhile quality: On the role of task clarity in microtask crowdsourcing</article-title>
          .
          <source>In Proceedings of the 28th ACM Conference on Hypertext and Social Media</source>
          , pages
          <fpage>5</fpage>
          -
          <lpage>14</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Amirata</given-names>
            <surname>Ghorbani</surname>
          </string-name>
          , James Wexler, James Y Zou, and
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Towards automatic concept-based explanations</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          , pages
          <fpage>9277</fpage>
          -
          <lpage>9286</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Ella</given-names>
            <surname>Glikson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anita Williams</given-names>
            <surname>Woolley</surname>
          </string-name>
          .
          <article-title>Human trust in artificial intelligence: Review of empirical research</article-title>
          .
          <source>Academy of Management Annals</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Moritz</given-names>
            <surname>Hardt</surname>
          </string-name>
          , Eric Price, and
          <string-name>
            <given-names>Nati</given-names>
            <surname>Srebro</surname>
          </string-name>
          .
          <article-title>Equality of opportunity in supervised learning</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <fpage>3315</fpage>
          -
          <lpage>3323</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Kerstin Sophie</given-names>
            <surname>Haring</surname>
          </string-name>
          , David Silvera-Tawil, Yoshio Matsumoto, Mari Velonaki, and
          <string-name>
            <given-names>Katsumi</given-names>
            <surname>Watanabe</surname>
          </string-name>
          .
          <article-title>Perception of an android robot in Japan and Australia: A cross-cultural comparison</article-title>
          .
          <source>In International conference on social robotics</source>
          , pages
          <fpage>166</fpage>
          -
          <lpage>175</lpage>
          . Springer,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Sungsoo Ray</given-names>
            <surname>Hong</surname>
          </string-name>
          , Jessica Hullman, and
          <string-name>
            <given-names>Enrico</given-names>
            <surname>Bertini</surname>
          </string-name>
          .
          <article-title>Human factors in model interpretability: Industry practices, challenges, and needs</article-title>
          .
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          ,
          <volume>4</volume>
          (
          <issue>CSCW1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Been</given-names>
            <surname>Kim</surname>
          </string-name>
          , Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler,
          <string-name>
            <given-names>Fernanda</given-names>
            <surname>Viegas</surname>
          </string-name>
          , et al.
          <article-title>Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV)</article-title>
          .
          <source>In International conference on machine learning</source>
          , pages
          <fpage>2668</fpage>
          -
          <lpage>2677</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Aniket</given-names>
            <surname>Kittur</surname>
          </string-name>
          , Jeffrey V Nickerson,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matt Lease, and
          <string-name>
            <given-names>John</given-names>
            <surname>Horton</surname>
          </string-name>
          .
          <article-title>The future of crowd work</article-title>
          .
          <source>In Proceedings of the 2013 conference on Computer supported cooperative work</source>
          , pages
          <fpage>1301</fpage>
          -
          <lpage>1318</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Rich</given-names>
            <surname>Ling</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chih-Hui</given-names>
            <surname>Lai</surname>
          </string-name>
          .
          <article-title>Microcoordination 2.0: Social coordination in the age of smartphones and messaging apps</article-title>
          .
          <source>Journal of Communication</source>
          ,
          <volume>66</volume>
          (
          <issue>5</issue>
          ):
          <fpage>834</fpage>
          -
          <lpage>856</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Gary</given-names>
            <surname>Marcus</surname>
          </string-name>
          .
          <article-title>The next decade in AI: four steps towards robust artificial intelligence</article-title>
          .
          <source>arXiv preprint arXiv:2002.06177</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Panagiotis</given-names>
            <surname>Mavridis</surname>
          </string-name>
          , Owen Huang, Sihang Qiu, Ujwal Gadiraju, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Chatterbox: Conversational interfaces for microtask crowdsourcing</article-title>
          .
          <source>In Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization</source>
          , pages
          <fpage>243</fpage>
          -
          <lpage>251</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Brent</given-names>
            <surname>Mittelstadt</surname>
          </string-name>
          , Chris Russell, and
          <string-name>
            <given-names>Sandra</given-names>
            <surname>Wachter</surname>
          </string-name>
          .
          <article-title>Explaining explanations in AI</article-title>
          .
          <source>In Proceedings of the conference on fairness, accountability, and transparency</source>
          , pages
          <fpage>279</fpage>
          -
          <lpage>288</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Robert J</given-names>
            <surname>Moore</surname>
          </string-name>
          , Raphael Arar,
          <string-name>
            <given-names>Guang-Jie</given-names>
            <surname>Ren</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Margaret H</given-names>
            <surname>Szymanski</surname>
          </string-name>
          .
          <article-title>Conversational UX design</article-title>
          .
          <source>In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems</source>
          , pages
          <fpage>492</fpage>
          -
          <lpage>497</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Natalia</given-names>
            <surname>Ostapuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Philippe</given-names>
            <surname>Cudré-Mauroux</surname>
          </string-name>
          .
          <article-title>ActiveLink: deep active learning for link prediction in knowledge graphs</article-title>
          .
          <source>In The World Wide Web Conference</source>
          , pages
          <fpage>1398</fpage>
          -
          <lpage>1408</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Kalpana</given-names>
            <surname>Parshotam</surname>
          </string-name>
          .
          <article-title>Crowd computing: a literature review and definition</article-title>
          .
          <source>In Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference</source>
          , pages
          <fpage>121</fpage>
          -
          <lpage>130</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Sihang</given-names>
            <surname>Qiu</surname>
          </string-name>
          , Ujwal Gadiraju, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Estimating conversational styles in conversational microtask crowdsourcing</article-title>
          .
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          ,
          <volume>4</volume>
          (
          <issue>CSCW1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Sihang</given-names>
            <surname>Qiu</surname>
          </string-name>
          , Ujwal Gadiraju, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Improving worker engagement through conversational microtask crowdsourcing</article-title>
          .
          <source>In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Sihang</given-names>
            <surname>Qiu</surname>
          </string-name>
          , Ujwal Gadiraju, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Towards memorable information retrieval</article-title>
          .
          <source>In Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval</source>
          , pages
          <fpage>69</fpage>
          -
          <lpage>76</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Iyad</given-names>
            <surname>Rahwan</surname>
          </string-name>
          , Manuel Cebrian, Nick Obradovich, Josh Bongard,
          <string-name>
            <given-names>Jean-François</given-names>
            <surname>Bonnefon</surname>
          </string-name>
          , Cynthia Breazeal, Jacob W Crandall, Nicholas A Christakis,
          <string-name>
            <given-names>Iain D</given-names>
            <surname>Couzin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Matthew O</given-names>
            <surname>Jackson</surname>
          </string-name>
          , et al.
          <article-title>Machine behaviour</article-title>
          .
          <source>Nature</source>
          ,
          <volume>568</volume>
          (
          <issue>7753</issue>
          ):
          <fpage>477</fpage>
          -
          <lpage>486</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Minjin</given-names>
            <surname>Rheu</surname>
          </string-name>
          , Ji Youn Shin, Wei Peng, and
          <string-name>
            <given-names>Jina</given-names>
            <surname>Huh-Yoo</surname>
          </string-name>
          .
          <article-title>Systematic review: Trust-building factors and implications for conversational agent design</article-title>
          .
          <source>International Journal of Human-Computer Interaction</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Maha</given-names>
            <surname>Salem</surname>
          </string-name>
          , Gabriella Lakatos, Farshid Amirabdollahian, and
          <string-name>
            <given-names>Kerstin</given-names>
            <surname>Dautenhahn</surname>
          </string-name>
          .
          <article-title>Would you trust a (faulty) robot? Effects of error, task type and personality on human-robot cooperation and trust</article-title>
          .
          <source>In 2015 10th ACM/IEEE International Conference on Human-Robot Interaction (HRI)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . IEEE,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Karen</given-names>
            <surname>Simonyan</surname>
          </string-name>
          , Andrea Vedaldi, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Zisserman</surname>
          </string-name>
          .
          <article-title>Deep inside convolutional networks: Visualising image classification models and saliency maps</article-title>
          .
          <source>In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Workshop Track Proceedings</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Smilkov</surname>
          </string-name>
          , Nikhil Thorat, Been Kim,
          <string-name>
            <given-names>Fernanda B.</given-names>
            <surname>Viégas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          .
          <article-title>SmoothGrad: removing noise by adding noise</article-title>
          .
          <source>CoRR, abs/1706.03825</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Ehsan</given-names>
            <surname>Toreini</surname>
          </string-name>
          , Mhairi Aitken, Kovila Coopamootoo, Karen Elliott, Carlos Gonzalez Zelaya, and Aad van Moorsel.
          <article-title>The relationship between trust in AI and trustworthy machine learning technologies</article-title>
          .
          <source>In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency</source>
          , pages
          <fpage>272</fpage>
          -
          <lpage>283</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Jiaxuan</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Jeeheh Oh,
          <string-name>
            <given-names>Haozhu</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jenna</given-names>
            <surname>Wiens</surname>
          </string-name>
          .
          <article-title>Learning credible models</article-title>
          .
          <source>In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          , pages
          <fpage>2417</fpage>
          -
          <lpage>2426</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Lin</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pei-Luen Patrick</given-names>
            <surname>Rau</surname>
          </string-name>
          , Vanessa Evers, Benjamin Krisper Robinson, and
          <string-name>
            <given-names>Pamela</given-names>
            <surname>Hinds</surname>
          </string-name>
          .
          <article-title>When in Rome: the role of culture &amp; context in adherence to robot recommendations</article-title>
          .
          <source>In 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI)</source>
          , pages
          <fpage>359</fpage>
          -
          <lpage>366</lpage>
          . IEEE,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Geert-Jan</given-names>
            <surname>Houben</surname>
          </string-name>
          .
          <article-title>Knowledge crowdsourcing acceleration</article-title>
          .
          <source>In International Conference on Web Engineering</source>
          , pages
          <fpage>639</fpage>
          -
          <lpage>643</lpage>
          . Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , Thomas Drake, Andreas Damianou, and
          <string-name>
            <given-names>Yoelle</given-names>
            <surname>Maarek</surname>
          </string-name>
          .
          <article-title>Leveraging crowdsourcing data for deep active learning an application: Learning intents in Alexa</article-title>
          .
          <source>In Proceedings of the 2018 World Wide Web Conference</source>
          , pages
          <fpage>23</fpage>
          -
          <lpage>32</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , Judith Redi, Gianluca Demartini, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Bozzon</surname>
          </string-name>
          .
          <article-title>Modeling task complexity in crowdsourcing</article-title>
          .
          <source>In Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , Alisa Smirnova, Dingqi Yang, Gianluca Demartini, Yuan Lu, and
          <string-name>
            <given-names>Philippe</given-names>
            <surname>Cudré-Mauroux</surname>
          </string-name>
          .
          <article-title>Scalpel-CD: leveraging crowdsourcing and deep probabilistic modeling for debugging noisy training data</article-title>
          .
          <source>In The World Wide Web Conference</source>
          , pages
          <fpage>2158</fpage>
          -
          <lpage>2168</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>Zijian</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Jaspreet Singh,
          <string-name>
            <given-names>Ujwal</given-names>
            <surname>Gadiraju</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Avishek</given-names>
            <surname>Anand</surname>
          </string-name>
          .
          <article-title>Dissonance between human and machine understanding</article-title>
          .
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          ,
          <volume>3</volume>
          (
          <issue>CSCW</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>