Trust in data engineering: reflection, framework, and
evaluation methodology
Sarah Oppold¹,*, Melanie Herschel¹
¹ University of Stuttgart, Germany


Abstract

Trust is and has been essential to human interactions. With the rise of technology, we now live in a socio-technical environment where people frequently interact with technology as well. It is therefore natural to expect that people will also develop trust in technology. Data engineering researchers have at least assumed this when claiming that certain methods they devise (e.g., explanations using provenance) likely help to foster some notion of trust. But rarely is the notion of trust clarified or this claim validated. We propose a more systematic consideration of trust in data engineering technology, compared to the ad-hoc state of the art. To this end, we first review the notion of trust established in other disciplines, based on which we derive a model for trust in data engineering technology. We then present guidelines on how to devise a trust strategy aimed at enriching data engineering technology such that it potentially fosters trust conforming to our model. We further discuss how to possibly evaluate a trust strategy. We apply our trust model to a use case, for which we devise, implement, and evaluate a trust strategy using our proposed guidelines and methods. The results of our evaluation indicate that statements like “transparency helps build trust” should be used cautiously. This highlights the need for contributions like those we present here, as only a more systematic approach to defining, integrating, and evaluating trust in data engineering can bring us a step closer to provably fostering trust in such technologies.


Proc. of the First International Workshop on Data Ecosystems (DEco’22), September 5, 2022, Sydney, Australia
* Corresponding author
sarah.oppold@ipvs.uni-stuttgart.de (S. Oppold); melanie.herschel@ipvs.uni-stuttgart.de (M. Herschel)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Our society depends on us humans trusting each other. From crossing the street, to collaborating with coworkers, to being treated by doctors, our society is built on trust. The rise of technology and its integration into our world has created a socio-technical environment where humans live together with technology. This means that we now not only have to trust other humans, we also have to establish a similar relationship to technology rather than second-guessing its every “action”, in order to benefit, for instance, from its improvements in efficiency or productivity.

In an increasingly data-driven world, data engineering, data analysis, and machine learning are software technologies that can significantly affect human lives (e.g., [1, 2, 3]) and for which some notion of trust has been recognized as an aspect to consider (e.g., [4, 5, 6]). This paper focuses on trust in data engineering, which encompasses the full data preparation pipeline to get from raw data (as collected) to data “fit for analysis”, e.g., data used for training machine learning models. Typical data engineering steps include data transformation [7], cleaning [8], and integration [9]. Data engineering is usually required in any data-driven process and a plethora of systems and algorithms for it exist.

While trust in such engineered data has recently gained attention – yielding approaches to possibly quantify, assess, or even improve trust – we observe that the notion of trust is usually not well defined and does not correspond to the concept of trust established in other disciplines, e.g., philosophy or psychology. In a first line of research, the notion of trust considered in the context of data engineering and data analysis reduces to a possibly related metric, and trust in the broader sense is neither considered nor evaluated. For instance, trust as understood in [5] reduces back to the accuracy of a machine learning model. In [10, 11, 12], trust is quantified, e.g., based on the similarity of information and source provenance provided by different data sources. While the resulting trust scores are measured in different settings, it is never validated whether or not the scores actually correspond to some established notion of trust. A second line of research considers transparency and explanations to foster trust (see, e.g., [13, 14, 15]). In this context, data provenance [16], which offers transparency in data engineering pipelines, is frequently named as relevant for evaluating trust (e.g., in [10, 17, 18, 19]). Yet, we are not aware of any validation of this claim. In that sense, the use of the term trust in data engineering has been mostly ad-hoc, without a clear or consistent definition. Furthermore, methods to evaluate solutions for trust in data engineering with respect to such a definition are lacking.
Clearly, we need a more nuanced and systematic discussion on trust in data engineering, to which we contribute considering the following questions: How can we incorporate the concept of trust into the development process of data engineering pipelines to obtain trustworthy data engineering? How can we assess trust or trustworthiness in a data engineering pipeline? While we expect there are many different types of solutions, our focus here lies on technical solutions to possibly influence trust in data engineering. Our contributions are: (1) We critically review the term “trust” (Section 2) to define a theoretical model for trust in data engineering (Section 3). (2) Based on this model, we describe a framework for trust engineering that integrates trust in the data engineering pipeline and serves as a guideline to develop a trust strategy (Section 4). (3) We describe a general procedure one can use to evaluate a trust strategy (Section 5). (4) We apply our methods to devise a trust strategy for a use case based on a credit scoring scenario, where explanations are integrated into a data engineering step as evidence to possibly foster trust. Our systematic evaluation, however, reveals that the explanations may not reach this goal, highlighting the importance of a more systematic study of the problem with the methods we propose in this paper (Section 6). Note that we are aware that it is possible to manipulate and deceive people by creating an illusion of trustworthy data engineering solutions [20] and that our contributions can lead to such deceptions and manipulations. Countering or regulating this is however out of the scope of this paper.

2. Trust perspectives

As we motivated above, trust in data engineering and analysis has been considered in an ad-hoc manner, while it has been systematically discussed in other disciplines, leading to some common understanding of what trust typically entails.

2.1. Philosophical perspective on trust

The discussion on trust has a long history in philosophy [21] and while the concept remains elusive, there are some underlying ideas that most philosophers seem to agree upon. One key facet of the discussion that we highlight here is the distinction between trust and its related concept reliance. Note that while most philosophical research has dealt with interpersonal trust, our discussion will also review the philosophical perspective on trust in technology.

2.1.1. Reliance

In general, person A relies on a proposition p (e.g., that another person performs a certain action) to achieve their goals, when p is a productive means to achieve their goals and p has to be true for its success [22]. Reasons for reliance are often of a pragmatic nature [22]. We rely on forces beyond our control or even our comprehension [23].

2.1.2. Trust

Considering trust, a truster A usually trusts a trustee B to do C [24]. As natural, familiar, and elemental as trusting is for us humans, it is just as complicated to describe as a concept. What philosophers agree on is that trust entails that (1) A is somehow vulnerable to a risk when they trust B, and (2) A relies on B to both be competent and willing to do C [24]. Related to the psychological attitude of trust is the property “trustworthiness” that we can ascribe to others when we think that we can trust them (i.e., we think that they fulfill point 2). While philosophers thus agree that trust is based on reliance (see point 2), they cannot agree on what the additional factor is that differentiates trust from mere reliance. While some argue that the trustee’s motive must be of some moral nature such as self-interest, goodwill, or moral integrity, others argue that the additional factor is some sort of normative expectation the truster has vis-à-vis the trustee. It seems to depend highly on the trust relationship example used. A different stance philosophers use to differentiate between trust and reliance is that if B fails A in a reliance relationship, A feels disappointed, whereas in a trust relationship, A feels betrayed [24]. Important characteristics of trust are pro-attitude (the truster wants the trustee to succeed in doing C), vulnerability, lack of control, and active acceptance of risk [25, 24].

While trust remains an elusive concept, a widely adopted model is the ABI trust model [26]. It identifies three factors of perceived trustworthiness: (A)bility, that is, the skills or competencies of the trustee, (B)enevolence, which refers to the extent to which the trustee is well-meaning towards the truster, and (I)ntegrity, which is that the trustee seems upright in the eyes of the truster because they share a common set of values or principles. As we shall see in Section 3, we incorporate the ABI characteristics in our trust model targeting data engineering technology rather than a human as trustee.

2.1.3. Trust in technology

While philosophers have studied different variants of trust (e.g., self-trust, trust in groups, trust in organizations), they all are based on human interaction and communication [24]. Technology strongly differs from humans. On the one hand, it lacks human characteristics such as intentionality and hope [27], it cannot use language, and it is not free to act as it wills [28]. On the other hand, it presents other non-human characteristics such as opaqueness to the user, or unnoticed updates [27]. So can technology be a trustee in a trust relationship according to the previously described notion of trust? Indeed, when people talk about trusting technology, they sometimes talk about a computer artefact, a mere object that is just expected to work as intended, an object that is an instrument to achieve one’s goals. This would be considered what philosophers call “trust as reliance” [27, 28] and not “real” trust.
However, if we take a closer look, technology often is more than just a simple artefact. Technology can feature “logical complexity, capacity to store and manipulate data, potential for sophisticated interaction with humans” [27] and can show unpredictable behavior [27, 28]. Thus, technology seems to encompass more than just mere objects that we rely on. In addition to that, humans, as the partner in a trust relationship with technology, can become emotionally involved in the relationship because trust comes easily for humans [20], who have a capacity to anthropomorphize (form bonds with machines similarly to how they personify pets) [20, 28]. Thus, within a socio-technical system, technology can appear as a “quasi-other” with qualities similar enough to humans for them to create a trust relationship [28].

Trust in technology might not be human trust but something similar, lying between interpersonal trust and trust as reliance [27]. It might even be on a spectrum ranging from simple machines that only afford reliance and where the trust is based on functional criteria up to complex autonomous machines with unpredictable behavior that cannot be verified but have to be trusted [20, 27, 28]. Further layers of trust need to be placed in the developers, designers, and company [20], which makes an analysis of trust in technology even more challenging. To make the distinction between trust in technology and interpersonal trust more explicit, researchers have introduced some additional naming and have begun a differentiated discussion. Grodzinsky et al. [27] for example introduce new terms: they call trust in electronic and trust in physical (face-to-face) encounters E-Trust and P-Trust, respectively. Sullins [20] defines different situations of robotic trust, and Coeckelbergh [28] analyzes the impact of different cultures on trust in robots. In this context, our work focuses on E-Trust, but rather than focusing on robots, we focus on data engineering technology as trustee.

2.2. Psychological perspective on trust

While the philosophical approach is fueled by the intention to analyze human phenomena, psychologists attempt to assess why we engage in this behavior of trusting or distrusting another person. Psychologists also struggle to conceptualize and operationalize trust behavior, but see the same main characteristics of vulnerability, risk, uncertainty, and pro-attitude that are present in the philosophers’ view [29, 30, 31, 32]. We consider psychological studies on behavioral causes to not be directly relevant to the development of a first model of trust in data engineering technology and thus deliberately keep their discussion short.

2.3. Computer science perspective on trust

Finally, we review the perspective from computer science on trust, with a special focus on trust with respect to data processing software that performs or relies on data engineering or data analysis. While trust is also considered in other branches of computer science (e.g., security and privacy), we do not review these in detail due to space constraints.

As we have pointed out in the introduction, the term trust is typically used in an ad-hoc way, yielding different notions of so-called trust that do not necessarily correspond to the common notion philosophy or psychology agree on. In particular, we observe that trust often reduces back to a measurable metric that is indicative of the quality or performance of a solution, but where it is unclear if and how it correlates with trust. Other work advocates that transparency and explanations are key factors to establish trust, which is typically not evaluated or validated though.

2.3.1. Metric-reduced trust

First approaches have emerged to quantify, assess, or even improve what the authors call trust in data processing. For example, [5, 33] attempt to measure trust in machine learning predictions. However, their trust boils down to the precision or accuracy of machine learning results. Similarity metrics are another category of metrics standing in for trust. For instance, [10, 11, 12] quantify trust based on the similarity of information and source provenance provided by different data sources. While the proposed methods are certainly valuable to improve the likelihood that approaches return the “correct” result and improve the overall quality or performance, this notion of trust clearly does not bear the same characteristics as trust reviewed in the previous subsections.
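To make the flavor of such metric-reduced trust concrete, consider the following minimal sketch (our illustration; the cited works define more elaborate measures, e.g., also over source provenance). It scores each of several hypothetical sources by how similar its reported value is to the values reported by the other sources, so that agreement between sources stands in for “trust”:

```python
# Minimal sketch of a similarity-based "trust" score: a source is scored
# by how much the other sources agree with the value it reports.
# Sources and values are hypothetical illustrations.

def similarity(a: float, b: float) -> float:
    """Similarity in (0, 1]; 1 means identical values."""
    return 1.0 / (1.0 + abs(a - b))

def trust_scores(reported: dict[str, float]) -> dict[str, float]:
    """Average pairwise similarity of each source to all other sources."""
    return {
        src: sum(similarity(val, other_val)
                 for other_src, other_val in reported.items() if other_src != src)
             / (len(reported) - 1)
        for src, val in reported.items()
    }

# Three sources reporting the same attribute (e.g., a yearly income).
reported = {"source_a": 42000.0, "source_b": 42100.0, "source_c": 58000.0}
print(trust_scores(reported))  # source_c deviates most and scores lowest
```

The sketch makes the gap apparent: such a score captures agreement between sources and may well indicate quality, but no truster, risk, or vulnerability is involved.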
2.3.2. Transparency and explanations for trust

Several works discuss interpretability and explanations for machine learning models, seen as a possible means to improve trust (e.g., [34, 35]). The general argument is that such methods offer evidence and verifiability that foster trust in a user or developer. Ribeiro et al. [35] evaluate their methods for trust, but this evaluation either simulates users or equates trust with which model performs better (relating back to metric-reduced trust). Transparency and explanations in data engineering pipelines can be achieved via data provenance [16]. Also in this area, these are frequently named as relevant for evaluating trust (e.g., in [10, 17, 18, 19]). Yet, we are not aware of any work that has studied or validated how transparency and explanations truly relate to trust.
2.3.3. Towards trust modeling

As a starting point to address the aforementioned shortcomings, a more nuanced discussion about trust has recently emerged in the area of computer science. Siau and Wang [15] for example discuss trust in artificial intelligence (AI). They collect a set of different definitions for trust and derive a set of factors for trust in AI technology along multiple dimensions. They also list a variety of approaches to build and then nurture trust in AI. Having focused on methods for trust in AI, this work lacks a catalog of methods for trust in data engineering. Furthermore, it does not include an actionable process taking up their discussion to “implement” trust in AI.

Meeßen et al. [36] derive a model for trust in Management Information Systems (MIS) based on both the ABI trust model [26], which we reviewed in Section 2.1.2, and research in automation and organizations. They translate the ABI terms from interpersonal trust to trust in MIS, allowing a more differentiated discussion about trust in technology. While MIS cover data engineering applications, the proposed trust model is centered around the trusters, mainly identifying factors such as perceived trustworthiness that lead to their use of an MIS. In addition, this work does not model or show what developers of MIS can actually do to build and foster trust that can lead to the decision to use the system.

Thornton et al. [37] call for a more nuanced discussion on the methods developers can use in order to foster trust, proposing what they call trust affordances: “characteristics of the technology by virtue of itself or of features designed into the technology to promote trust by providing access to evidence of (dis)trustworthiness specific to a user, a technology, and their context”. As they consider technology in a broad sense, the discussion remains very general. We build on their methodology and general ideas to devise guidelines for built-in trust in data engineering.

3. Trust in Data Engineering

We build on the research presented in the previous section to define a trust model for data engineering technology.

3.1. Desiderata

The following desiderata, derived from our discussion of different trust perspectives, underlie our model of trust:

• Distinguishing trust vs. reliance. The model should incorporate distinctive features that capture trust as opposed to mere reliance. This distinction usually implies the truster’s risk awareness with respect to the trustee.
• Modeling both main parties involved in a trust relationship. While classical trust models assume both parties to be humans and thus to have similar properties, in our setting, the truster and the trustee are inherently different types of entities. Modeling both in detail opens opportunities for a more detailed discussion of what trust in this kind of relationship entails.
• Modeling influencing factors. Various factors may influence the kind of trust relationship established between a truster and a trustee, making a concise and unique definition of trust difficult (see Section 2). The model should integrate influencing factors to reflect this ambiguity and incorporate the different nuances of trust, thereby offering a more detailed model for a systematic and multi-faceted study.

3.2. Model for trust in data engineering

Given the desiderata described above, we build our novel model for trust in data engineering. An overview of the model is depicted in Figure 1. Note that it is based on the ABI model [26] discussed in Section 2.1.2, similarly to [36]. While our model is more comprehensive than previous work and tailored to data engineering, we do not claim completeness (it can be extended) and leave open the discussion of how far it applies beyond data engineering (our area of expertise).
[Figure 1 (diagram). Truster: a human with awareness of vulnerability and risk; influencing factors: role, disposition to trust, past experience, contextual factors, culture. Trustee: the data engineering technology (divided into DETAs) together with the company and developers/designers; necessary characteristics: (social) power, uncertainty, unpredictability, unverifiability; the trustee image is assessed in terms of Ability, Benevolence, Integrity; influencing factors: functionality, usability, quality (technology) and culture, past experience, training, skills, goals, policies (parties behind it). Interaction: judgement ➜ trust ➜ use from truster to trustee; evidence for trustworthiness from trustee to truster.]
Figure 1: Model for trust in data engineering. A human truster builds a trust relationship with a trustee, i.e., a data engineering
application. The latter divides into DETAs and relates to further trust entities (e.g., company). Solid boxes surround necessary
characteristics of either the truster or (parts of) the trustee to establish a trust relationship. Dashed boxes group influencing
factors.

3.2.1. The truster - a human

In the trust relationship we consider, a human is the truster. Based on the general notion of trust (see Section 2), we posit that the human in a trust in data engineering relationship has to be aware of a vulnerability to some sort of risk when using the data engineering technology. Otherwise, the human will use the application as just another tool and we are looking at a “trust as reliance” situation. A human could for example feel vulnerable and at risk when, while using a website, they are aware that they thereby may indirectly divulge preferences or personal information that can affect what information they will be shown, e.g., which news or which job advertisements are recommended. We argue that humans also feel vulnerability when it is not themselves but other people that are subjected to a risk from the trustee.

The trust relationship a truster may or may not engage in inherently depends on several influencing factors: The human could be in the role of a user of the technology, but also others, such as an examiner, operator, executor, etc. [38]. This will influence how the truster approaches the trust relationship. Humans’ decisions to trust are not only influenced by their role, but also by their general disposition to trust, their past experiences in general (e.g., based on their privileges and power) and in particular with (similar) technology, and contextual factors of the interaction. A human’s actions are also influenced by the culture(s) the human is part of, shaping expectations, behaviors, and beliefs [39].

Note that given the large variety of human trusters resulting from different influencing factors and degrees of risk awareness, the trust relationship with a trustee can be significantly different from one human to another. For instance, one human’s relationship with the trustee may actually be based on reliance because they neither see nor are aware of any risks involved in interacting with the trustee. At the other end of the spectrum, someone else might not engage in a trust relationship at all because they feel too vulnerable and thus decide not to use the system.

3.2.2. The trustee - a data engineering technology

Given the context of our work, the trustee is some data engineering technology. For the truster to feel vulnerable, it has to have some (social) power, element of uncertainty, unpredictability, or unverifiability, thus preventing the assertion that the data engineering technology will not cause any harm.

Typically, such an application is complex and consists of multiple different data engineering technology artifacts (DETAs). These include for instance services, datasets, or algorithms. Note that the truster may or may not be aware of DETAs. Each DETA, as well as the data engineering technology perceived as a whole, is characterized by its functionality, usability, and quality. These have to be sufficient in order for the truster to perceive the technology as reliable. Each DETA could also carry the potential to harm and therefore could also be individually trusted or distrusted by the truster.

Given that technology is shaped by humans and organizations, parties like developers, designers, or companies are part of the trustee in a trust in technology relationship. Note that these are parties with which the truster can also engage in individual trust relationships. However, we also include these in the model of trust with respect to data engineering software, because their characteristics can influence this trust relationship as well. Indeed, their ability, benevolence, and integrity have shaped the data engineering technology and can indicate to the truster whether the trustee is trustworthy or not. How parties behind the technology act when developing the product is again influenced by their culture - including organizational and functional culture [39] - but also their past experiences, training, skills, goals, and policies. All of this can affect the trustworthiness of the product, i.e., the data engineering technology, and may be taken into account by the truster when making the decision whether or not to trust the data engineering technology.

3.2.3. Interactions

We now describe the interaction of the two parties involved in establishing a trust relationship.

When a truster judges the trustworthiness of someone, they are actually assessing pieces of evidence they are provided with to evaluate whether it is worth taking the risk to trust the other party and be vulnerable in some aspect. Whether we are in the process of judging humans or now data engineering technology, we think the human truster continues to act the same. Therefore, we adapt the ABI framework by Mayer et al. [26] (Section 2), which states that the trustee is assessed with respect to their ability (i.e., skills and competences) to fulfil their tasks, their benevolence towards the truster, and the integrity of the principles they act upon. While these are classically characteristics of persons and organizations, in our setting, the truster usually creates an imaginary image of the trustee based on visuals and communication with the data engineering technology. Indeed, communication to developers or the company behind the application, or access to the codebase, are usually not available to the truster, so their ABI characteristics are transposed to the image of the data engineering technology.
Based on the truster’s epistemic and practical judgment, the truster then decides whether to trust and then potentially use the technology [36].

Going from the trustee to the truster, the trustee provides evidence towards the truster. In the case of data engineering applications, this could be through a modern or old-looking visual interface, whether questions are answered in an FAQ, etc. As opposed to interpersonal trust, trust in data engineering technology involves trust in a complex system of people, groups, and institutions, who often cannot be judged directly but only through the pieces of technology the truster has access to. In addition to that, the truster often does not have the capabilities to understand the inner workings of the technology they are supposed to assess. Following the ABI model [26], information on the ability, benevolence, and integrity of the trustee with respect to the potential risk might be evidence that increases the perceived trustworthiness.

4. Design data processing for trust

Clearly, when developing data engineering technology, the evidence that can be provided is under the control of the trustee, who can adapt this evidence to potentially influence the trust relationship. We propose guidelines on how to systematically integrate trust in the development of data engineering pipelines, by enriching the general data engineering process with further steps fostering trust.

4.1. Assumptions

To align with the trust model we defined in Section 3, we make the following assumptions. First, to guarantee that we are fostering a trust relationship conforming to our model, we assume that the truster is aware that technology is used, that it poses a risk to themselves or others, and that its functionality cannot be completely verified. Second, we assume that the truster has an ambivalent attitude towards the data engineering technology and can be led to trust it. Finally, we acknowledge that the actions of developers and companies can also create an illusion of trustworthiness, e.g., through cleverly designed evidence. Here, we assume a benevolent trustee, who intends to provide actual evidence of trustworthiness and does not want to trick the user into trusting a non-trustworthy technology.

4.2. Trust-integrated data engineering

With these underlying assumptions, we enrich the general data engineering process to integrate trust in the technology as summarized in Figure 2. The top of the figure shows the different steps of the data engineering process, whereas the two bottom components “accompany” the whole process from a technical and organizational perspective, respectively.

In general, before developing actual data engineering software, the goals to reach with the use of data need to be defined. Based on these goals, relevant data need to be identified and collected. As these data may come in various formats from different sources, data wrangling is implemented to transform, integrate, and clean the data to obtain a unified and consistent view of the data relevant to the goal. These data can be further enriched with application-specific data and annotations, before they are distributed to downstream data consuming applications such as data analysis techniques. To monitor, document, and support the process, metadata are typically gathered and maintained. In addition, a data engineering process is usually subjected to some form of governance.

Following our model of trust in data engineering, the data engineering technology in its role of trustee can support a trust relationship by providing appropriate evidence. This may involve evidence collected at all stages of data engineering. The methods applicable to collect evidence possibly vary from one stage to another, making it important and challenging to select appropriate methods. The collected evidence can be managed within the metadata management component. While there are many ways to possibly foster trust in data engineering applications, as well as trust in the parties behind the applications that can also have an effect on the considered trust relationship, this paper focuses on the technical solutions targeting trust, leaving the study of trust with respect to governance to the future. This paper also does not aim at exhaustively reviewing how to collect and manage evidence (we mentioned some approaches in Section 2), as for different trust scenarios, different solutions apply or may need adaptation. Instead, our work here offers guidelines on how to generally proceed to systematically integrate the consideration of trust in data engineering technology. This naturally integrates into the conceptual planning phase of data engineering processes (i.e., the leftmost step in Figure 2).
[Figure 2 (diagram): Goal definition (+ identify trust scenarios, + identify trust breakpoints, + devise trust strategy) → Data collection (+ collect evidence) → Data wrangling (+ collect evidence) → Data enrichment (+ collect evidence) → Data distribution (+ collect evidence, + distribute evidence); accompanied by Metadata management (+ collect evidence, + manage evidence) and Governance.]
Figure 2: Framework integrating trust in the development of data engineering pipelines. We show the main components of
traditional data engineering development in black and our enrichments that integrate trust in blue.
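To make the “collect evidence” enrichments of Figure 2 more concrete, the following minimal sketch (our illustration; names such as with_evidence and evidence_store are hypothetical, not a prescribed implementation) wraps individual pipeline steps so that each execution records simple provenance-style metadata in a store standing in for the metadata management component:

```python
# Minimal sketch of evidence collection hooks in a data engineering
# pipeline, in the spirit of Figure 2. All names are hypothetical; a real
# pipeline would record richer evidence, e.g., data provenance [16].
import json
import time
from typing import Callable

evidence_store: list[dict] = []  # stands in for the metadata management component

def with_evidence(stage: str, step: Callable[[list], list]) -> Callable[[list], list]:
    """Wrap a pipeline step so that each run records evidence metadata."""
    def wrapped(data: list) -> list:
        start = time.time()
        result = step(data)
        evidence_store.append({
            "stage": stage,                # e.g., "data wrangling"
            "step": step.__name__,         # which DETA produced the output
            "records_in": len(data),
            "records_out": len(result),
            "seconds": round(time.time() - start, 3),
        })
        return result
    return wrapped

def drop_incomplete(rows: list) -> list:
    """A toy cleaning step: keep only rows without missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

clean = with_evidence("data wrangling", drop_incomplete)
cleaned = clean([{"name": "A", "income": 42000}, {"name": "B", "income": None}])
print(json.dumps(evidence_store, indent=2))  # evidence available to the trust strategy
```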

4.3. Identify trust scenarios

Our model for trust in data engineering represents a multitude of scenarios in which humans with specific roles, risks, and vulnerabilities are in a trust relationship with a data engineering technology. Specific evidence will be needed - and at the same time enough - for individual trusters to perceive a particular application as trustworthy. Therefore, it makes sense to identify the specific trust scenarios anticipated with respect to the application goal, such that the collection of evidence can be tailored to these.

At this stage, we propose to think about scenarios, relationships, or use cases where the targeted application (goal) has some sort of power over the truster, putting the truster at risk. Modalities of power as identified in the field of political philosophy could be a starting point. Furthermore, different kinds of trusters, i.e., trusters exhibiting different influencing factors, should be considered. It is important to identify which different combinations of influencing factors may define trusters in relevant trust scenarios, as well as the specific risks they potentially face, to then devise trust strategies tailored to the different kinds of trusters. For a wide coverage of possible trust scenarios, we recommend a diverse set of examiners with a critical mindset.

4.4. Identify trust breakpoints

After identifying trust scenarios, it is time to pinpoint the critical parts for perceived trustworthiness in the (planned) data engineering process. We call these trust breakpoints. They may comprise methods, algorithms, or other DETAs that could expose a truster to some risk by not meeting specific quality, functionality, or usability guarantees, as their behavior bears some degree of uncertainty, unpredictability, or unverifiability.

It is possible that one trust scenario has multiple trust breakpoints or that different trust scenarios share the same breakpoint. This leads to many-to-many relationships between trust scenarios and trust breakpoints. For each application-relevant combination, we further recommend determining the requirements each breakpoint in each scenario has to meet in order to minimize or avoid risk.

Since the data engineering software is a technological product, the quality of its trust breakpoints is always shaped by the human capabilities, thoughts, and attitudes of its designers, developers, and surrounding organization. Therefore, there are truster-organization and truster-developer trust relationships to be identified and addressed as well.

4.5. Devise a trust strategy

In a sense, identifying trust scenarios and trust breakpoints can be seen as a requirements analysis on how to cover trust. This analysis forms the foundation to devise a trust strategy, i.e., a plan to meet the requirements. Referring back to the distinction between reliance and trust, it will not be enough to provide evidence that convinces the truster that the application is pragmatically the best option to use. Instead, following our trust model, the trust strategy should be designed to provide sufficient evidence on ability, benevolence, and integrity to increase perceived trustworthiness.

The first idea that comes to mind is to transparently provide more information about the trust breakpoints, which the user can use to judge the trustworthiness of the application. This will mostly respond to the ability of the trust breakpoint’s DETAs, but could also include evidence for the integrity and benevolence of the company and developers behind the application. Several methods have been developed to provide metadata that can serve as evidence, including plain information about datasets [40], data provenance [16], or machine learning explanations [35]. However, the problem of choosing a suitable strategy for the requirements given by trust scenarios and breakpoints remains. To systematically devise a strategy and identify pertinent methods, we propose to answer the following six questions in a structured way:

(Q1) What should the trust strategy enable the truster to do? This refers to additional “-ility” requirements of the system that support the truster in their trust assessment and ultimately decision. Answers could include verifiability, reproducibility, traceability [41], reviewability [42], accountability [43], auditability [44], or trialability [45]. Different answers will require different pieces of evidence produced by different methods. For example, verifiability of an output may require an explanation of how the output was generated, whereas the reproducibility of an algorithm asks for information about the algorithm and its parameters.
(Q2) For what kind of component does the truster need evidence? Different components of the data engineering technology will require different methods. For example, methods applying to SQL processing [46] significantly differ from methods for data transformations in Map/Reduce pipelines [47]. This question also asks for the granularity of the component that the truster needs evidence of. Whether it is one, multiple, or only the output of a DETA will influence the choice of methods to use.

(Q3) What is the timeframe the truster needs evidence for? Depending on the trust scenario, the evidence should cover past information (e.g., evolution provenance [48]), real-time information (e.g., machine learning model explanation [35]), or future information (e.g., future use of sensitive data [49]).

(Q4) What type of information is needed? To provide the truster with the necessary evidence, different types of information can be used. Examples include factual information such as fairness scores [50], explanations of outcomes [35], or less technical information, e.g., limitations or legal considerations [40].

(Q5) What presentation is appropriate for the truster? Depending on the truster’s role, level of expertise, and other characteristics (influencing factors), the evidence has to be prepared and presented accordingly. Therefore, an appropriate level of abstraction and appearance have to be chosen that provide the evidence without overwhelming the truster. It could for instance be presented as in Datasheets for Datasets [40], where the information is presented as structured text and kept at a very technical level, or the evidence can be presented as in Nutritional Labels for Rankings [50], where the information is (visually) supported using icons, diagrams, and information boxes.

(Q6) What other requirements have to be fulfilled? Since the trust strategy has to fit the overall development plan and requirements, other (technical) requirements may also apply. These could include storage constraints [51], privacy considerations [52], access control [53], or execution speed [54].

After these questions have been answered for all previously determined relevant trust breakpoint-scenario combinations, the developers have enough information to identify or develop appropriate methods. One way to record the outcome is sketched below.
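One way to record the outcome of this requirements analysis is a simple structure per trust breakpoint-scenario combination (recall from Section 4.4 that these form many-to-many combinations), covering the answers to Q1-Q6. The following minimal sketch is our illustration only; the structure and field names are hypothetical, and the example values are loosely inspired by the use case of Section 6:

```python
# Minimal sketch of a requirements record per trust breakpoint-scenario
# combination, capturing the answers to Q1-Q6. Field names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class TrustRequirement:
    scenario: str            # identified trust scenario
    breakpoint: str          # DETA exposing the truster to a risk
    enable: list[str]        # Q1: e.g., "verifiability", "reproducibility"
    component: str           # Q2: component (and granularity) needing evidence
    timeframe: str           # Q3: "past", "real-time", or "future"
    information: list[str]   # Q4: e.g., "fairness scores", "explanations"
    presentation: str        # Q5: e.g., "structured text", "nutritional label"
    constraints: list[str] = field(default_factory=list)  # Q6: e.g., "privacy"

req = TrustRequirement(
    scenario="customer reviews their own credit report",
    breakpoint="record linkage step",
    enable=["verifiability"],
    component="output of the record linkage DETA",
    timeframe="real-time",
    information=["explanation of the matching decision"],
    presentation="non-technical, visually supported",
    constraints=["privacy considerations"],
)
print(req)
```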


5. Trust strategy evaluation

After the trust strategy has been defined and implemented, including the collection of evidence, the question remains whether the strategy performs as expected. That is, whether the collected evidence helps trusters to establish a trust relationship with the trustee, in our setting a data engineering technology. In this section, we discuss how the notion of trust we defined in this paper can possibly be evaluated and a trust strategy validated. Given the complexity of human trusters through the number and variety of influencing factors on trust, we postulate that a trust reaction can hardly be simulated, as has been attempted for instance by Ribeiro et al. [35]. Therefore, we suggest resorting to proper user studies, analogously to studies conducted for instance in social sciences or human-computer interaction, to evaluate a trust strategy. We provide guidelines on how to perform such studies relating to trust in data engineering.

5.1. Study participants and goals

As we have seen, a trust strategy is designed and implemented specifically for a trust scenario. Therefore, the evaluation of the strategy should reuse this scenario in order to validate the strategy with respect to the scenario. This means that participants in the user study should have the same role towards the application as the truster in the scenario. Furthermore, the participants should satisfy the modeled requirements on trusters, i.e., they should be aware that the application is uncertain and that its use is related to a specific risk, as defined in the scenario. To ensure this, proper participant selection and gauging questions in the questionnaire of the user study are possible methods one can employ. Additionally, we recommend properly introducing the participants to the scenario, where they should be made aware of their role and the risk the application can pose.

Before deciding on the study setup or devising the questionnaire, the question of what hypotheses to verify needs to be answered. One example of such a hypothesis is: “The devised trust strategy increases the perceived trustworthiness of the data engineering technology compared to the same technology without trust enrichment.” Clearly, the hypothesis should explicitly focus on an aspect of the trust model, for which the impact of the trust strategy is then evaluated. The impact itself also encompasses different possible aspects, e.g., perceived trustworthiness (w.r.t. the image in the model), actual use, etc. This should be clarified as part of the hypothesis. Finally, the scope of the evaluation needs to be defined, clarifying which aspects of the trustee are covered (e.g., the whole data engineering technology or just selected DETAs).

5.2. Methods for trust evaluation

Once the “what” has been defined, one can address the question of “how” to conduct the study. Here, study designers have to decide which methods to use to evaluate the target aspects. The notion of trust is inherently difficult to quantify, which explains why a set of measurable proxies is usually used that, ideally, highly correlate with the aspects of interest. We review methods that have been used to evaluate trust and which are amenable to our data engineering setting.
Experiments. For interpersonal trust, researchers have conducted various studies in which the participants could choose between different options [55, 56]. Each of these was implicitly related to trust or distrust based on a risk and reward system. By tracking participants’ actions, researchers could conclude whether the participants trusted each other or not. This technique can be adjusted for evaluating data engineering technology by creating evaluation scenarios in which the participants can actively choose between different options that correlate with trust or distrust. Recording the decisions of participants can be used as a proxy to measure actual use.

Questionnaires. In designing questionnaires to evaluate trust in data engineering technology, we can adapt and extend questionnaires that have been devised to evaluate trust in other settings. Examples of questions used to measure trust appear in the trust section of the General Social Survey [57] (an annually conducted study in the US). Another option is to derive trust questions analogous to the questions on usability and understandability of the technology acceptance model (TAM) [58]. These techniques allow examining the thoughts, attitudes, etc. of the participants, including perceived trustworthiness, intention to use, and perceived risk.
   Structured interviews and unstructured questionnaires.      storing company. This is performed by a dedicated data
Information about perception, attitudes, etc. that are dif-    engineering software, which we assume to be similar
ficult to express in a question with predefined answers        to the pipeline for a similar goal described in [60]. Fol-
can be collected or captured via interviews or free text       lowing the steps of the general data engineering process
fields in questionnaires. This includes, e.g., the reasoning   outlined in Figure 2, the goal definition is to correctly
behind participants’ answers to structured questions or        update the master database, given the data of a newly re-
additional comments on the study. Such answers can             ported activity record. In this context, the data collection
provide valuable information on aspects that study de-         step includes accessing data of the master database (we
signers did not anticipate and offer insights on how to        can assume an SQL query interface) and newly reported
potentially improve the technology, including the trust        records, e.g., obtained via an API. The subsequent data
strategy.                                                      processing that will result in the transformed (updated)
   Quantitative metrics. In some settings, it is possible      master database is all part of data wrangling. Sub-tasks
to include quantitative metrics into the trust evaluation.     of data wrangling in our use case include the standardiza-
For instance, Wintersberger et al. [59] measure the heart      tion of addresses to all be in the same format, the match-
rate of their participants during their study on trust in      ing of a record from the master database corresponding
traffic augmentation for automated driving systems. In         to the same person as the new entry (record linkage) pos-
their scenario, there was a correlation between heart          sibly followed by human intervention when the match
rate and trust. For data engineering technology, other         is uncertain (e.g., when no global unique identifier like
quantitative metrics such as reaction time may apply.          a social security number is available and not all fields
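To make these proxies concrete, the following minimal Python sketch shows how the decisions from a choice-based experiment (a proxy for actual use) and reaction times could be recorded and aggregated per study group. The names (Trial, proxies_per_group) and fields are purely illustrative assumptions, not part of an existing study framework.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Trial:
    """One decision a participant takes while interacting with the trustee."""
    participant: str
    group: str             # e.g., "without_explanation" vs. "with_explanation"
    followed_system: bool  # did the participant accept the system's suggestion?
    decision_seconds: float

def proxies_per_group(trials: list[Trial]) -> dict[str, dict[str, float]]:
    """Aggregate two measurable trust proxies per study group: the rate of
    following the system (a proxy for actual use) and the mean decision time
    (a quantitative metric akin to reaction time)."""
    result: dict[str, dict[str, float]] = {}
    for group in sorted({t.group for t in trials}):
        subset = [t for t in trials if t.group == group]
        result[group] = {
            "follow_rate": mean(1.0 if t.followed_system else 0.0 for t in subset),
            "mean_decision_seconds": mean(t.decision_seconds for t in subset),
        }
    return result
```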
                                                               match). If a match is identified, the record on file and
                                                               the new record are merged to a new record (data fusion).
6. Application of our methods to a use case

After defining our model of trust with respect to data engineering technology as well as guidelines on how to devise and evaluate a corresponding trust strategy, we put our approach to the test by applying it to a real-world use case. We describe the use case and its trust strategy development in Section 6.1 and report on its evaluation in Section 6.2.

6.1. Record linkage in a credit scoring application

Credit scores for individuals as provided by companies like Equifax or TransUnion are widely used to evaluate the "creditworthiness" of individuals. This can have a significant impact on people's lives: depending on their credit score, they may or may not be granted a loan, may have to pay higher or lower interest rates, may or may not be preferred in the competitive housing market when signing a lease, etc. Therefore, it is crucial for all parties (the customers, banks, landlords) that a person's credit history or report, on which the scores are based, is correct and complete. A report itself comprises various customer activities potentially related to the customer's creditworthiness, which are shared by different entities (banks, insurers, credit card companies, mobile phone providers, etc.) cooperating with the credit scoring company. Examples include the opening of a bank account, successfully paying back a loan, etc.

To ensure the data of persons' credit reports are accurate, newly shared customer activities need to be integrated into the consolidated master database of the credit scoring company. This is performed by dedicated data engineering software, which we assume to be similar to the pipeline described for a similar goal in [60]. Following the steps of the general data engineering process outlined in Figure 2, the goal definition is to correctly update the master database, given the data of a newly reported activity record. In this context, the data collection step includes accessing data of the master database (we can assume an SQL query interface) and newly reported records, e.g., obtained via an API. The subsequent data processing that results in the transformed (updated) master database is all part of data wrangling. Sub-tasks of data wrangling in our use case include the standardization of addresses so that they all follow the same format, and the matching of a record from the master database corresponding to the same person as the new entry (record linkage), possibly followed by human intervention when the match is uncertain (e.g., when no globally unique identifier like a social security number is available and not all fields match). If a match is identified, the record on file and the new record are merged into a new record (data fusion). The merged record is then written back to the master database, which can then be queried by subsequent applications, such as an application deriving a credit score.
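To make the pipeline concrete, the following Python sketch outlines the three wrangling sub-tasks in simplified form: address standardization, record linkage with a hand-off to human review for uncertain cases, and data fusion. The similarity score and merge rule are deliberately naive placeholders, not the logic of a production system such as the one described in [60].

```python
import re

def standardize_address(record: dict) -> dict:
    """Address standardization: normalize whitespace and case (toy rule;
    real systems typically call an external address check service)."""
    rec = dict(record)
    rec["address"] = re.sub(r"\s+", " ", rec.get("address", "")).strip().lower()
    return rec

def similarity(a: dict, b: dict) -> float:
    """Fraction of key fields with equal, non-missing values."""
    fields = ("firstname", "lastname", "date_of_birth", "address")
    return sum(a.get(f) is not None and a.get(f) == b.get(f) for f in fields) / len(fields)

def link_record(new_rec: dict, master: list[dict], threshold: float = 0.75) -> dict | None:
    """Record linkage: return the best-matching master record if the match is
    certain enough; None signals that a human should review the candidate."""
    best = max(master, key=lambda r: similarity(new_rec, r), default=None)
    return best if best is not None and similarity(new_rec, best) >= threshold else None

def fuse(on_file: dict, new_rec: dict) -> dict:
    """Data fusion: merge the new activity into the record on file;
    non-empty new values take precedence (a deliberately naive rule)."""
    merged = dict(on_file)
    merged.update({k: v for k, v in new_rec.items() if v not in (None, "")})
    return merged
```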
6.1.1. Trust scenarios

In the use case introduced above, the first step towards devising a trust strategy is to define trust scenarios. To this end, we first identify various parties (possible trusters) that have some kind of relationship with the data engineering application that can potentially be a trust relationship. These include, for instance, the customers, whose personal data are stored and evaluated by the credit scoring company, and the employees of the credit scoring company, who should trust the technology to support them in their task of matching and merging records.
Let us now analyze the potential trust relationship between a customer in the role of truster and the data engineering technology (trustee) in more detail. Clearly, the customer relies on the credit reporting technology (e.g., accessible through a web interface) to be able to provide the described service (maintaining the credit report), e.g., to secure a loan. While the customer may be aware of the impact a (wrong) credit history can have on the loan application, the customer usually simply expects the service to work as intended, considering it an instrument to achieve a goal. As we saw in Section 2.1.3, this rather qualifies as trust as reliance. Also, customers may not be aware that the underlying technology cannot be completely verified and can exhibit quality issues.

This picture changes when we turn our attention to the employees involved in the "human-in-the-loop" data engineering technology as potential trusters in a trust relationship. Clearly, being part of the process, they are well aware that the data engineering technology cannot be completely verified and can cause quality issues. They are also aware of the risk the use of the technology poses, not necessarily to themselves but to their friends, their relatives, and society in general. For their work, however, they rely on the technology, and depending on company policy, the use of technology bearing some uncertainty with respect to quality may also put these employees at risk, e.g., if a performance review reveals that they matched and merged a significant number of credit reports that led to claims for correction or to overly generous credit scores for non-creditworthy customers. Overall, we see that employees meet all the criteria of a truster in a trust relationship as defined by our trust model.

On the trustee side, the credit reporting technology comprises several DETAs, e.g., the different steps of the data engineering pipeline we described above. Given the common uses of such technology, it undoubtedly has some social power. As mentioned before, it also exhibits some uncertainty and unverifiability in how the credit reports are generated. Influencing factors relating to the DETAs are mainly their functionality and quality. Besides the credit reporting technology itself, developers and designers, but also the reporting entities cooperating with the credit scoring company, potentially affect the trust employees put into the trustee.

Given the discussion above, we focus on devising a trust strategy for the trust scenario defined by the trust relationship between the employees and the data engineering technology they use to consolidate credit reports.

6.1.2. Trust breakpoints

For the specific trust scenario identified above, we consider several trust breakpoints, i.e., DETAs that may affect the employees' trust relationship with the technology. A first review reveals, for instance, that during data collection, trust may be jeopardized by the reporting entities, which may transmit erroneous data. During data wrangling, the address standardization may sometimes be inaccurate, depending on which (external) address check service is used. Next, the record linkage may match the wrong records or present the employees with what can be perceived as misleading information for making their decision. Finally, the merge of records could yield an erroneous record. We consider employees unlikely to question the data collection or address standardization DETAs directly (they more likely may not trust the external entities serving as data providers, which are other trust relationships). We assume their trust relationship is mostly affected by the internal workings of the assistance the system gives them during record linkage or merging. To demonstrate the development of a trust strategy, we focus on the first of these two breakpoints.

6.1.3. Trust strategy

In order to devise a trust strategy for the trust scenario and breakpoint identified above, we answer the questions proposed in Section 4.5. Essentially, the trust strategy should enable employees of a credit scoring company who consolidate personal data to judge the trustworthiness of the technology, which, in this scenario, we assume relates mostly to the verifiability of its functionality and quality (Q1). Given the trust breakpoint under consideration, we need evidence for the record linkage component (Q2). As the employees make point-wise match decisions, working with the technology on each individual case, the adequate time frame for evidence is "the now", i.e., real time (Q3). Considering what type of information is needed as evidence, we argue that employees are probably interested in explanations of how the program came to the conclusion that two records could match, while design decisions at the system level and implementation details are not pertinent (Q4). In terms of presentation, employees benefit from simple and easy-to-understand explanations that do not use technical terms from the underlying algorithms, as well as from visual cues that support the understandability of the explanations (Q5). We consider no additional requirements (Q6).

With the answers to the questions given above, we can determine suitable methods and algorithms to implement the trust strategy, where we essentially opt to provide employees with an explanation of matching candidates that serves as evidence of the trustee's ABI, so that the employees can potentially gain trust in the system's behavior.
6.2. Evaluating the trust strategy

The goal of the trust strategy in this use case is to foster the trust of employees in the data engineering technology they use, by means of explanations. To evaluate whether the trust strategy implementation achieves this goal, we conduct a user study, following our discussion in Section 5. This section summarizes the study design, presents results, and discusses these.

6.2.1. Study design

The participants we aim to recruit take the position of employees of a fictitious credit scoring company and review ambivalent decisions of a record linkage DETA. Given the ongoing pandemic, we design an online study. From the different methods for trust evaluation (see Section 5.2), we focus mostly on questionnaires to capture the participants' stance on the data engineering technology. The study includes three main sections, which we summarize next. Full details are available on our repeatability website¹.

¹ https://www.ipvs.uni-stuttgart.de/departments/de/research/projects/fat_dss/

In its first section, the study provides an introduction to the setting of the study and the topic of record linkage in the context of credit report generation. Thereby, we enable the participants to make informed decisions in the next section focusing on record linkage and raise their awareness of the underlying potential risk. We further add questions based on 7-grade Likert scales to assess the participants' ambivalent attitude towards the technology they evaluate and their risk awareness with respect to the scenario. Answers to these questions allow us to verify the assumptions stated in Section 4.1. We also include test questions to determine whether participants have understood the problem of record linkage.

Next, participants are presented with potential matches, i.e., pairs of records the system suggests to be matches, for which the participants, in their role as employees, have to decide whether they agree with the system or not. The study comprises 60 matches that each participant reviews. We ensure that these matches cover diverse real-life match situations of varying difficulty in a balanced way. The participants are shown the matches in a random order.
To evaluate the effect of the trust strategy, participants are split into two groups: one gets to see explanations alongside the matches, the other does not. Different options for record linkage explanation have been proposed (e.g., [61, 62, 63]). We rely on both the visualization of feature importance, using different color highlights for attributes that are important for making a match decision and for attributes that are important towards a non-match, and explanations in the form of a human-readable model approximation, listing positive semantic indicators (e.g., the important fields firstname, lastname, and date of birth are equal) and negative semantic indicators (e.g., contradictory gender).
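For illustration, deriving such indicator lists from a candidate pair could be sketched as follows. The field set and the function explain_candidate are illustrative assumptions; the actual explanations additionally rely on the color highlighting described above.

```python
def explain_candidate(a: dict, b: dict) -> dict[str, list[str]]:
    """Derive human-readable indicators for a candidate match: equal fields
    support a match (shown with one highlight color), contradictory fields
    count against it (shown with another)."""
    positive, negative = [], []
    for field in ("firstname", "lastname", "date_of_birth", "gender", "address"):
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # a missing value yields no indicator
        if va == vb:
            positive.append(f"{field} is equal ({va})")
        else:
            negative.append(f"contradictory {field} ({va} vs. {vb})")
    return {"match_indicators": positive, "non_match_indicators": negative}
```

For a pair that agrees on firstname and lastname but differs in gender, this sketch returns two match indicators and the non-match indicator "contradictory gender (f vs. m)", mirroring the examples given above.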




In the third section of the study, each participant answers an exit questionnaire that covers several aspects, including usability, by adapting questions from the TAM [58]. We formulate additional questions to assess perceived risk and trustworthiness (see Figure 4), following the same rationale as the TAM questions. The answers to these questions again follow a 7-grade Likert scale, ranging from the most positive answer "strongly agree" (1) to the most negative answer "strongly disagree" (7). The study section concludes with a free-text field for additional remarks.

During the second section of the study, we capture participants' decision time per match as a quantitative metric.

6.2.2. Results

At the time of submission, a total of 19 participants with a computer science background took part in our user study (10 without / 9 with explanations). We opted for participants with a computer science background to ensure that all participants have a general understanding of data engineering technology and better grasp the task we ask them to perform. Based on responses to the first section of the study, we conclude that the participants are generally optimistic that technology can be helpful rather than harmful (mean of 2.7), while they are aware that the technology may put others at risk (mean of 2.7). Thus, they are aware and careful because of the associated risks (mean of 2.7).

To determine whether the explanations implemented following the devised trust strategy have any effect, we analyze whether there is a statistically significant difference between the group of participants without explanations and the group with explanations. Considering reaction time, the accuracy of participant match decisions, and the Likert-scale questions relating to trust, the applicable statistical tests (t-tests or Wilcoxon-Mann-Whitney tests) do not reveal a difference between the groups of participants with and without explanations. We thus cannot conclude that explanations have a significant effect on the interaction between employees and the record linkage DETA, in particular on trust. While the study may benefit from a larger number of participants, the current results show that statements of the sort "explanations are a means to improve trust" should be used cautiously, as it remains an open question in our use case (and others that have not been evaluated) whether this holds. Clearly, there is a need for a more systematic consideration of trust, of how to possibly integrate it in the design of a data engineering technology (and others), and of how to evaluate it. The contributions of this paper are a first step in that direction.
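For illustration, such a between-group comparison could be run with SciPy as sketched below. The function compare_groups, the placeholder samples, and the choice of Welch's t-test for metric proxies are our illustrative assumptions, not a description of the exact analysis pipeline (the study data and analysis details are on our repeatability website).

```python
from scipy.stats import mannwhitneyu, ttest_ind

def compare_groups(without_expl: list[float], with_expl: list[float],
                   alpha: float = 0.05) -> dict:
    """Compare one trust proxy (e.g., Likert ratings for a trust question or
    decision times) between the two study groups. The Wilcoxon-Mann-Whitney
    test suits ordinal Likert data; Welch's t-test applies to metric proxies."""
    _, p_mwu = mannwhitneyu(without_expl, with_expl, alternative="two-sided")
    _, p_t = ttest_ind(without_expl, with_expl, equal_var=False)
    return {"mann_whitney_p": p_mwu, "welch_t_p": p_t,
            "significant": p_mwu < alpha}
```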
Figure 3: Excerpt of the answers on perceived risk and trust by participants without and with explanations. [Plot omitted; it shows the distribution of answers to Q4.15–Q4.19 on the 7-grade Likert scale (y-axis), per study group (x-axis: without explanation, with explanation).]

Figure 4: Study questions relating to the judgment, perceived trust, and eventual intention to use.
Q4.15: I would feel safe if people's data were processed by this system.
Q4.16: I would feel at risk if the system was used to decide about me and my data.
Q4.17: I believe in the benefits of the new system.
Q4.18: Assuming I have the power to make decisions in a credit scoring company, I would predict that I would decide to use the system.
Q4.19: I trust the system.
Questions included in the first and third sections of the study can further be used to compare the "state of mind" of participants before and after they have interacted and gained some experience with the record linkage system. Here, we determine that, without explanations, participants show increased trust in the potentially risky technology after the study compared to before the study (p=0.039). This could not be observed in the presence of explanations. On the contrary, we observe a statistically significant decrease in technological optimism and trust for participants that were shown explanations (p=0.009). That is, not only can we not confirm that explanations are helpful to foster trust, but we have an indication that they may actually harm it. A reason may be that explanations give employees further information they can question or that may raise suspicion, outweighing the possible benefits of explanations.

While not showing a statistically significant difference between the two studied groups, we still provide some further discussion of the answers to the questions relating to the judgment, perceived trust, and eventual intention to use (Q4.15–Q4.19, summarized in Figure 4). The answers to these questions, ranging from 1 (strongly agree) to 7 (strongly disagree), are summarized in Figure 3. We see that while the majority of participants in either of the two groups do not feel safe (Q4.15) but rather at risk (Q4.16), they do believe in the benefits of the system (Q4.17). Also, the majority of participants, irrespective of whether they have been shown explanations or not, predict that they would decide to use the system (Q4.18). However, when directly asked about trust, participants with explanations tend to give a lower rating to perceived trust (Q4.19). Indeed, while all but one participant in this group gave a neutral or negative rating (the median as well as the most positive value are 4), more positive ratings are given by almost half of the participants not having seen explanations.

Finally, we report on the two main comments participants provided as part of the final unstructured question. First, participants inquired about further details concerning the step following record linkage, i.e., the merging of matched records. This indicates that the second trust breakpoint we identified in our use case is indeed relevant. Second, participants indicated that additional information in the records, such as bank account numbers, would be helpful for their task. This can be seen as relating to the system's functionality and quality.

7. Conclusion and Outlook

This paper started a nuanced discussion on trust in data engineering. Grounded in established notions of trust from philosophy and psychology, we defined a trust model and proposed guidelines on how to consider such trust when developing data engineering pipelines by devising a trust strategy. Such a strategy ideally fosters trust in data engineering applications, which needs to be validated. To this end, we suggested a general evaluation procedure. We applied our methods to a real-world use case, demonstrating the applicability of the model, guidelines, and evaluation procedure. However, our evaluation failed to assert that the explanations we provided as evidence fostered trust in our use case, reinforcing our initial motivation: statements like "explanations improve trust" may be unfounded. This highlights the need for further investigation into systematically incorporating and evaluating trust in data engineering.

References

[1] J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks, https://propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing, 2016.
[2] L. Sweeney, Discrimination in online ad delivery, Queue 11 (2013).
[3] S. Lowry, G. Macpherson, A blot on the profession, British Medical Journal (Clinical Research Ed.) 296 (1988) 657–658.
[4] X. L. Dong, E. Gabrilovich, K. Murphy, V. Dang, W. Horn, C. Lugaresi, S. Sun, W. Zhang, Knowledge-based trust: Estimating the trustworthiness of web sources, Proceedings of the VLDB Endowment 8 (2015) 938–949.
[5] A. Fariha, A. Tiwari, A. Radhakrishna, S. Gulwani, A. Meliou, Conformance constraint discovery: Measuring trust in data-driven systems, in: Proceedings of the 2021 International Conference on Management of Data, 2021, pp. 499–512.
[6] X. Zhang, B. Qian, S. Cao, Y. Li, H. Chen, Y. Zheng, I. Davidson, INPREM: An interpretable and trustworthy predictive model for healthcare, in: ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020, pp. 450–460.
[7] P. Vassiliadis, A. Simitsis, S. Skiadopoulos, Conceptual modeling for ETL processes, in: Proceedings of the ACM International Workshop on Data Warehousing and OLAP, 2002, pp. 14–21.
[8] I. F. Ilyas, X. Chu, Data Cleaning, Morgan & Claypool, 2019.
[9] A. Doan, A. Halevy, Z. Ives, Principles of Data Integration, 2012.
[10] C. Dai, D. Lin, E. Bertino, M. Kantarcioglu, An approach to evaluate data trustworthiness based on data provenance, 2008, pp. 82–98.
[11] C. Dai, H. Lim, E. Bertino, Y. Moon, Assessing the trustworthiness of location data based on provenance, in: 17th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, 2009, pp. 276–285.
[12] L. D. Santis, M. Scannapieco, T. Catarci, Trusting data quality in cooperative information systems, in: On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE, 2003, pp. 354–369.
[13] H. Felzmann, E. F. Villaronga, C. Lutz, A. Tamò-Larrieux, Transparency you can trust: Transparency requirements for artificial intelligence between legal norms and contextual concerns, Big Data & Society 6 (2019) 1–14.
[14] M. Janic, J. P. Wijbenga, T. Veugen, Transparency enhancing tools (TETs): An overview, in: Third Workshop on Socio-Technical Aspects in Security and Trust, 2013, pp. 18–25.
[15] K. Siau, W. Wang, Building trust in artificial intelligence, machine learning, and robotics, Cutter Business Technology Journal 31 (2018) 47–53.
[16] M. Herschel, R. Diestelkämper, H. Ben Lahmar, A survey on provenance: What for? What form? What from?, The VLDB Journal 26 (2017) 881–906.
[17] B. Glavic, Big data provenance: Challenges and implications for benchmarking, in: Specifying Big Data Benchmarks - First Workshop and Second Workshop, WBDB, Revised Selected Papers, 2012, pp. 72–80.
[18] L. Kot, Tracking personal data use: Provenance and trust, in: Seventh Biennial Conference on Innovative Data Systems Research, 2015, p. 1.
[19] Y. L. Simmhan, B. Plale, D. Gannon, A survey of data provenance in e-science, SIGMOD Rec. 34 (2005) 31–36.
[20] J. P. Sullins, Trust in robots, in: The Routledge Handbook of Trust and Philosophy, Routledge, 2020, pp. 313–325.
[21] Plato, The Republic, 1994. URL: http://classics.mit.edu/Plato/republic.html.
[22] F. M. Alonso, Reasons for reliance, Ethics 126 (2016) 311–338.
[23] M. N. Smith, Reliance, Noûs 44 (2010) 135–157.
[24] C. McLeod, Trust, in: E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, Fall 2020 ed., Metaphysics Research Lab, Stanford University, 2020.
[25] J. Simon, The Routledge Handbook of Trust and Philosophy, Routledge, 2020.
[26] R. C. Mayer, J. H. Davis, F. D. Schoorman, An integrative model of organizational trust, Academy of Management Review 20 (1995) 709–734.
[27] F. Grodzinsky, K. Miller, M. J. Wolf, Trust in artificial agents, in: The Routledge Handbook of Trust and Philosophy, Routledge, 2020, pp. 298–312.
[28] M. Coeckelbergh, Can we trust robots?, Ethics and Information Technology 14 (2011) 53–60.
[29] A. M. Evans, J. I. Krueger, The psychology (and economics) of trust, Social and Personality Psychology Compass 3 (2009) 1003–1017.
[30] J. A. Simpson, Foundations of interpersonal trust, in: Social Psychology: Handbook of Basic Principles, 2007, pp. 587–607.
[31] D. Dunning, D. Fetchenhauer, T. Schlösser, Why people trust: Solved puzzles and open mysteries, Current Directions in Psychological Science 28 (2019) 366–371.
[32] M. Deutsch, Trust and suspicion: Theoretical notes, in: The Resolution of Conflict, 1973, pp. 143–176.
[33] H. Jiang, B. Kim, M. Guan, M. Gupta, To trust or not to trust a classifier, in: Advances in Neural Information Processing Systems, volume 31, 2018, pp. 1–12.
[34] M. Reyes, R. Meier, S. Pereira, C. A. Silva, F.-M. Dahlweid, H. v. Tengg-Kobligk, R. M. Summers, R. Wiest, On the interpretability of artificial intelligence in radiology: Challenges and opportunities, Radiology: Artificial Intelligence 2 (2020) 1–12.
[35] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
[36] S. M. Meeßen, M. T. Thielsch, G. Hertel, Trust in management information systems (MIS), Zeitschrift für Arbeits- und Organisationspsychologie A&O 64 (2020) 6–16.
[37] L. Thornton, B. Knowles, G. Blair, Fifty shades of grey, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 64–76.
[38] R. Tomsett, D. Braines, D. Harborne, A. Preece, S. Chakraborty, Interpretable to whom? A role-based model for analyzing interpretable machine learning systems, in: Workshop on Human Interpretability in Machine Learning, 2018, pp. 8–14.
[39] C. B. Gibson, J. A. Manuel, Building trust - effective multicultural communication processes in virtual teams, in: Virtual Teams That Work: Creating Conditions for Virtual Team Effectiveness, Jossey-Bass, 2003, pp. 59–86.
[40] T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. Daumé III, K. Crawford, Datasheets for datasets, in: Proceedings of the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2018, pp. 1–17.
[41] J. A. Kroll, Outlining traceability, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 758–771.
[42] J. Cobbe, M. S. A. Lee, J. Singh, Reviewable automated decision-making: A framework for accountable algorithmic systems, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 598–609.
[43] M. Wieringa, What to account for when accounting for algorithms: A systematic literature review on algorithmic accountability, in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 1–18.
[44] R. Cloete, C. Norval, J. Singh, A call for auditable virtual, augmented and mixed reality, in: 26th ACM Symposium on Virtual Reality Software and Technology, 2020, pp. 1–6.
[45] R. Agarwal, J. Prasad, The role of innovation characteristics and perceived voluntariness in the acceptance of information technologies, Decision Sciences 28 (1997) 557–582.
[46] C. Li, Z. Miao, Q. Zeng, B. Glavic, S. Roy, Putting things into context: Rich explanations for query answers using join graphs, in: Proceedings of the 2021 ACM SIGMOD International Conference on Management of Data, 2021, pp. 1051–1063.
[47] M. Interlandi, K. Shah, S. D. Tetali, M. A. Gulzar, S. Yoo, M. Kim, T. D. Millstein, T. Condie, Titian: Data provenance support in Spark, Proceedings of the VLDB Endowment 9 (2015) 216–227.
[48] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, Y. Zhao, Scientific workflow management and the Kepler system, Concurrency and Computation: Practice and Experience 18 (2006) 1039–1065.
[49] S. Oppold, M. Herschel, Accountable data analytics start with accountable data: The liquid metadata model, in: ER Forum/Posters/Demos, 2020, pp. 59–72.
[50] K. Yang, J. Stoyanovich, A. Asudeh, B. Howe, H. Jagadish, G. Miklau, A nutritional label for rankings, in: Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1773–1776.
[51] A. P. Chapman, H. V. Jagadish, P. Ramanan, Efficient provenance storage, in: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008, pp. 993–1006.
[52] S. B. Davidson, S. Khanna, S. Roy, J. Stoyanovich, V. Tannen, Y. Chen, On provenance and privacy, in: Proceedings of the 14th International Conference on Database Theory, 2011, pp. 3–10.
[53] A. Chebotko, S. Lu, S. Chang, F. Fotouhi, P. Yang, Secure abstraction views for scientific workflow provenance querying, IEEE Transactions on Services Computing 3 (2010) 322–337.
[54] N. Bidoit, M. Herschel, A. Tzompanaki, Efficient computation of polynomial explanations of why-not questions, in: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015, pp. 713–722.
[55] R. L. Swinth, The establishment of the trust relationship, Journal of Conflict Resolution 11 (1967) 335–344.
[56] E. L. Glaeser, D. I. Laibson, J. A. Scheinkman, C. L. Soutter, Measuring trust, Quarterly Journal of Economics 115 (2000) 811–846.
[57] T. W. Smith, M. Davern, J. Freese, S. L. Morgan, General Social Surveys, https://gss.norc.org/, 1972–2018.
[58] F. D. Davis, A technology acceptance model for empirically testing new end-user information systems: Theory and results, Ph.D. thesis, Massachusetts Institute of Technology, 1986.
[59] P. Wintersberger, T. von Sawitzky, A.-K. Frison, A. Riener, Traffic augmentation as a means to increase trust in automated driving systems, in: Proceedings of the 12th Biannual Conference on Italian SIGCHI Chapter, 2017, pp. 1–7.
[60] M. Weis, F. Naumann, U. Jehle, J. Lufter, H. Schuster, Industry-scale duplicate detection, Proceedings of the VLDB Endowment 1 (2008) 1253–1264.
[61] S. Thirumuruganathan, M. Ouzzani, N. Tang, Explaining entity resolution predictions: Where are we and what needs to be done?, in: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2019, pp. 1–6.
[62] A. Ebaid, S. Thirumuruganathan, W. G. Aref, A. K. Elmagarmid, M. Ouzzani, EXPLAINER: Entity resolution explanations, in: 35th IEEE International Conference on Data Engineering, 2019, pp. 2000–2003.
[63] S. Gurajada, L. Popa, K. Qian, P. Sen, Learning-based methods with human-in-the-loop for entity resolution, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 2969–2970.