Trust in data engineering: reflection, framework, and evaluation methodology

Sarah Oppold1,*, Melanie Herschel1
1 University of Stuttgart, Germany

Proc. of the First International Workshop on Data Ecosystems (DEco'22), September 5, 2022, Sydney, Australia
* Corresponding author
sarah.oppold@ipvs.uni-stuttgart.de (S. Oppold); melanie.herschel@ipvs.uni-stuttgart.de (M. Herschel)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).

Abstract
Trust is and has been essential to human interactions. With the rise of technology, we now live in a socio-technical environment where people frequently interact with technology as well. It is therefore natural to expect that people will also develop trust in technology. Data engineering researchers have at least assumed this when claiming that certain methods they devise (e.g., explanations using provenance) likely help to foster some notion of trust. But rarely is the notion of trust clarified or this claim validated. We propose a more systematic consideration of trust in data engineering technology, compared to the ad-hoc state of the art. Therefore, we first review the notion of trust established in other disciplines, based on which we derive a model for trust in data engineering technology. We then present guidelines on how to proceed to devise a trust strategy aiming at enriching data engineering technology such that it potentially fosters trust conforming to our model. We further discuss how to possibly evaluate a trust strategy. We apply our trust model to a use case, for which we devise, implement, and evaluate a trust strategy using our proposed guidelines and methods. The results of our evaluation indicate that statements like "transparency helps build trust" should be used cautiously. This highlights the need for contributions like those we present here, as only a more systematic approach to defining, integrating, and evaluating trust in data engineering can bring us a step closer to provably fostering trust in such technologies.

1. Introduction

Our society depends on us humans trusting each other. From crossing the streets, to collaborating with coworkers, to being treated by doctors, our society is built on trust. The rise of technology and its integration into our world has created a socio-technical environment where humans live together with technology. This means that we now not only have to trust other humans, we also have to establish a similar relationship to technology rather than second-guessing its every "action", in order to benefit, for instance, from its improvements in efficiency or productivity.

In an increasingly data-driven world, data engineering, data analysis, and machine learning are software technologies that can significantly affect human lives (e.g., [1, 2, 3]) and for which some notion of trust has been recognized as an aspect to consider (e.g., [4, 5, 6]). This paper focuses on trust in data engineering, which encompasses the full data preparation pipeline to get from raw data (as collected) to data "fit for analysis", e.g., data used for training machine learning models. Typical data engineering steps include data transformation [7], cleaning [8], and integration [9]. Data engineering is usually required in any data-driven process and a plethora of systems and algorithms for it exist.

While trust in such engineered data has recently gained attention – yielding approaches to possibly quantify, assess, or even improve trust – we observe that the notion of trust is usually not well defined and does not correspond to the concept of trust established in other disciplines, e.g., philosophy or psychology. In a first line of research, the notion of trust considered in the context of data engineering and data analysis reduces to a possibly related metric, and trust in the broader sense is neither considered nor evaluated. For instance, trust as understood in [5] reduces back to the accuracy of a machine learning model. In [10, 11, 12], trust is quantified, e.g., based on the similarity of information and source provenance provided by different data sources. While the resulting trust scores are measured in different settings, it is never validated whether or not the scores actually correspond to some established notion of trust.
A second line of research considers transparency and explanations to foster trust (see, e.g., [13, 14, 15]). In this context, data provenance [16], which offers transparency in data engineering pipelines, is frequently named as relevant for evaluating trust (e.g., in [10, 17, 18, 19]). Yet, we are not aware of any validation of this claim. In that sense, the use of the term trust in data engineering has been mostly ad-hoc, without a clear or consistent definition. Furthermore, methods to evaluate solutions for trust in data engineering with respect to such a definition are lacking.

Clearly, we need a more nuanced and systematic discussion on trust in data engineering, to which we contribute considering the following questions: How can we incorporate the concept of trust into the development process of data engineering pipelines to obtain trustworthy data engineering? How can we assess trust or trustworthiness in a data engineering pipeline? While we expect there are many different types of solutions, our focus here lies on technical solutions to possibly influence trust in data engineering. Our contributions are: (1) We critically review the term "trust" (Section 2) to define a theoretical model for trust in data engineering (Section 3). (2) Based on this model, we describe a framework for trust engineering that integrates trust in the data engineering pipeline and serves as a guideline to develop a trust strategy (Section 4). (3) We describe a general procedure one can use to evaluate a trust strategy (Section 5). (4) We apply our methods to devise a trust strategy for a use case based on a credit scoring scenario, where explanations are integrated into a data engineering step as evidence to possibly foster trust. Our systematic evaluation, however, reveals that the explanations may not reach this goal, highlighting the importance of a more systematic study of the problem with the methods we propose in this paper (Section 6). Note that we are aware that it is possible to manipulate and deceive people by creating an illusion of trustworthy data engineering solutions [20] and that our contributions can lead to such deceptions and manipulations. Countering or regulating this is however out of the scope of this paper.

2. Trust perspectives

As we motivated above, trust in data engineering and analysis has been considered in an ad-hoc manner, while it has been systematically discussed in other disciplines, leading to some common understanding of what trust typically entails.

2.1. Philosophical perspective on trust

The discussion on trust has a long history in philosophy [21] and while the concept remains elusive, there are some underlying ideas that most philosophers seem to agree upon. One key facet of the discussion that we highlight here is the distinction of trust and its related concept reliance. Note that while most philosophical research has dealt with interpersonal trust, our discussion will also review the philosophical perspective on trust in technology.

2.1.1. Reliance

In general, person A relies on a proposition p (e.g., that another person performs a certain action) to achieve their goals, when p is a productive means to achieve their goals and p has to be true for its success [22]. Reasons for reliance are often of a pragmatic nature [22]. We rely on forces beyond our control or even our comprehension [23].

2.1.2. Trust

Considering trust, a truster A usually trusts a trustee B to do C [24]. As natural, familiar, and elemental as trusting is for us humans, it is equally complicated to describe as a concept. What philosophers agree on is that trust entails that (1) A is somehow vulnerable to a risk when they trust B, and (2) A relies on B to both be competent and willing to do C [24]. Related to the psychological attitude of trust is the property "trustworthiness" that we can ascribe to others when we think that we can trust them (i.e., we think that they fulfill point 2).
While philosophers thus agree that trust is based on reliance (see point 2), they cannot agree what the additional factor is that differentiates trust from mere reliance. While some argue that the trustee's motive must be of some moral nature such as self-interest, goodwill, or moral integrity, others argue that the additional factor is some sort of normative expectation the truster has vis-à-vis the trustee. It seems to depend highly on the trust relationship example used. A different stance philosophers use to differentiate between trust and reliance is that if B fails A in a reliance relationship, A feels disappointed, whereas in a trust relationship, A feels betrayed [24]. Important characteristics of trust are pro-attitude (the truster wants the trustee to succeed in doing C), vulnerability, lack of control, and active acceptance of risk [25, 24].

While trust remains an elusive concept, a widely adopted model is the ABI trust model [26]. It identifies three factors of perceived trustworthiness: (A)bility, that is, the skills or competencies of the trustee, (B)enevolence, which refers to the extent to which the trustee is well-meaning to the truster, and (I)ntegrity, which is that the trustee seems upright in the eyes of the truster because they share a common set of values or principles. As we shall see in Section 3, we incorporate the ABI characteristics in our trust model targeting data engineering technology rather than a human as trustee.

2.1.3. Trust in technology

While philosophers have studied different variants of trust (e.g., self-trust, trust in groups, trust in organizations), they all are based on human interaction and communication [24]. Technology strongly differs from humans. On the one hand, it lacks human characteristics such as intentionality and hope [27], it cannot use language, and is not free to act as it will [28].
On the other hand, it presents other non-human characteristics such as opaqueness to the user, or unnoticed updates [27]. So can technology be a trustee in a trust relationship according to the previously described notion of trust? Indeed, when people talk about trusting technology, they sometimes talk about a computer artefact, a mere object that is just expected to work as intended, an object that is an instrument to achieve one's goals. This would be considered what philosophers call "trust as reliance" [27, 28] and not "real" trust.

However, if we take a closer look, technology often is more than just a simple artefact. Technology can feature "logical complexity, capacity to store and manipulate data, potential for sophisticated interaction with humans" [27] and can show unpredictable behavior [27, 28]. Thus, technology seems to encompass more than just mere objects that we rely on. In addition to that, humans, as the partner in a trust relationship with technology, can become emotionally involved in the relationship because trust comes easily for humans [20], who have a capacity to anthropomorphize (form bonds with machines similarly to how they personify pets) [20, 28]. Thus, within a socio-technical system, technology can appear as "quasi-other" with qualities similar enough to humans for them to create a trust relationship [28].

Trust in technology might not be human trust but something similar, lying between interpersonal trust and trust as reliance [27]. It might even be on a spectrum ranging from simple machines that only afford reliance and where the trust is based on functional criteria, up to complex autonomous machines with unpredictable behavior that cannot be verified but have to be trusted [20, 27, 28]. Further layers of trust need to be placed in the developers, designers, and company [20], which makes an analysis of trust in technology even more challenging.
To make the distinction between trust in technology and interpersonal trust more explicit, researchers have introduced some additional naming and have begun a differentiated discussion. Grodzinsky et al. [27] for example introduce new terms: they call trust in electronic and trust in physical (face-to-face) encounters E-Trust and P-Trust, respectively. Sullins [20] defines different situations of robotic trust, and Coeckelbergh [28] analyzes the impact of different cultures on trust in robots. In this context, our work focuses on E-Trust but, rather than focusing on robots, we focus on data engineering technology as trustee.

2.2. Psychological perspective on trust

While the philosophical approach is fueled by the intention to analyze human phenomena, psychologists attempt to assess why we engage in this behavior of trusting or distrusting another person. Psychologists also struggle to conceptualize and operationalize trust behavior, but see the same main characteristics of vulnerability, risk, uncertainty, and pro-attitude that are present in the philosophers' view [29, 30, 31, 32]. We consider psychological studies on behavioral causes to not be directly relevant to the development of a first model of trust in data engineering technology and thus leave their discussion deliberately short.

2.3. Computer science perspective on trust

Finally, we review the perspective from computer science on trust, with a special focus on trust with respect to data processing software that performs or relies on data engineering or data analysis. While trust is also considered in other branches of computer science (e.g., security and privacy), we do not review these in detail due to space constraints.

As we have pointed out in the introduction, the term trust is typically used in an ad-hoc way, yielding different notions of so-called trust that do not necessarily correspond to the common notion philosophy or psychology agree on. In particular, we observe that trust often reduces back to a measurable metric that is indicative of the quality or performance of a solution, but where it is unclear if and how it correlates with trust. Other work advocates that transparency and explanations are key factors to establish trust, which is typically not evaluated or validated though.

2.3.1. Metric-reduced trust

First approaches have emerged to quantify, assess, or even improve what the authors call trust in data processing. For example, [5, 33] attempt to measure trust of machine learning predictions. However, their trust boils down to the precision or accuracy of machine learning results. Similarity metrics are another category of metrics standing in for trust. For instance, [10, 11, 12] quantify trust based on the similarity of information and source provenance provided by different data sources. While the proposed methods are certainly valuable to improve the likelihood that approaches return the "correct" result and improve the overall quality or performance, this notion of trust does clearly not bear the same characteristics as trust reviewed in the previous subsections.

2.3.2. Transparency and explanations for trust

Several works discuss interpretability and explanations for machine learning models, seen as a possible means to improve trust (e.g., [34, 35]). The general argument is that such methods offer evidence and verifiability that foster trust in a user or developer. Ribeiro et al. [35] evaluate their methods for trust, but this evaluation either simulates users or equates trust with which model performs better (relating back to metric-reduced trust). Transparency and explanations in data engineering pipelines can be achieved via data provenance [16]. Also in this area, these are frequently named as relevant for evaluating trust (e.g., in [10, 17, 18, 19]). Yet, we are not aware of any work that has studied or validated how transparency and explanations truly relate to trust.

2.3.3. Towards trust modeling

As a starting point to address the aforementioned shortcomings, a more nuanced discussion about trust has recently emerged in the area of computer science. Siau and Wang [15] for example discuss trust in artificial intelligence (AI). They collect a set of different definitions for trust and derive a set of factors for trust in AI technology along multiple dimensions.
They also list a variety of approaches to build and then nurture trust in AI. Having focused on methods for trust in AI, this work lacks a catalog of methods for trust for data engineering. Furthermore, it does not include an actionable process taking up their discussion to "implement" trust in AI.

Meeßen et al. [36] derive a model for trust in Management Information Systems (MIS) based on both the ABI trust model [26], which we reviewed in Section 2.1.2, and research in automation and organizations. They translate the ABI terms from interpersonal trust to trust in MIS, allowing a more differentiated discussion about trust in technology. While MIS cover data engineering applications, the proposed trust model is centered around the trusters, mainly identifying factors such as perceived trustworthiness that lead to their use of an MIS. In addition, this work does not model or show what developers of MIS can actually do to build and foster trust that can lead to the decision to use the system.

Thornton et al. [37] call for a more nuanced discussion on the methods developers can use in order to foster trust, proposing what they call trust affordances: "characteristics of the technology by virtue of itself or of features designed into the technology to promote trust by providing access to evidence of (dis)trustworthiness specific to a user, a technology, and their context". As they consider technology in a broad sense, the discussion remains very general. We build on their methodology and general ideas to devise guidelines for built-in trust in data engineering.

3. Trust in Data Engineering

We build on the research presented in the previous section to define a trust model for data engineering technology.

3.1. Desiderata
The following desiderata, derived from our discussion of different trust perspectives, underlie our model of trust:

• Distinguishing trust vs. reliance. The model should incorporate distinctive features that capture trust as opposed to mere reliance. This distinction usually implies the truster's risk awareness with respect to the trustee.

• Modeling both main parties involved in a trust relationship. While classical trust models assume both parties to be humans and thus having similar properties, in our setting, the truster and the trustee are inherently different types of entities. Modeling both in detail opens opportunities for a more detailed discussion of what trust in this kind of relationship entails.

• Modeling influencing factors. Various factors may influence the kind of trust relationship established between a truster and a trustee, making a concise and unique definition of trust difficult (see Section 2). The model should integrate influencing factors to reflect this ambiguity and incorporate the different nuances of trust, thereby offering a more detailed model for a systematic and multi-faceted study.

3.2. Model for trust in data engineering

Given the desiderata described above, we build our novel model for trust in data engineering. An overview of the model is depicted in Figure 1. Note that it is based on the ABI model [26] discussed in Section 2.1.2, similarly to [36]. While our model is more comprehensive than previous work and tailored to data engineering, we do not claim completeness (it can be extended) and leave open the discussion how far it applies beyond data engineering (our area of expertise).

Figure 1: Model for trust in data engineering. A human truster builds a trust relationship with a trustee, i.e., a data engineering application. The latter divides into DETAs and relates to further trust entities (e.g., company). Solid boxes surround necessary characteristics of either the truster or (parts of) the trustee to establish a trust relationship. Dashed boxes group influencing factors.

3.2.1. The truster - a human

In the trust relationship we consider, a human is the truster. Based on the general notion of trust (see Section 2), we require that the human in a trust-in-data-engineering relationship be aware of a vulnerability to some sort of risk when using the data engineering technology. Otherwise, the human will use the application as just another tool and we are looking at a "trust as reliance" situation. A human could for example feel vulnerable and at risk when, while using a website, they are aware that they thereby may indirectly divulge preferences or personal information that can affect what information they will be shown, e.g., which news or which job advertisements are recommended. We argue that humans also feel vulnerability when it is not themselves but other people that are subjected to a risk from the trustee.

The trust relationship a truster may or may not engage in inherently depends on several influencing factors: The human could be in the role of a user of the technology, but also others, such as an examiner, operator, executer, etc. [38]. This will influence how the truster approaches the trust relationship. Humans' decisions to trust are not only influenced by their role, but also by their general disposition to trust, their past experiences in general (e.g., based on their privileges and power) and in particular with (similar) technology, and contextual factors of the interaction. A human's actions are also influenced by the culture(s) the human is part of, shaping expectations, behaviors, and beliefs [39].

Note that given the large variety of human trusters resulting from different influencing factors and degrees of risk awareness, the trust relationship to a trustee can be significantly different from one human to another. For instance, one human's relationship with the trustee may actually be based on reliance because they do not see nor are aware of any risks involved in interacting with the trustee. At the other side of the spectrum, someone else might not engage in a trust relationship at all because they feel too vulnerable and thus decide not to use the system.
3.2.2. The trustee - a data engineering technology

Given the context of our work, the trustee is some data engineering technology. For the truster to feel vulnerable, it has to have some (social) power, element of uncertainty, unpredictability, or unverifiability, thus preventing the assertion that the data engineering technology will not cause any harm.

Typically, such an application is complex and consists of multiple different data engineering technology artifacts (DETAs). These include for instance services, datasets, or algorithms. Note that the truster may or may not be aware of DETAs. Each DETA, as well as the data engineering technology perceived as a whole, is characterized by its functionality, usability, and quality. These have to be sufficient in order for the truster to perceive the technology as reliable. Each DETA could also carry the potential to harm and therefore could also be individually trusted or distrusted by the truster.

Given that technology is shaped by humans and organizations, parties like developers, designers, or companies are part of the trustee in a trust in technology relationship. Note that these are parties with which the truster can also engage individual trust relationships. However, we also include these in the model of trust with respect to data engineering software, because their characteristics can influence this trust relationship as well. Indeed, their ability, benevolence, and integrity have shaped the data engineering technology and can indicate to the truster whether the trustee is trustworthy or not. How parties behind the technology act when developing the product is again influenced by their culture - including organizational and functional culture [39] - but also their past experiences, training, skills, goals, and policies. All of this can affect the trustworthiness of the product, i.e., the data engineering technology, and may be taken into account by the truster when making the decision whether or not to trust the data engineering technology.

3.2.3. Interactions

We now describe the interaction of the two parties involved in establishing a trust relationship.

When a truster judges the trustworthiness of someone, they are actually assessing pieces of evidence they are provided with to evaluate whether it is worth taking the risk to trust the other party and be vulnerable in some aspect. Whether we are in the process of judging humans or now data engineering technology, we think the human truster continues to act the same. Therefore, we adapt the ABI framework by Mayer et al. [26] (Section 2), which states that the trustee is assessed with respect to their ability (i.e., skills and competences) to fulfil their tasks, their benevolence towards the truster, and their integrity of principles they act upon. While these are classically characteristics of persons and organizations, in our setting, the truster usually creates an imaginary image of the trustee based on visuals and communication with the data engineering technology. Indeed, communication to developers or the company behind the application, or access to the codebase, are usually not available to the truster, so their ABI characteristics are transposed to the image of the data engineering technology. Based on the truster's epistemic and practical judgment, the truster then decides whether to trust and then potentially use the technology [36].

Going from the trustee to the truster, the trustee provides evidence towards the truster. In case of data engineering applications, this could be through a modern or old-looking visual interface, whether questions are answered in an FAQ, etc. Opposed to interpersonal trust, trust in data engineering technology involves trust in a complex system of people, groups, and institutions, who often cannot be judged directly but only through the pieces of technology the truster has access to. In addition to that, the truster often does not have the capabilities to understand the inner workings of the technology they are supposed to assess. Following the ABI model [26], information on ability, benevolence, and integrity of the trustee with respect to the potential risk might be evidence that increases the perceived trustworthiness.
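To make the model's structure concrete before turning to its use, the following minimal Python sketch encodes the main entities of Figure 1. All class and field names are our own illustrative assumptions, not a formal part of the model:

```python
from dataclasses import dataclass, field

@dataclass
class Truster:
    """Human truster: necessary risk awareness plus influencing factors."""
    role: str                     # e.g., "user", "examiner", "operator"
    disposition_to_trust: float   # general propensity to trust (0..1)
    aware_of_vulnerability: bool  # necessary condition; otherwise mere reliance
    culture: list = field(default_factory=list)

@dataclass
class DETA:
    """Data engineering technology artifact, e.g., a service, dataset, or algorithm."""
    name: str
    functionality: str
    usability: str
    quality: str

@dataclass
class Trustee:
    """The data engineering technology together with the parties behind it."""
    detas: list       # the artifacts composing the technology
    company: str      # parties behind the technology also shape trust
    developers: list
    evidence: dict = field(default_factory=dict)  # evidence of (A)bility, (B)enevolence, (I)ntegrity

def qualifies_as_trust_relationship(truster: Truster) -> bool:
    # Per the model, a relationship without risk awareness is mere reliance.
    return truster.aware_of_vulnerability
```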
4. Design data processing for trust

Clearly, when developing data engineering technology, the evidence that can be provided is under the trustee's control, who can adapt this evidence to potentially influence the trust relationship. We propose guidelines on how to systematically integrate trust in the development of data engineering pipelines, by enriching the general data engineering process with further steps fostering trust.

4.1. Assumptions

To align with the trust model we defined in Section 3, we make the following assumptions. First, to guarantee that we are fostering a trust relationship conforming to our model, we assume that the truster is aware that technology is used, that it poses a risk to themselves or others, and that its functionality cannot be completely verified. Second, we assume that the truster has an ambivalent attitude towards the data engineering technology and can be led to trust it. Finally, we acknowledge that the actions of developers and companies can also create an illusion of trustworthiness, e.g., through cleverly designed evidence. Here, we assume a benevolent trustee, who intends to provide actual evidence of trustworthiness and does not want to trick the user into trusting a non-trustworthy technology.

4.2. Trust-integrated data engineering

With these underlying assumptions, we enrich the general data engineering process to integrate trust in the technology as summarized in Figure 2. The top of the figure shows the different steps of the data engineering process, whereas the two bottom components "accompany" the whole process from a technical and organizational perspective, respectively.

Figure 2: Framework integrating trust in the development of data engineering pipelines. We show the main components of traditional data engineering development (goal definition, data collection, data wrangling, data enrichment, and data distribution, accompanied by metadata management and governance) in black and our enrichments that integrate trust in blue: goal definition is enriched with "identify trust scenarios", "identify trust breakpoints", and "devise trust strategy"; data collection, data wrangling, and data enrichment each with "collect evidence"; data distribution with "collect evidence" and "distribute evidence"; metadata management with "collect evidence"; and governance with "manage evidence".

In general, before developing actual data engineering software, the goals to reach with the use of data need to be defined. Based on these goals, relevant data need to be identified and collected. As these data may come in various formats from different sources, data wrangling is implemented to transform, integrate, and clean the data to obtain a unified and consistent view of the data relevant to the goal. These data can be further enriched with application-specific data and annotations, before they are distributed to downstream data-consuming applications such as data analysis techniques. To monitor, document, and support the process, metadata are typically gathered and maintained. In addition, a data engineering process is usually subjected to some form of governance.

Following our model of trust in data engineering, the data engineering technology in its role of trustee can support a trust relationship by providing appropriate evidence. This may involve evidence collected at all stages of data engineering. The methods applicable to collect evidence possibly vary from one stage to another, making it important and challenging to select appropriate methods. The collected evidence can be managed within the metadata management component. While there are many ways to possibly foster trust in data engineering applications, as well as trust in the parties behind the applications that can also have an effect on the considered trust relationship, this paper focuses on the technical solutions targeting trust, leaving the study of trust with respect to governance to the future. This paper also does not aim at exhaustively reviewing how to collect and manage evidence (we mentioned some approaches in Section 2), as for different trust scenarios, different solutions apply or may need adaptation. Instead, our work here offers guidelines on how to generally proceed to systematically integrate the consideration of trust in data engineering technology. This naturally integrates into the conceptual planning phase of data engineering processes (i.e., the leftmost step in Figure 2).
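To illustrate how the "collect evidence" enrichments of Figure 2 could be operationalized, consider the following hedged Python sketch, which wraps pipeline stages so that every stage deposits a small evidence record in a shared metadata store. The wrapper API and the recorded fields are our own assumptions, not a prescribed implementation:

```python
from typing import Callable

# Stand-in for the metadata management component of Figure 2.
metadata_store: list = []

def with_evidence(stage_name: str, stage: Callable) -> Callable:
    """Wrap a pipeline stage so that it records evidence about its execution."""
    def wrapped(data):
        result = stage(data)
        metadata_store.append({
            "stage": stage_name,      # where the evidence was collected
            "input_size": len(data),  # minimal, illustrative evidence only
            "output_size": len(result),
        })
        return result
    return wrapped

# Toy stages following the steps of Figure 2 (stage bodies are placeholders).
collect = with_evidence("data collection", lambda records: records)
wrangle = with_evidence("data wrangling", lambda records: [r for r in records if r])
enrich = with_evidence("data enrichment", lambda records: records)

enrich(wrangle(collect(["record 1", "record 2", ""])))
print(metadata_store)  # evidence available to a later trust strategy
```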
4.3. Identify trust scenarios

Our model for trust in data engineering represents a multitude of scenarios in which humans with specific roles, risks, and vulnerabilities are in a trust relationship with a data engineering technology. Specific, and at the same time sufficient, evidence will be needed for individual trusters to perceive a particular application as trustworthy. Therefore, it makes sense to identify the specific trust scenarios anticipated with respect to the application goal, such that the collection of evidence can be tailored to these.

At this stage, we propose to think about scenarios, relationships, or use cases where the targeted application (goal) has some sort of power over the truster, putting the truster at risk. Modalities of power as identified in the field of political philosophy could be a starting point. Furthermore, different kinds of trusters, i.e., trusters exhibiting different influencing factors, should be considered. It is important to identify which different combinations of influencing factors may define trusters in relevant trust scenarios, as well as the specific risks they potentially face, to then devise trust strategies tailored to the different kinds of trusters. For a wide coverage of possible trust scenarios, we recommend a diverse set of examiners with a critical mindset.

4.4. Identify trust breakpoints

After identifying trust scenarios, it is time to pinpoint the critical parts for perceived trustworthiness in the (planned) data engineering process. We call these trust breakpoints. They may comprise methods, algorithms, or other DETAs that could expose a truster to some risk by not meeting specific quality, functionality, or usability guarantees, as their behavior bears some degree of uncertainty, unpredictability, or unverifiability.

It is possible that one trust scenario has multiple trust breakpoints or that different trust scenarios share the same breakpoint. This leads to many-to-many relationships between trust scenarios and trust breakpoints, as illustrated in the sketch below. For each application-relevant combination, we further recommend to determine the requirements each breakpoint in each scenario has to meet in order to minimize or avoid risk.

Since the data engineering software is a technological product, the quality of its trust breakpoints is always shaped by the human capabilities, thoughts, and attitudes of its designers, developers, and surrounding organization. Therefore, there are truster-organization and truster-developer trust relationships to be identified and addressed as well.
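The bookkeeping implied by this requirements analysis can be kept very simple. The following sketch records the many-to-many relationship between trust scenarios and trust breakpoints, together with requirements per combination; all names and requirements are hypothetical examples (loosely foreshadowing the use case of Section 6):

```python
# Hypothetical trust scenarios and trust breakpoints.
scenarios = {
    "employee-linkage": "employee relies on record linkage assistance",
    "customer-report": "customer is affected by the consolidated credit report",
}
breakpoints = {
    "record-linkage": "may match wrong records or present misleading information",
    "record-merge": "may produce an erroneous merged record",
}
# Requirements per application-relevant (scenario, breakpoint) combination;
# the mapping is many-to-many, as one scenario can touch several breakpoints.
requirements = {
    ("employee-linkage", "record-linkage"): ["verifiable match suggestions"],
    ("employee-linkage", "record-merge"): ["reviewable merge result"],
    ("customer-report", "record-merge"): ["correct consolidated record"],
}
```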
4.5. Devise a trust strategy

In a sense, identifying trust scenarios and trust breakpoints can be seen as a requirements analysis on how to cover trust. This analysis forms the foundation to devise a trust strategy, i.e., a plan to meet the requirements. Referring back to the distinction of reliance and trust, it will not be enough to provide evidence that convinces the truster that the application is pragmatically the best option to use. Instead, following our trust model, the trust strategy should be designed to provide sufficient evidence on ability, benevolence, and integrity to increase perceived trustworthiness.

The first idea that comes to mind is to transparently provide more information about the trust breakpoints, which the user can use to judge the trustworthiness of the application. This will mostly respond to the ability of the trust breakpoint's DETAs, but could also include evidence for the integrity and benevolence of the company and developers behind the application. Several methods have been developed to provide metadata that can serve as evidence, including plain information about datasets [40], data provenance [16], or machine learning explanations [35]. However, the problem of choosing a suited strategy for requirements given by trust scenarios and breakpoints remains. To systematically devise a strategy and identify pertinent methods, we propose to answer the following six questions in a structured way:

(Q1) What should the trust strategy enable the truster to do? This refers to additional "-ility" requirements of the system that support the truster in their trust assessment and ultimately decision. Answers could include verifiability, reproducibility, traceability [41], reviewability [42], accountability [43], auditability [44], or trialability [45]. Different answers will require different pieces of evidence produced by different methods. For example, verifiability of an output may require an explanation on how the output was generated, whereas the reproducibility of an algorithm asks for information about the algorithm and its parameters.
mented specifically for a trust scenario. Therefore, the (Q4) What type of information is needed? To provide evaluation of the strategy should reuse this scenario in the truster with the necessary evidence, different types order to validate the strategy with respect to the scenario. of information can be used. Examples include factual This means that participants in the user study should information such as fairness scores [50], explanations of have the same role towards the application as the truster outcomes [35], or less technical information, e.g., limita- in the scenario. Furthermore, the participants should tions or legal considerations [40]. satisfy the modeled requirements on trusters, i.e., they (Q5) What presentation is appropriate for the truster? should be aware that the application is uncertain and Depending on the truster’s role, level of expertise, and its use is related to a specific risk, as defined in the sce- other characteristics (influencing factors), the evidence nario. To ensure this, proper participant selection and has to be prepared and presented accordingly. Therefore, gauging questions in the questionnaire of the user study an appropriate level of abstraction and appearance have are possible methods one can employ. Additionally, we to be chosen, that provides the evidence without over- recommend properly introducing the participants to the whelming the truster. It could for instance be presented scenario, where they should be made aware of their role like in Datasheets for Datasets [40], where the informa- and the risk the application can pose. tion is presented as structured text and kept at a very Before deciding on the study setup or devising the technical level, or the evidence can be presented as in questionnaire, the question on what hypotheses to verify Nutritional Labels for Rankings [50], where the informa- needs to be answered. One example of such a hypothesis tion is (visually) supported using icons, diagrams, and is: “The devised trust strategy increases the perceived trust- information boxes. worthiness of the data engineering technology compared (Q6) What other requirements have to be fulfilled? Since to the same technology without trust enrichment.”. Clearly, the trust strategy has to fit the overall development plan the hypothesis should explicitly focus on an aspect of the and requirements, other (technical) requirements may trust model, for which the impact of the trust strategy also apply. These could include storage constraints [51], is then evaluated. The impact itself also encompasses privacy considerations [52], access control [53] or execu- different possible aspects, e.g., perceived trustworthiness tion speed [54]. (wrt image in the model), actual use, etc. This should After these questions have been answered for all pre- be clarified as part of the hypothesis. Finally, the scope viously determined relevant trust breakpoint-scenario of the evaluation needs to be defined, clarifying which combinations, the developers have enough information aspects of the trustee are covered (e.g., the whole data to identify or develop appropriate methods. engineering technology or just selected DETAs). 5. Trust strategy evaluation 5.2. Methods for trust evaluation Once the “what” has been defined, one can address the After the trust strategy has been defined and imple- question on “how” to conduct the study. 
Here, study de- mented, including the collection of evidence, the question signers have to decide which methods to use to evaluate remains whether the strategy performs as expected. That the target aspects. The notion of trust is inherently diffi- is, whether the collected evidence helps trusters to estab- cult to quantify, which explains why a set of measurable lish a trust relationship with the trustee, in our setting a proxies is usually used that, ideally, highly correlate with data engineering technology. In this section, we discuss the aspects of interest. We review methods that have how the notion of trust we defined in this paper can pos- 11 been used to evaluate trust and which are amenable to development in Section 6.1 and report on its evaluation our data engineering setting. in Section 6.2. Experiments. For interpersonal trust, researchers have conducted various studies in which the participants could 6.1. Record linkage in a credit scoring choose between different options [55, 56]. Each of these was implicitly related to trust or distrust based on a risk application and reward system. By tracking participants’ actions, re- Credit scores for individuals as provided by companies searches could conclude whether the participants trusted like Equifax or TransUnion are widely used to evaluate each other or not. This technique can be adjusted for the “creditworthiness” of individuals. This can have a evaluating data engineering technology by creating eval- significant impact on human’s lives, e.g., depending on uation scenarios in which the participants can actively their credit score, they may or may not be granted a loan, choose between different options that correlate with trust may have to pay higher or lower interest rates, may be or distrust. Recording the decisions of participants can preferred or not in the competitive housing market to be used as a proxy to measure actual use. sign a lease, etc. Therefore, it is crucial for all parties (the Questionnaires. In designing questionnaires to evaluate human customers, banks, landlords) that a person’s credit trust in data engineering technology, we can adapt and history or report, on which the scores are based, is correct extend questionnaires that have been devised to evaluate and complete. A report itself comprises various customer trust in other settings. Examples of questions used to activities that are shared by different entities (banks, in- measure trust appear in the trust section of the General surances, credit card companies, mobile phone providers, Social Survey [57] (an annually conducted study in the etc.) cooperating with the credit scoring company that US). Another option is to derive trust questions analogous are potentially related to the customers’ creditworthiness. to the questions on usability and understandability of Examples include opening of a bank account, successfully the technology acceptance model (TAM) [58]. These paying back a loan, etc. techniques allow to examine the thoughts, attitudes, etc. To ensure the data of persons’ credit reports are accu- of the participants including perceived trustworthiness, rate, newly shared customer activities need to be inte- intention to use, and perceived risk. grated in the consolidated master database of the credit Structured interviews and unstructured questionnaires. storing company. This is performed by a dedicated data Information about perception, attitudes, etc. 
6. Application of our methods to a use case

After defining our model of trust with respect to data engineering technology as well as guidelines on how to devise and evaluate a corresponding trust strategy, we put our approach to the test by applying it to a real-world use case. We describe the use case and its trust strategy development in Section 6.1 and report on its evaluation in Section 6.2.

6.1. Record linkage in a credit scoring application

Credit scores for individuals as provided by companies like Equifax or TransUnion are widely used to evaluate the "creditworthiness" of individuals. This can have a significant impact on humans' lives, e.g., depending on their credit score, they may or may not be granted a loan, may have to pay higher or lower interest rates, may be preferred or not in the competitive housing market to sign a lease, etc. Therefore, it is crucial for all parties (the human customers, banks, landlords) that a person's credit history or report, on which the scores are based, is correct and complete. A report itself comprises various customer activities that are shared by different entities (banks, insurances, credit card companies, mobile phone providers, etc.) cooperating with the credit scoring company and that are potentially related to the customers' creditworthiness. Examples include the opening of a bank account, successfully paying back a loan, etc.

To ensure the data of persons' credit reports are accurate, newly shared customer activities need to be integrated in the consolidated master database of the credit scoring company. This is performed by dedicated data engineering software, which we assume to be similar to the pipeline for a similar goal described in [60]. Following the steps of the general data engineering process outlined in Figure 2, the goal definition is to correctly update the master database, given the data of a newly reported activity record. In this context, the data collection step includes accessing data of the master database (we can assume an SQL query interface) and newly reported records, e.g., obtained via an API. The subsequent data processing that will result in the transformed (updated) master database is all part of data wrangling. Sub-tasks of data wrangling in our use case include the standardization of addresses to all be in the same format, and the matching of a record from the master database corresponding to the same person as the new entry (record linkage), possibly followed by human intervention when the match is uncertain (e.g., when no global unique identifier like a social security number is available and not all fields match). If a match is identified, the record on file and the new record are merged to a new record (data fusion). The merged record is then written back to the master database, which can then be queried by subsequent applications, such as an application deriving a credit score.
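To make these wrangling sub-tasks tangible, here is a minimal, purely illustrative Python sketch of address standardization, record linkage with a human-in-the-loop zone for uncertain matches, and data fusion. All field names, thresholds, and scoring rules are our own simplifying assumptions, not the actual pipeline of [60]:

```python
from typing import Optional

def standardize_address(addr: str) -> str:
    """Normalize an address to one common format (toy version)."""
    return " ".join(addr.lower().replace(",", " ").split())

def match_score(on_file: dict, new: dict) -> float:
    """Fraction of key fields that agree between the two records."""
    fields = ["firstname", "lastname", "date_of_birth", "address"]
    return sum(on_file.get(f) == new.get(f) for f in fields) / len(fields)

def link_and_merge(on_file: dict, new: dict) -> Optional[dict]:
    score = match_score(on_file, new)
    if score >= 0.9:       # confident match: merge the records (data fusion)
        return {**on_file, **new}
    if score >= 0.5:       # uncertain match: defer to an employee
        return None        # human-in-the-loop decision required
    return on_file         # no match: master record stays unchanged

on_file = {"firstname": "jane", "lastname": "doe", "date_of_birth": "1980-01-01",
           "address": standardize_address("1 Main St, Springfield")}
new = {"firstname": "jane", "lastname": "doe", "date_of_birth": "1980-01-01",
       "address": standardize_address("1 main st Springfield")}
print(link_and_merge(on_file, new))  # merged record, since all key fields agree
```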
6.1.1. Trust scenarios

In the use case introduced above, the first step towards devising a trust strategy is to define trust scenarios. To this end, we first identify various parties (possible trusters) that have some kind of relationship with the data engineering application that can potentially be a trust relationship. These include, for instance, the customers, whose personal data are stored and evaluated by the credit scoring company, and the employees of the credit scoring company that should trust the technology to support them in their task of matching and merging records.

Let us now analyze the potential trust relationship between a customer in the role of truster and the data engineering technology (trustee) in more detail. Clearly, the customer relies on the credit reporting technology (e.g., accessible through a web interface) to be able to provide the described service (maintaining the credit report), e.g., to secure a loan. While the customer may be aware of the impact a (wrong) credit history can have on the loan application, the customer usually simply expects the service to work as intended, considering it as an instrument to achieve a goal. As we saw in Section 2.1.3, this rather qualifies as trust as reliance. Also, customers may not be aware that the underlying technology cannot be completely verified and can exhibit quality issues.

This picture changes when we turn our attention to the employees involved in the "human-in-the-loop" data engineering technology as potential trusters in a trust relationship. Clearly, being part of the process, they are well aware that the data engineering technology cannot be completely verified and can cause quality issues. They are also aware of the risk the use of the technology poses, not necessarily to themselves but to their friends, their relatives, and society in general. For their work, however, they rely on the technology and, depending on company policy, the use of the technology bearing some uncertainty with respect to quality may also put these employees at risk, e.g., if, in a performance review, it turns out that these employees did match and merge a significant number of credit reports that have led to claims for correction or to too generous credit scores for non-creditworthy customers. Overall, we see that all criteria are met by employees to be a truster in a trust relationship as defined by our trust model.

On the trustee side, the credit reporting technology comprises several DETAs, e.g., the different steps of the data engineering pipeline we described above. Given the common uses of such technology, it undoubtedly has some social power. As mentioned before, it also exhibits some uncertainty and unverifiability on how the credit reports are generated. Influencing factors relating to the DETAs are mainly their functionality and quality. Besides the credit reporting technology, developers and designers, but also the reporting entities cooperating with the credit scoring company, also potentially affect the trust employees put into the trustee.

Given the discussion above, we focus on devising a trust strategy for the trust scenario defined by the trust relationship between the employees and the data engineering technology they use to consolidate credit reports.

6.1.2. Trust breakpoints

For the specific trust scenario identified above, we consider several trust breakpoints, i.e., DETAs that may affect employees' trust relationship with the technology. A first review reveals for instance that during data collection, trust may be jeopardized by the reporting entities that may transmit erroneous data. During data wrangling, the address standardization may sometimes be inaccurate, depending on which (external) address check service is used. Next, the record linkage may match the wrong records or present the employees with what can be perceived as misleading information to make their decision. Finally, the merge of records could yield an erroneous record. We consider employees unlikely to question the data collection or address standardization DETAs directly (they more likely may not trust external entities serving as data providers, which are other trust relationships). We assume their trust relationship is mostly affected by the internal workings of the assistance the system gives them during record linkage or merge. To demonstrate the development of a trust strategy, we focus on the first of these two breakpoints.

6.1.3. Trust strategy

In order to devise a trust strategy for the trust scenario and breakpoint identified above, we answer the questions proposed in Section 4.5. Essentially, the trust strategy should enable employees of a credit scoring company who consolidate personal data to judge the trustworthiness of the technology, which, in this scenario, we assume relates mostly to verifiability of its functionality and quality (Q1). Given the trust breakpoint under consideration, we need evidence for the record linkage component (Q2). As the employees make point-wise match decisions, working with the technology for each individual case, the adequate time frame for evidence is "the now", i.e., real-time (Q3). Considering what type of information is needed as evidence, we argue that employees are probably interested in explanations on how the program came to the conclusion that two records could match, while design decisions on system level and implementations are not pertinent (Q4). In terms of presentation, employees benefit from simple and easy-to-understand explanations that do not use technical terms from underlying algorithms, as well as visual cues that support the understandability of explanations (Q5). We consider no additional requirements (Q6).

With the answers to the questions given above, we can determine suited methods and algorithms to implement the trust strategy, where we essentially opt to provide employees with an explanation of matching candidates that serves as evidence of the trustee's ABI, so that the employees can potentially gain trust in the system's behavior.
6.2. Evaluating the trust strategy

The goal of the trust strategy in this use case is to foster trust of employees in the data engineering technology they use, by means of explanations. To evaluate if the trust strategy implementation achieves this goal, we conduct a user study, following our discussion in Section 5. This section summarizes the study design, presents results, and discusses these.

6.2.1. Study design

The participants we aim to recruit should take the position of employees of a fictive credit scoring company and review the ambivalent decision of a record linkage DETA. Given the ongoing pandemic, we design an online study. From the different methods for trust evaluation (see Section 5.2), we focus mostly on questionnaires to capture the participants' stance on the data engineering technology. The study includes three main sections, which we summarize next. Full details are available on our repeatability website (https://www.ipvs.uni-stuttgart.de/departments/de/research/projects/fat_dss/).

In its first section, the study provides an introduction to the setting of the study and the topic of record linkage in the context of credit report generation. Thereby, we enable the participants to make informed decisions in the next section focusing on record linkage, and raise their awareness for the underlying potential risk. We further add questions based on 7-grade Likert scales to assess the participants' ambivalent attitude towards the technology they evaluate and their risk awareness with respect to the scenario. Answers to these questions allow us to verify the assumptions stated in Section 4.1. We also include test questions to determine if participants have understood the problem of record linkage.

Next, participants are presented with potential matches, i.e., pairs of records the system suggests to be matches, for which participants, in their role as employees, have to decide if they agree with the system or not. The study comprises 60 matches that each participant reviews. We ensure that these matches cover diverse real-life match situations of varying difficulty in a balanced way. The participants were shown the matches in a random order.

To evaluate the effect of the trust strategy, participants are split into two groups: one gets to see explanations alongside matches, the other group does not. Different options for record linkage explanation have been proposed (e.g., [61, 62, 63]). We rely on the visualization of feature importance, using different color highlights for attributes that are important for making a match decision and attributes that are important towards a non-match. We further provide explanations in the form of a human-readable model approximation, listing positive semantic indicators (e.g., the important fields firstname, lastname, and date of birth are equal) and negative semantic indicators (e.g., contradictory gender).

In the third section of the study, each participant answers an exit questionnaire that covers several aspects, including usability, by adapting questions from the TAM [58]. We formulate additional questions to assess perceived risk and trustworthiness (see Figure 4), following the same rationale as the TAM questions. The answers to these questions again follow a 7-grade Likert scale, ranging from the most positive answer "strongly agree" (1) to the most negative answer "strongly disagree" (7). The study section concludes with a free text field for additional remarks.

During the second section of the study, we capture participants' decision time per match as a quantitative metric.
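For illustration, the following sketch renders indicator-style explanations of the kind described above; the rule-based attribute importance is a simplified stand-in for the actual explanation method, and all field names are assumptions:

```python
IMPORTANT_FIELDS = ["firstname", "lastname", "date_of_birth"]

def explain_match(on_file: dict, new: dict) -> list:
    """Produce human-readable positive/negative semantic indicators for a match."""
    indicators = []
    for f in IMPORTANT_FIELDS:
        if on_file.get(f) == new.get(f):
            indicators.append(f"+ important field '{f}' is equal")  # supports a match
        else:
            indicators.append(f"- important field '{f}' differs")   # speaks against a match
    if on_file.get("gender") != new.get("gender"):
        indicators.append("- contradictory gender")                 # speaks against a match
    return indicators

for indicator in explain_match(
    {"firstname": "jane", "lastname": "doe", "date_of_birth": "1980-01-01", "gender": "f"},
    {"firstname": "jane", "lastname": "doe", "date_of_birth": "1980-01-01", "gender": "m"},
):
    print(indicator)
```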
6.2.2. Results

At the time of submission, a total of 19 participants with a computer science background took part in our user study (10 without / 9 with explanations). We opted for participants with a computer science background to ensure all participants have a general understanding of data engineering technology, to better grasp the task we ask them to perform. Based on responses to the first section of the study, we conclude that the participants are generally optimistic that technology can be helpful rather than harmful (mean of 2.7), while they are aware that the technology may put others at risk (mean of 2.7). Thus, they are aware and careful because of associated risks (mean of 2.7).

To determine if the explanations implemented following the devised trust strategy have any effect, we analyze if there is some statistically significant difference between the group of participants without explanations and the group with explanations. Considering reaction time, accuracy of participant match decisions, and the Likert scale questions relating to trust, the applicable statistical tests (t-tests or Wilcoxon-Mann-Whitney tests) do not reveal a difference between the groups of participants with and without explanations. We thus cannot conclude that explanations have a significant effect on the interaction between employees and the record linkage DETA, in particular, on trust. While the study may benefit from a larger number of participants, the current results show that statements of the sort "explanations are a means to improve trust" should be used cautiously, as it remains an open question in our use case (and in others that have not been evaluated) whether this holds. Clearly, there is a need for a more systematic consideration of trust, how to possibly integrate it in the design of a data engineering technology (and others), and how to evaluate it. The contributions of this paper are a first step in that direction.

Questions included in the first section and the third section of the study can further be used to compare the "state of mind" of participants before and after they have interacted and gained some experience with the record linkage system. Here, we determine that, without explanations, participants show increased trust in potentially risky technology after the study, compared to before the study (p=0.039). This could not be observed in the presence of explanations. On the contrary, we observe a statistically significant decrease in technological optimism and trust for participants that were shown explanations (p=0.009). That is, not only can we not confirm that explanations are helpful to foster trust, but we have an indication that they may actually harm it. A reason may be that explanations give employees further information they can question or that may raise suspicion, outweighing possible benefits of explanations.
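The within-group pre/post comparison reported above can, in principle, be reproduced with a paired test, as in the following hedged sketch using a Wilcoxon signed-rank test via SciPy; the paired ratings are made-up placeholders, not the study's data:

```python
from scipy.stats import wilcoxon

# Placeholder trust ratings of the same participants before and after the study.
before = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]
after = [3, 4, 3, 3, 4, 3, 3, 4, 3, 4]

statistic, p_value = wilcoxon(before, after)  # paired, non-parametric test
print(f"within-group change: W = {statistic}, p = {p_value:.3f}")
```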
In the third section of the study, each participant answers an exit questionnaire that covers several aspects, including usability, by adapting questions from the TAM [58]. We formulate additional questions to assess perceived risk and trustworthiness (see Figure 4), following the same rationale as the TAM questions. The answers to these questions again follow a 7-grade Likert scale, ranging from the most positive answer "strongly agree" (1) to the most negative answer "strongly disagree" (7). The study section concludes with a free-text field for additional remarks.

During the second section of the study, we additionally capture the participants' decision time per match as a quantitative metric.

6.2.2. Results

At the time of submission, a total of 19 participants with a computer science background took part in our user study (10 without / 9 with explanations). We opted for participants with this background to ensure that all participants have a general understanding of data engineering technology and can better grasp the task we ask them to perform. Based on the responses to the first section of the study, we conclude that the participants are generally optimistic that technology can be helpful rather than harmful (mean of 2.7), while they are aware that the technology may put others at risk (mean of 2.7). Thus, they are aware of and careful about the associated risks (mean of 2.7).

To determine if the explanations implemented following the devised trust strategy have any effect, we analyze if there is some statistically significant difference between the group of participants without explanations and the group with explanations. Considering reaction time, the accuracy of participant match decisions, and the Likert scale questions relating to trust, the applicable statistical tests (t-tests or Wilcoxon-Mann-Whitney tests) do not reveal a difference between the groups of participants with and without explanations. We thus cannot conclude that explanations have a significant effect on the interaction between employees and the record linkage DETA, in particular, on trust.
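As a minimal sketch of this group comparison, on hypothetical Likert ratings and with an illustrative normality check to choose between the two tests (not our exact analysis code), the procedure could look as follows:

import numpy as np
from scipy import stats

# Hypothetical exit-questionnaire ratings (1 = strongly agree, 7 = strongly disagree).
without_expl = np.array([4, 3, 5, 4, 2, 4, 3, 5, 4, 4])  # 10 participants
with_expl = np.array([5, 4, 4, 6, 5, 4, 5, 4, 6])        # 9 participants

# Use a t-test if both samples are plausibly normal, otherwise the
# non-parametric Wilcoxon-Mann-Whitney test.
_, p_a = stats.shapiro(without_expl)
_, p_b = stats.shapiro(with_expl)
if p_a > 0.05 and p_b > 0.05:
    _, p = stats.ttest_ind(without_expl, with_expl, equal_var=False)
else:
    _, p = stats.mannwhitneyu(without_expl, with_expl, alternative="two-sided")

# p >= 0.05 means we cannot reject the null hypothesis of no group difference.
print(f"p = {p:.3f}")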
While the study may benefit from a larger number of participants, the current results show that statements of the sort "explanations are a means to improve trust" should be used cautiously, as it remains an open question in our use case (and in others that have not been evaluated) whether this holds. Clearly, there is a need for a more systematic consideration of trust, of how to possibly integrate it in the design of a data engineering technology (and others), and of how to evaluate it. The contributions of this paper are a first step in that direction.

Questions included in the first and the third section of the study can further be used to compare the "state of mind" of participants before and after they have interacted and gained some experience with the record linkage system. Here, we determine that, without explanations, participants show increased trust in potentially risky technology after the study compared to before the study (p=0.039). This could not be observed in the presence of explanations. On the contrary, we observe a statistically significant decrease in technological optimism and trust for participants who were shown explanations (p=0.009). That is, not only can we not confirm that explanations help to foster trust, but we have an indication that they may actually harm it. A reason may be that explanations give employees further information that they can question or that may raise suspicion, outweighing the possible benefits of explanations.
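A minimal sketch of such a paired before/after comparison, again on hypothetical ratings, could use the Wilcoxon signed-rank test, one plausible choice for ordinal paired data (not necessarily the exact test applied here):

import numpy as np
from scipy import stats

# Hypothetical per-participant attitude ratings before and after the study.
before = np.array([4, 5, 3, 4, 5, 4, 2, 5, 4, 4])
after = np.array([3, 4, 2, 3, 4, 3, 3, 4, 3, 2])

# Paired, non-parametric test on the per-participant rating differences.
stat, p = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-rank: W = {stat}, p = {p:.3f}")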
While not showing a statistically significant difference between the two studied groups, we still provide some further discussion of the answers to the questions relating to the judgment, perceived trust, and the eventual intention to use (Q4.14 – Q4.19, summarized in Figure 4). The answers to these questions, ranging from 1 (strongly agree) to 7 (strongly disagree), are summarized in Figure 3.

Figure 3: Excerpt of the answers on perceived risk and trust by participants without and with explanations (boxplots of the 7-grade Likert ratings for questions Q4.15 – Q4.19, per study group without / with explanation).

Figure 4: Study questions relating to the judgment, perceived trust, and eventual intention to use.
Q4.15: I would feel safe if people's data were processed by this system.
Q4.16: I would feel at risk if the system was used to decide about me and my data.
Q4.17: I believe in the benefits of the new system.
Q4.18: Assuming I have the power to make decisions in a credit scoring company, I would predict that I would decide to use the system.
Q4.19: I trust the system.

We see that while the majority of participants in either of the two groups do not feel safe (Q4.15) but rather at risk (Q4.16), they do believe in the benefits of the system (Q4.17). Also, the majority of participants, irrespective of whether they have been shown explanations or not, predict that they would decide to use the system (Q4.18). However, when directly asked about trust, participants with explanations tend to give a lower rating to perceived trust (Q4.19). Indeed, while all but one participant in this group gave a neutral or negative rating (the median as well as the most positive value are 4), more positive ratings are given by almost half of the participants who have not seen explanations.

Finally, we report on the two main comments participants provided as part of the final unstructured question. First, participants inquired about further details concerning the step following record linkage, i.e., the merging of matched records. This indicates that the second trust breakpoint we identified in our use case is indeed relevant. Second, participants indicated that additional information in the records, such as bank account numbers, would be helpful for their task. This can be seen as relating to the system's functionality and quality.

7. Conclusion and Outlook

This paper started a nuanced discussion on trust in data engineering. Grounded in established notions of trust from philosophy and psychology, we defined a trust model and proposed guidelines on how to consider such trust when developing data engineering pipelines by devising a trust strategy. Such a strategy ideally fosters trust in data engineering applications, which needs to be validated. To this end, we suggested a general evaluation procedure. We applied our methods to a real-world use case, demonstrating the applicability of the model, guidelines, and evaluation procedure. However, our evaluation failed to assert that the explanations we provided as evidence fostered trust in our use case, strengthening our initial motivation that statements like "explanations improve trust" may be unfounded. This highlights the need for further investigation into systematically incorporating and evaluating trust in data engineering.

References

[1] J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks, https://propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing, 2016.
[2] L. Sweeney, Discrimination in online ad delivery, Queue 11 (2013).
[3] S. Lowry, G. Macpherson, A blot on the profession, British Medical Journal (Clinical Research Ed.) 296 (1988) 657–658.
[4] X. L. Dong, E. Gabrilovich, K. Murphy, V. Dang, W. Horn, C. Lugaresi, S. Sun, W. Zhang, Knowledge-based trust: Estimating the trustworthiness of web sources, Proceedings of the VLDB Endowment 8 (2015) 938–949.
[5] A. Fariha, A. Tiwari, A. Radhakrishna, S. Gulwani, A. Meliou, Conformance constraint discovery: Measuring trust in data-driven systems, in: Proceedings of the 2021 International Conference on Management of Data, 2021, pp. 499–512.
[6] X. Zhang, B. Qian, S. Cao, Y. Li, H. Chen, Y. Zheng, I. Davidson, INPREM: An interpretable and trustworthy predictive model for healthcare, in: ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020, pp. 450–460.
[7] P. Vassiliadis, A. Simitsis, S. Skiadopoulos, Conceptual modeling for ETL processes, in: Proceedings of the ACM International Workshop on Data Warehousing and OLAP, 2002, pp. 14–21.
[8] I. F. Ilyas, X. Chu, Data Cleaning, Morgan & Claypool, 2019.
[9] A. Doan, A. Halevy, Z. Ives, Principles of Data Integration, 2012.
[10] C. Dai, D. Lin, E. Bertino, M. Kantarcioglu, An approach to evaluate data trustworthiness based on data provenance, 2008, pp. 82–98.
[11] C. Dai, H. Lim, E. Bertino, Y. Moon, Assessing the trustworthiness of location data based on provenance, in: 17th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, 2009, pp. 276–285.
[12] L. D. Santis, M. Scannapieco, T. Catarci, Trusting data quality in cooperative information systems, in: On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE, 2003, pp. 354–369.
[13] H. Felzmann, E. F. Villaronga, C. Lutz, A. Tamò-Larrieux, Transparency you can trust: Transparency requirements for artificial intelligence between legal norms and contextual concerns, Big Data & Society 6 (2019) 1–14.
[14] M. Janic, J. P. Wijbenga, T. Veugen, Transparency enhancing tools (TETs): An overview, in: Third Workshop on Socio-Technical Aspects in Security and Trust, 2013, pp. 18–25.
[15] K. Siau, W. Wang, Building trust in artificial intelligence, machine learning, and robotics, Cutter Business Technology Journal 31 (2018) 47–53.
[16] M. Herschel, R. Diestelkämper, H. Ben Lahmar, A survey on provenance: What for? What form? What from?, The VLDB Journal 26 (2017) 881–906.
[17] B. Glavic, Big data provenance: Challenges and implications for benchmarking, in: Specifying Big Data Benchmarks – First Workshop and Second Workshop, WBDB, Revised Selected Papers, 2012, pp. 72–80.
[18] L. Kot, Tracking personal data use: Provenance and trust, in: Seventh Biennial Conference on Innovative Data Systems Research, 2015, p. 1.
[19] Y. L. Simmhan, B. Plale, D. Gannon, A survey of data provenance in e-science, SIGMOD Record 34 (2005) 31–36.
[20] J. P. Sullins, Trust in robots, in: The Routledge Handbook of Trust and Philosophy, Routledge, 2020, pp. 313–325.
[21] Plato, The Republic, 1994. URL: http://classics.mit.edu/Plato/republic.html.
[22] F. M. Alonso, Reasons for reliance, Ethics 126 (2016) 311–338.
[23] M. N. Smith, Reliance, Noûs 44 (2010) 135–157.
[24] C. McLeod, Trust, in: E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy, Fall 2020 ed., Metaphysics Research Lab, Stanford University, 2020.
[25] J. Simon, The Routledge Handbook of Trust and Philosophy, Routledge, 2020.
[26] R. C. Mayer, J. H. Davis, F. D. Schoorman, An integrative model of organizational trust, Academy of Management Review 20 (1995) 709–734.
[27] F. Grodzinsky, K. Miller, M. J. Wolf, Trust in artificial agents, in: The Routledge Handbook of Trust and Philosophy, Routledge, 2020, pp. 298–312.
[28] M. Coeckelbergh, Can we trust robots?, Ethics and Information Technology 14 (2011) 53–60.
[29] A. M. Evans, J. I. Krueger, The psychology (and economics) of trust, Social and Personality Psychology Compass 3 (2009) 1003–1017.
[30] J. A. Simpson, Foundations of interpersonal trust, in: Social Psychology: Handbook of Basic Principles, 2007, pp. 587–607.
[31] D. Dunning, D. Fetchenhauer, T. Schlösser, Why people trust: Solved puzzles and open mysteries, Current Directions in Psychological Science 28 (2019) 366–371.
[32] M. Deutsch, Trust and suspicion: Theoretical notes, in: The Resolution of Conflict, 1973, pp. 143–176.
[33] H. Jiang, B. Kim, M. Guan, M. Gupta, To trust or not to trust a classifier, in: Advances in Neural Information Processing Systems, volume 31, 2018, pp. 1–12.
[34] M. Reyes, R. Meier, S. Pereira, C. A. Silva, F.-M. Dahlweid, H. v. Tengg-Kobligk, R. M. Summers, R. Wiest, On the interpretability of artificial intelligence in radiology: Challenges and opportunities, Radiology: Artificial Intelligence 2 (2020) 1–12.
[35] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
[36] S. M. Meeßen, M. T. Thielsch, G. Hertel, Trust in management information systems (MIS), Zeitschrift für Arbeits- und Organisationspsychologie A&O 64 (2020) 6–16.
[37] L. Thornton, B. Knowles, G. Blair, Fifty shades of grey, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 64–76.
[38] R. Tomsett, D. Braines, D. Harborne, A. Preece, S. Chakraborty, Interpretable to whom? A role-based model for analyzing interpretable machine learning systems, in: Workshop on Human Interpretability in Machine Learning, 2018, pp. 8–14.
[39] C. B. Gibson, J. A. Manuel, Building trust: Effective multicultural communication processes in virtual teams, in: Virtual Teams That Work: Creating Conditions for Virtual Team Effectiveness, Jossey-Bass, 2003, pp. 59–86.
[40] T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. Daumé III, K. Crawford, Datasheets for datasets, in: Proceedings of the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2018, pp. 1–17.
[41] J. A. Kroll, Outlining traceability, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 758–771.
[42] J. Cobbe, M. S. A. Lee, J. Singh, Reviewable automated decision-making: A framework for accountable algorithmic systems, in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021, pp. 598–609.
[43] M. Wieringa, What to account for when accounting for algorithms: A systematic literature review on algorithmic accountability, in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 1–18.
[44] R. Cloete, C. Norval, J. Singh, A call for auditable virtual, augmented and mixed reality, in: 26th ACM Symposium on Virtual Reality Software and Technology, 2020, pp. 1–6.
[45] R. Agarwal, J. Prasad, The role of innovation characteristics and perceived voluntariness in the acceptance of information technologies, Decision Sciences 28 (1997) 557–582.
[46] C. Li, Z. Miao, Q. Zeng, B. Glavic, S. Roy, Putting things into context: Rich explanations for query answers using join graphs, in: Proceedings of the 2021 ACM SIGMOD International Conference on Management of Data, 2021, pp. 1051–1063.
[47] M. Interlandi, K. Shah, S. D. Tetali, M. A. Gulzar, S. Yoo, M. Kim, T. D. Millstein, T. Condie, Titian: Data provenance support in Spark, Proceedings of the VLDB Endowment 9 (2015) 216–227.
[48] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, Y. Zhao, Scientific workflow management and the Kepler system, Concurrency and Computation: Practice and Experience 18 (2006) 1039–1065.
[49] S. Oppold, M. Herschel, Accountable data analytics start with accountable data: The liquid metadata model, in: ER Forum/Posters/Demos, 2020, pp. 59–72.
[50] K. Yang, J. Stoyanovich, A. Asudeh, B. Howe, H. Jagadish, G. Miklau, A nutritional label for rankings, in: Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1773–1776.
[51] A. P. Chapman, H. V. Jagadish, P. Ramanan, Efficient provenance storage, in: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008, pp. 993–1006.
[52] S. B. Davidson, S. Khanna, S. Roy, J. Stoyanovich, V. Tannen, Y. Chen, On provenance and privacy, in: Proceedings of the 14th International Conference on Database Theory, 2011, pp. 3–10.
[53] A. Chebotko, S. Lu, S. Chang, F. Fotouhi, P. Yang, Secure abstraction views for scientific workflow provenance querying, IEEE Transactions on Services Computing 3 (2010) 322–337.
[54] N. Bidoit, M. Herschel, A. Tzompanaki, Efficient computation of polynomial explanations of why-not questions, in: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015, pp. 713–722.
[55] R. L. Swinth, The establishment of the trust relationship, Journal of Conflict Resolution 11 (1967) 335–344.
[56] E. L. Glaeser, D. I. Laibson, J. A. Scheinkman, C. L. Soutter, Measuring trust, Quarterly Journal of Economics 115 (2000) 811–846.
[57] T. W. Smith, M. Davern, J. Freese, S. L. Morgan, General social surveys, https://gss.norc.org/, 1972–2018.
[58] F. D. Davis, A technology acceptance model for empirically testing new end-user information systems: Theory and results, Ph.D. thesis, Massachusetts Institute of Technology, 1986.
[59] P. Wintersberger, T. von Sawitzky, A.-K. Frison, A. Riener, Traffic augmentation as a means to increase trust in automated driving systems, in: Proceedings of the 12th Biannual Conference on Italian SIGCHI Chapter, 2017, pp. 1–7.
[60] M. Weis, F. Naumann, U. Jehle, J. Lufter, H. Schuster, Industry-scale duplicate detection, Proceedings of the VLDB Endowment 1 (2008) 1253–1264.
[61] S. Thirumuruganathan, M. Ouzzani, N. Tang, Explaining entity resolution predictions: Where are we and what needs to be done?, in: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2019, pp. 1–6.
[62] A. Ebaid, S. Thirumuruganathan, W. G. Aref, A. K. Elmagarmid, M. Ouzzani, EXPLAINER: Entity resolution explanations, in: 35th IEEE International Conference on Data Engineering, 2019, pp. 2000–2003.
[63] S. Gurajada, L. Popa, K. Qian, P. Sen, Learning-based methods with human-in-the-loop for entity resolution, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 2969–2970.