<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Sensemaking in Multi-artefact Information Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tianwa Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information Technology and Electrical Engineering, The University of Queensland</institution>
          ,
          <addr-line>Brisbane</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Confronted with information silos and a growing volume of data in an increasingly interconnected, data-driven world, knowledge workers, including technical and business users, often have to navigate multiple information artefacts to complete their tasks. Dispersed across various representational formats and information systems, these artefacts can lead to overlapping, redundant or even conflicting information, and to inefficiency in information retrieval and in knowledge workers' understanding. Despite a growing market of tools, the current body of knowledge does not explain how, and through what strategies, knowledge workers make sense of multi-artefact information tasks. Motivated by the human-centric nature of the problem, this PhD project employs experiments, both in lab studies and on crowdsourcing platforms, and uses a number of behavioral and performance measures to unpack the cognitive demands on knowledge workers as they make sense of dual artefact tasks and multi-artefact tasks respectively. The project aims to propose an integrative model of sensemaking and cognitive processing in multi-artefact information tasks. The findings contribute to a better understanding of sensemaking processes in various settings, inform modeling practice, and guide the design of supporting tools.</p>
      </abstract>
      <kwd-group>
        <kwd>Sensemaking</kwd>
        <kwd>Business process modeling</kwd>
        <kwd>Data curation</kwd>
        <kwd>Data quality</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        With information silos widespread and data increasingly accessible,
knowledge workers, including technical and business users, often rely on multiple information artefacts
across different systems to complete their tasks. According to IDC [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a typical knowledge
worker can spend 36% of their day searching for and consolidating information from
multiple artefacts, yet finds the required information only 56% of the time. Moreover, 61% of
knowledge workers regularly access four or more different artefacts to retrieve the information
they need for their work, and 15% access 11 or more. Dispersed across various
representational formats and information systems, these artefacts can lead to overlapping, redundant
or even conflicting information, and to inefficiency in information retrieval and in knowledge workers’
understanding.
      </p>
      <p>
        In practice, the process that knowledge workers follow when navigating
multi-artefact information tasks can be diverse and exploratory, and the
market has responded with a plethora of tools to support the ‘human-in-the-loop’ [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, although
researchers increasingly study the behaviour of knowledge workers in many
contexts [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], little attention has been paid to how knowledge workers make sense
of these multi-artefact information tasks. The current body of knowledge does not adequately
explain knowledge workers’ sensemaking behaviours and strategies when interacting with
these tasks.
      </p>
      <p>
        To explore this problem, we undertake exploratory studies to investigate knowledge workers’
behaviour in two settings that offer dual artefact and multi-artefact tasks respectively. For
the dual artefact setting, we consider business process management systems
and business rule management systems, in which two commonly used artefacts are business process
models and business rule repositories. When presented separately, these two artefacts are
known to cause a lack of shared understanding, as well as conflicts and redundancies that can lead to
inefficiencies and even compliance breaches [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Although a number of integrated modeling
approaches for business processes and rules have been proposed, there is limited knowledge on
how these approaches affect worker behavior and task performance.
      </p>
      <p>
        As for the setting of multi-artefact tasks, there is increasing evidence that knowledge workers,
including data scientists, engineers and analysts, can spend in excess of 80 percent of their
time and effort on the data curation process in a typical data science project [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. These
cost-intensive processes comprise a number of artefact tasks and are considered a drain on
analytic functions within organizations. Due to the inherent complexity of these tasks, the
bulk of data curation tasks still cannot feasibly and efficiently be addressed by machine-based
algorithms [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] without human intervention (e.g., manual inspection) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Moreover, the existing
tools that support data curation tasks are often domain-focused and challenging to use in
coordination with other program functionalities. Therefore, given the increasing demand for a
more cost-efficient data curation process, researchers have started to look at how knowledge
workers engage with data (e.g., [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). However, there is a paucity of research focusing on how
knowledge workers interact with various artefact information tasks and what processes they
follow while carrying out data curation activities.
      </p>
      <p>Accordingly, motivated by the human-centric nature of the problem, this PhD project employs
exploratory studies to investigate the behavior of knowledge workers engaging with
artefact information tasks in both a dual artefact setting and
a multi-artefact setting. The project aims to propose an integrative model of
sensemaking and cognitive processing in multi-artefact information tasks. We approach the
design of the research through a sensemaking lens and consider the foundational sensemaking
constructs of information foraging and information processing.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Research Goal</title>
      <p>The project is divided into three studies, including experiments in controlled lab studies and
crowdsourcing platforms to understand the cognitive demands on knowledge workers as
they make sense of multi-artefact information tasks. It uses a number of behavioral and
performance measures through the use of eye-tracking and electroencephalography (EEG)
devices in controlled lab experiments.</p>
      <p>The first study aims to investigate knowledge workers’ behavior in dual artefact tasks when
the form of integrated representation of the artefacts (namely business process models and
business rules) and task complexity change.</p>
      <p>The second study aims to understand how knowledge workers engage with multi-artefact
tasks in the data curation process. We will first investigate the data curation process specifically
related to data quality detection, and then how to build repeatable and efficient data curation processes
by harnessing the collective intelligence of a group of knowledge workers.</p>
      <p>The third study aims to propose an integrative model of sensemaking and cognitive processing
in multi-artefact information tasks by consolidating the results of the lab
studies with existing frameworks and theories in sensemaking and cognitive processing. In
addition, we will test the research model by collecting empirical data from a crowdsourcing
platform.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Related Work</title>
      <sec id="sec-3-1">
        <title>3.1. Sensemaking</title>
        <p>The sensemaking methodology was introduced by Dervin in 1972 to design a human
communications system, which was later developed as the model of the sensemaking triangle representing
how a person makes sense of the situation through a space-time context [8]. Russell et al.’s
“learning loop complex” [9] first proposed the cost structure of the sensemaking model, which
describes the process people use to understand and encode data to answer task-specific
questions. Since these seminal works, literature in various domains has contributed to theories and
models of sensemaking.</p>
        <p>More recently, there has been an increased focus on understanding how sensemaking operates
in an era of increasingly complex information artefacts [10]. For instance, researchers have used
a sensemaking perspective to study how individuals make sense of fairness assessment
systems in machine learning [11], as well as knowledge reuse [12], debugging strategies [13], and
support for knowledge acceleration in programming [14].</p>
        <p>The cognitive constructs of attention and memory have a natural and strong affinity with the two
phases in sensemaking models. Cognitive load theory [15, 16] provides proven mechanisms
through which these constructs can be operationalized. For example, attention and search
behaviour have been measured through eye-tracking devices, which can capture data on visual
scanning (eye movements) and attention (eye fixations) [17]. These data, in turn, can be used for
various behavioural measurements, such as cognitive load, visual association, visual cognition
efficiency, and intensity [18].</p>
        <p>While eye-tracking technology has a long history of use in medical and psychology
studies [19], its use in the context of data work with human-machine teaming is relatively
recent. However, it holds great promise for a deeper understanding of user behaviour in complex
tasks. To the best of our knowledge, existing sensemaking studies focus on qualitative or
perception-based measures, with limited use of behavioural and performance measures. Hence, we
consider the use of eye-tracking devices in a controlled experiment a novel and objective
means to capture and expose sensemaking behaviours and the interactive process by which
knowledge workers explore multi-artefact tasks in different settings.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dual Artefact Information Tasks in the Case of Business Process Models and Business Rules</title>
        <p>
          In the dual artefact task setting, our study considers the specific context of business process and
business rule modeling – two complementary approaches for modeling business activities,
which have multiple integration methods [20] to improve their individual representational
capacity. In summary, the integration methods can be categorized into three approaches with
distinct format and construction, namely: text annotation, diagrammatic integration, and link
integration [21]. Text annotation and link integration both use a textual expression to describe
the business rules and connect them with the corresponding section of the process model.
With link integration, visual links can explicitly connect corresponding rules with the relevant
process section. Diagrammatic integration relies on graphical process model constructs, such
as sequence flows and gateways, to represent business rules in the process model. Each of these
methods has strengths and weaknesses, as summarized in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], and thus a potential impact on a
knowledge worker’s understanding of a process.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Multi-artefact Information Tasks in Data Curation</title>
        <p>The importance and scope of data curation have increased multi-fold in the era of big data, due
to the prevalence of external and repurposed data in data science projects. A primary reason
for data curation is the large proportion of externally acquired datasets with different quality
levels. In fact, even internal data may have to be repurposed [22] to meet the specific needs of
a certain data science project. In either case, the data curation process constitutes a number
of multi-artefact tasks, which may include selection, classification, transformation, filtering,
imputation, integration/fusion, or validation [23].</p>
        <p>
          Currently, three main approaches are evident in the context of data curation, namely:
ad hoc/manual, automated, and crowd-sourced approaches. The manual approach is the most
common [23, 24]. However, data quality issues constitute a major challenge for
knowledge workers using a manual approach, as large datasets are likely to contain multiple
data quality issues, e.g., in completeness, accuracy, and consistency [
          <xref ref-type="bibr" rid="ref3">3</xref>
          , 25].
        </p>
        <p>
          To study knowledge workers, recent research outlined the work cycle of data scientists,
ranging from discovery to design [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. We note that the use of a crowd-sourcing approach for
building data curation processes from multiple crowd-sourced tasks is currently under-studied
and is a key objective of this project.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Study 1 - Sensemaking in Dual Artefact Tasks – The Case of Business Process Models and Business Rules</title>
      <p>In this study, we investigate how user behavior changes in dual artefact tasks when the form of
integrated representation of the artefacts (namely business process models and business rules)
and task complexity change. Using a sensemaking lens, we can distinguish
behavior aimed at developing model understanding from behavior aimed at task accomplishment.</p>
      <sec id="sec-5-1">
        <title>4.1. Study Design</title>
        <p>We use an experimental research design. In line with sensemaking foundations, we segment
the experiment into two phases: a searching and encoding phase (which we term the
understanding phase) and a task-specific information processing phase (which we term the answering
phase). The understanding phase commences when the participant first fixates on the experiment
screen, and the answering phase commences when the participant starts to type the answer in
the question area for the first time (see Fig. 1). Due to space limitations, the complete experiment
instruments are available for download<sup>1</sup>.</p>
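        <p>The phase boundary described above can be illustrated with a minimal sketch. It assumes that fixation onsets and the first keystroke are available as timestamps in milliseconds; the function name, variable names and values are hypothetical and do not reflect the study's actual analysis pipeline.</p>
        <preformat>
```python
def split_phases(fixation_times_ms, first_keystroke_ms):
    """Split fixation timestamps at the first keystroke.

    Fixations before the first keystroke belong to the understanding
    phase; fixations at or after it belong to the answering phase.
    """
    ordered = sorted(fixation_times_ms)
    understanding = [t for t in ordered if first_keystroke_ms > t]
    answering = [t for t in ordered if t >= first_keystroke_ms]
    return understanding, answering


# Hypothetical timestamps (ms since the stimulus appeared).
FIXATION_TIMES_MS = [120, 480, 900, 1500, 2100]
FIRST_KEYSTROKE_MS = 1000

understanding, answering = split_phases(FIXATION_TIMES_MS, FIRST_KEYSTROKE_MS)

# The understanding phase runs from the first fixation to the first keystroke.
understanding_duration_ms = FIRST_KEYSTROKE_MS - understanding[0]
print(understanding, answering, understanding_duration_ms)
```
        </preformat>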
        <p>The experiment data consists of a pre-experiment questionnaire, eye tracking log data, and
task performance data. The eye tracking data was collected through a Tobii Pro TX300 eye
tracker<sup>2</sup>, which captures timestamped data on fixations, gaze, saccades, etc. To capture
sensemaking behavior, we used measurements related to fixation durations and frequencies,
measurements related to fixations within specific areas of interest (AOIs), and transitions between AOIs.</p>
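        <p>As a minimal, hypothetical sketch of how such measures could be derived from a fixation log, the following Python snippet computes the fixation count and total fixation duration per AOI, and counts transitions between consecutive fixations that fall in different AOIs. The record layout, AOI names and values are illustrative assumptions; they are not the Tobii export format or the measures as implemented in the study.</p>
        <preformat>
```python
from collections import Counter, defaultdict

# Hypothetical fixation log: (timestamp_ms, duration_ms, aoi).
# AOI names and values are illustrative only.
FIXATIONS = [
    (0, 220, "process_model"),
    (250, 180, "process_model"),
    (460, 300, "business_rules"),
    (800, 150, "question"),
    (990, 260, "business_rules"),
    (1300, 200, "process_model"),
]


def aoi_fixation_stats(fixations):
    """Return {aoi: (fixation_count, total_duration_ms)}."""
    counts = Counter()
    durations = defaultdict(int)
    for _timestamp, duration, aoi in fixations:
        counts[aoi] += 1
        durations[aoi] += duration
    return {aoi: (counts[aoi], durations[aoi]) for aoi in counts}


def aoi_transitions(fixations):
    """Count transitions between consecutive fixations in different AOIs."""
    ordered = sorted(fixations)  # tuples sort by timestamp first
    transitions = Counter()
    for previous, current in zip(ordered, ordered[1:]):
        if previous[2] != current[2]:
            transitions[(previous[2], current[2])] += 1
    return transitions


stats = aoi_fixation_stats(FIXATIONS)
transitions = aoi_transitions(FIXATIONS)
print(stats)
print(dict(transitions))
```
        </preformat>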
        <p>The experiment instruments included a tutorial, the treatments and a questionnaire. Each
group of participants was first provided with a BPMN tutorial and was then offered a model
using one of the three different rule integration approaches (one approach per treatment group).
The scenario of the model and rules
originated from a travel booking diagram included in OMG’s BPMN 2.0 documentation<sup>3</sup>. We
ensured, through multiple revisions, that the models were informationally equivalent across all
three integration approaches, and that all confounding factors were held constant, including the
eye-tracking lab equipment and tutorial content. We limited neither the experiment duration nor
the word count of participants’ answers. The model was adjusted to ensure consistency
of format for each of the integration approaches, while providing some diversity in terms
of constructs and coverage to gain further insights into the relationship between integration
approaches and task complexity.</p>
        <p><sup>1</sup> The experiment materials can be downloaded from bit.ly/3N5Kr6O</p>
        <p><sup>2</sup> For more specifications of the eye tracker, please visit https://www.tobiipro.com/product-listing/tobii-pro-tx300/</p>
        <p><sup>3</sup> The model, which originates from OMG’s BPMN 2.0 examples, can be viewed at http://www.omg.org/cgi-bin/doc?dtc/1006-02</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Current Progress</title>
        <p>More details of the current results can be found in [26]. Our results show
that link integration yields better task performance in terms of accuracy and efficiency,
especially as task complexity increases. Additionally, our results provide some evidence that
diagrammatic integration yields better accuracy on local questions,
but also requires the most effort in the initial information foraging (understanding) phase.</p>
        <p>The findings from this study also form the basis for the next step of our investigation. We
will use complementary approaches such as cued retrospective ‘thinking-out-loud’ [27] and
biosensors (e.g., electroencephalography, captured by Emotiv<sup>4</sup>) to further explain
sensemaking behavior and cognitive processes. We also acknowledge a limitation of the
current research: we included only the basic constructs of business process models,
whereas advanced loop and nesting structures may introduce further complexities in
sensemaking. Therefore, we will also analyze the change in knowledge workers’ behavior over longer
tasks with more variability in task complexity to reveal further insights into sensemaking,
which may be especially valuable for training and work allocation purposes.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>5. Study 2 - Sensemaking in Multi-artefact Information Tasks – The Case of Data Curation</title>
      <p>As the first step, we aim to understand how knowledge workers engage with multi-artefact
tasks in data curation specifically related to data quality detection.</p>
      <sec id="sec-7-1">
        <title>5.1. Study Design</title>
        <p>To capture knowledge worker sensemaking behaviours while discovering data quality issues,
we used an experimental design method in a lab study with purpose-built experiment platforms
that mimic typical data exploration tools. The lab setting [25] enabled us to use advanced
tracking devices (e.g., eye-trackers and activity loggers) to capture the interaction behaviors.</p>
        <p>Our interface design is typical of several existing data exploration platforms that provide
UI areas with a similar arrangement; see, e.g., Talend Cloud API (jupyter.org), RapidMiner
(rapidminer.com), or PowerBI (powerbi.microsoft.com). The UI of our data curation platform
has three main panels: the DataOps area, which provides the internal functions, on the left; the working
console area in the middle; and the data view and toolkit area, used to view data and record data quality
annotations, on the right (see Fig. 2). We custom-built two experiment platforms, one
requiring manual coding to undertake data quality discovery and the other offering built-in
functions. On both experiment platforms, we kept all other variables constant and provided
equivalent information with the same interface design, including the same dataset and a set of
pre-defined functions.</p>
        <p><sup>4</sup> For more information about Emotiv, please see https://www.emotiv.com/</p>
        <p>(a) Experiment platform with coding [28].</p>
        <p>To provide internal data curation resources, we pre-defined 21 DataOps, ranging from
importing essential libraries to complex Boolean operations involving regular expressions [28]. This
set of DataOps is sufficient to complete all tasks in our experiment (i.e., participants do not
necessarily need to refer to external materials). The dataset includes 13,000 records and four
columns (ID, name, contact number and join date). We chose five of the most recognised and common
types of data quality issues [28] and injected them into the dataset with the help of the Parallel
Data Generation Framework [29] to provide the ground truth. The size of the dataset and the number
of injected errors ruled out manual annotation.</p>
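        <p>Because the injected issues provide the ground truth, a participant's annotations can in principle be scored against it. The following minimal sketch assumes that both the injected issues and the annotations are represented as sets of record IDs. This representation, the function name, the example IDs and the choice of precision and recall are illustrative assumptions, not the measures reported in [28].</p>
        <preformat>
```python
def score_annotations(ground_truth_ids, annotated_ids):
    """Compare annotated record IDs with injected (ground-truth) issue IDs.

    Returns (precision, recall); both are 0.0 when undefined.
    """
    truth = set(ground_truth_ids)
    found = set(annotated_ids)
    true_positives = len(truth.intersection(found))
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical record IDs: issues injected into the dataset versus
# records that one participant flagged as having a quality issue.
INJECTED_IDS = {3, 7, 12, 20}
ANNOTATED_IDS = {3, 7, 8}

precision, recall = score_annotations(INJECTED_IDS, ANNOTATED_IDS)
print(round(precision, 3), round(recall, 3))
```
        </preformat>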
        <p>The participants were required to complete the task of identifying and annotating the data
quality issues. They were only allowed to use the given browser throughout the experiment.
The experiment commenced with a pre-experiment survey, followed by a tutorial outlining
definitions and examples of data quality issues and a practice example; participants then started the
formal experiment whenever they felt ready. At the end of the experiment, the participants were
asked to complete a post-experiment survey. The surveys, based on [30], captured participant
perceptions of the experiment tasks and helped ensure internal validity.</p>
      </sec>
      <sec id="sec-7-2">
        <title>5.2. Current Progress</title>
        <p>More details of the current results can be found in [28, 31, 32]. Our findings
show that the approaches taken by the knowledge workers participating in our study were
often diverse and complementary, in that they identified different data quality issues
with different levels of effectiveness and robustness. This has implications for automatically
creating aggregated data curation processes through crowd intelligence.</p>
        <p>However, the current work is not without limitations: it was based on a lab experiment,
and it focused only on detecting data quality issues. Therefore, in the next step, we will
conduct experiments with real crowd workers to better understand the sensemaking process
in the complex artefact tasks of data curation, and to build effective, robust, and repeatable data
curation processes by learning from a crowd of knowledge workers.</p>
        <p>The work on the third study is still in the early stages. Based on the existing frameworks and theories in
sensemaking and cognitive processing, we plan to consolidate the findings of Studies 1
and 2 to propose an integrative model of sensemaking and cognitive processing in multi-artefact
information tasks. We will test the proposed research model and hypotheses using empirical
data collected from a crowdsourcing platform.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>7. Expected Contributions</title>
      <p>This research project allows us to understand the cognitive demands on knowledge workers as
they make sense of multi-artefact information tasks. Expected contributions include closing
the current gap in understanding of the sensemaking process of knowledge
workers in multi-artefact information tasks, contributing to sensemaking theory, informing
modelling practice, providing training guidelines for knowledge workers, and informing the design of
supporting tools and tasks.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This research is supported by a UQ RTP research scholarship. I would like to thank Prof. Shazia
Sadiq, Prof. Marta Indulska, and A/Prof. Gianluca Demartini for supervising this PhD project.</p>
    </sec>
    <sec id="sec-10">
      <title>References</title>
      <p>[8] B. Dervin, A theoretic perspective and research approach for generating research helpful to communication practice, Sense-making methodology reader: selected writings of Brenda Dervin (2003) 251–268.</p>
      <p>[9] D. M. Russell, M. J. Stefik, P. Pirolli, S. K. Card, The cost structure of sensemaking, in: Proceedings of the INTERACT’93 and CHI’93, 1993, pp. 269–276.</p>
      <p>[10] D. M. Russell, G. Convertino, A. Kittur, P. Pirolli, E. A. Watkins, Sensemaking in a senseless world: 2018 workshop abstract, in: Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, 2018, pp. 1–7.</p>
      <p>[11] Z. Gu, J. N. Yan, J. M. Rzeszotarski, Understanding user sensemaking in machine learning fairness assessment systems, in: Proceedings of the Web Conference, 2021, pp. 658–668.</p>
      <p>[12] M. X. Liu, A. Kittur, B. A. Myers, To reuse or not to reuse? a framework and system for evaluating summarized knowledge, Proceedings of the ACM on HCI 5 (2021) 1–35.</p>
      <p>[13] V. Grigoreanu, M. Burnett, S. Wiedenbeck, J. Cao, K. Rector, I. Kwan, End-user debugging strategies: A sensemaking perspective, ACM Transactions on Computer-Human Interaction (TOCHI) 19 (2012) 1–28.</p>
      <p>[14] M. X. Liu, S. Burley, E. Deng, A. Zhou, A. Kittur, B. A. Myers, Supporting knowledge acceleration for programming from a sensemaking perspective, in: Sensemaking Workshop at CHI Conference on Human Factors in Computing Systems, 2018.</p>
      <p>[15] F. Chen, J. Zhou, Y. Wang, K. Yu, S. Z. Arshad, A. Khawaji, D. Conway, Robust multimodal cognitive load measurement, Springer, 2016.</p>
      <p>[16] J. Sweller, P. Ayres, S. Kalyuga, Measuring cognitive load, in: Cognitive load theory, Springer, 2011, pp. 71–85.</p>
      <p>[17] A. T. Duchowski, Gaze-based interaction: A 30 year retrospective, Computers &amp; Graphics 73 (2018) 59–69.</p>
      <p>[18] K. Rayner, Eye movements in reading and information processing: 20 years of research, Psychological bulletin 124 (1998) 372.</p>
      <p>[19] M. A. Just, P. A. Carpenter, Eye fixations and cognitive processes, Cognitive psychology 8 (1976) 441–480.</p>
      <p>[20] G. Knolmayer, R. Endl, M. Pfahrer, Modeling processes and workflows by business rules, in: Business Process Management, Springer, 2000, pp. 16–29.</p>
      <p>[21] T. Chen, W. Wang, M. Indulska, S. Sadiq, Business process and rule integration approaches–an empirical analysis, in: International Conference on Business Process Management, Springer, 2018, pp. 37–52.</p>
      <p>[22] R. Zhang, M. Indulska, S. Sadiq, Discovering data quality problems, Business &amp; Information Systems Engineering 61 (2019) 575–593.</p>
      <p>[23] T. Hey, A. Trefethen, The data deluge: An e-science perspective, Grid computing: Making the global infrastructure a reality (2003) 809–824.</p>
      <p>[24] E. Rahm, H. H. Do, Data cleaning: Problems and current approaches, IEEE Data Eng. Bull. 23 (2000) 3–13.</p>
      <p>[25] C. Sutton, T. Hobson, J. Geddes, R. Caruana, Data diff: Interpretable, executable summaries of changes in distributions for data wrangling, in: Proceedings of the 24th ACM SIGKDD Conference, 2018, pp. 2279–2288.</p>
      <p>[26] T. Chen, S. Sadiq, M. Indulska, Sensemaking in dual artefact tasks–the case of business process models and business rules, in: International Conference on Conceptual Modeling, Springer, 2020, pp. 105–118.</p>
      <p>[27] T. Van Gog, F. Paas, J. J. Van Merriënboer, P. Witte, Uncovering the problem-solving process: Cued retrospective reporting versus concurrent and retrospective reporting, Journal of Experimental Psychology: Applied 11 (2005) 237.</p>
      <p>[28] T. Chen, L. Han, G. Demartini, M. Indulska, S. Sadiq, Building data curation processes with crowd intelligence, in: International Conference on Advanced Information Systems Engineering, Springer, 2020, pp. 29–42.</p>
      <p>[29] Y. Tay, Data generation for application-specific benchmarking, Proceedings of the VLDB Endowment 4 (2011) 1470–1473.</p>
      <p>[30] S. G. Hart, Nasa-task load index (nasa-tlx); 20 years later, in: Proceedings of the human factors and ergonomics society annual meeting, volume 50, 2006, pp. 904–908.</p>
      <p>[31] L. Han, T. Chen, G. Demartini, M. Indulska, S. Sadiq, On understanding data worker interaction behaviors, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020, pp. 269–278.</p>
      <p>[32] S. Yu, T. Chen, L. Han, G. Demartini, S. Sadiq, Dataops-4g: On supporting generalists in data quality discovery, IEEE Transactions on Knowledge and Data Engineering (2022).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schubmehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vesset</surname>
          </string-name>
          ,
          <article-title>The knowledge quotient: Unlocking the hidden value of information using search and content analytics, White paper</article-title>
          , IDC (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sambasivan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kapania</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Highfill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Akrong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Paritosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Aroyo</surname>
          </string-name>
          , “
          <article-title>everyone wants to do the model work, not the data work”: Data cascades in high-stakes ai</article-title>
          ,
          <source>in: Proceedings of the 2021 CHI</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Piorkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tsay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dugan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Erickson</surname>
          </string-name>
          ,
          <article-title>How data science workers work with data: Discovery, capture, curation, design, creation</article-title>
          ,
          <source>in: Proceedings of the 2019 CHI Conference</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Piorkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Fleming</surname>
          </string-name>
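          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Burnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Scaffidi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Bellamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jordahl</surname>
          </string-name>
          ,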
          <article-title>The whats and hows of programmers' foraging diets</article-title>
          ,
          <source>in: Proceedings of the CHI Conference</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>3063</fpage>
          -
          <lpage>3072</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Indulska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. W.</given-names>
            <surname>Sadiq</surname>
          </string-name>
          ,
          <article-title>Cognitive efforts in using integrated models of business processes and rules</article-title>
          ,
          <source>in: CAiSE Forum</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <source>Data Jujitsu</source>
          ,
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sadiq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dasu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Ilyas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Link</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Naumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>Data quality: The role of empiricism</article-title>
          ,
          <volume>46</volume>
          (
          <year>2018</year>
          )
          <fpage>35</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>