=Paper=
{{Paper
|id=Vol-3078/paper-81
|storemode=property
|title=A Taxonomy of Tools for Reproducible Machine Learning Experiments
|pdfUrl=https://ceur-ws.org/Vol-3078/paper-81.pdf
|volume=Vol-3078
|authors=Luigi Quaranta,Fabio Calefato,Filippo Lanubile
|dblpUrl=https://dblp.org/rec/conf/aiia/QuarantaCL21
}}
==A Taxonomy of Tools for Reproducible Machine Learning Experiments==
Luigi Quaranta, Fabio Calefato and Filippo Lanubile
Dept. of Computer Science, University of Bari, Via Edoardo Orabona 4, 70125 Bari BA, Italy

Abstract

The broad availability of machine learning (ML) libraries and frameworks makes the rapid prototyping of ML models a relatively easy task to achieve. However, the quality of prototypes is challenged by their reproducibility. Reproducing an ML experiment typically entails repeating the whole process, from data collection to model building, along with the multiple optimization steps that must be carefully tracked. In this paper, we define a comprehensive taxonomy to characterize tools for ML experiment tracking and review some of the most popular solutions under the lens of the taxonomy. The taxonomy and related recommendations may help data scientists orient themselves more easily and make an informed choice when selecting appropriate tools to shape the workflow of their ML experiments.

Keywords: Reproducibility, ML experiment, Collaboration

1. Introduction

Machine Learning (ML)-based components are today being massively adopted and integrated into traditional software products. The broad availability of ML libraries and frameworks makes the rapid prototyping of ML models relatively easy to achieve. On the other hand, it is challenging to translate ML prototypes into production-ready artifacts able to offer robust and scalable performance in real-world scenarios [1]. Sato et al. [2] suggest that solutions to many of the emerging challenges in this area might be built upon consolidated software engineering practices such as Continuous Delivery [3].

A prominent challenge that urgently needs to be addressed is experiment reproducibility. Reproducing an ML experiment means repeating the whole process, from data collection to model building and deployment, requiring, at every stage, multiple optimization steps that must be carefully tracked to ensure reproducibility. Experiment reproducibility is not only a fundamental trait of scientific experimentation, but it is also crucial for many industrial domains, such as banking and automotive, where ML is being employed to solve mission-critical tasks [4].

Ensuring experiment reproducibility can be even more complex in collaborative environments, as teams of data scientists have to safely share not only the code they develop but also the
In the industrial context, the ability to reproduce ML experiments is of paramount importance also to ease the aforementioned transition from an exploratory to a mature production phase, in which ML models are delivered alongside traditional software and carefully monitored over time. 2. Taxonomy Today we are witnessing a multitude of software solutions to support the reproducibility of ML experiments. The choice of the solution that best fits a researcher’s or a company’s needs is non-trivial, and the abundance of options can be overwhelming at first. To guide potential users in this plethora of available services, we have reviewed some of the most popular solutions and defined a comprehensive characterization of them. We used an inductive approach inspired by Grounded Theory [6], which we found useful to get a sense of this new-born and rapidly- changing area of the software market. We started with as few preconceptions as possible and strived to build a sound abstraction upon identifying recurring patterns in our reviews of software solutions. We then grouped common ideas into concepts and then concepts into categories, deriving the concise taxonomy shown in Fig. 1. While we cannot claim completeness (we did not try all the available software solutions), we went on examining up to 19 tools until we reached saturation, i.e., we were no more able to identify new tool characteristics through review. Please, refer to the auxiliary online material1 for a complete list of the tools examined. The taxonomy groups the tool offering into three main categories, whose corresponding concepts are described in Sections 2.2-2.4. 2.1. Tool Sample To make the taxonomy more evident, we complement it with a selection of four representative software solutions drawn from the tools we reviewed. Before delving into the particulars of our classification, we provide some details on the selected tool sample, as it will be used to exemplify the main ideas throughout the paper. 2.1.1. DVC. DVC2 is defined as an open-source version control system for machine learning projects. It offers a command-line interface heavily inspired by git . Developers well versed in using git get comfortable with DVC; on the other hand, those who have no previous experiences with git might find DVC to have a steep learning curve. The implementation also relies on git , adding support for cloud storage services and big dataset versioning. As such, DVC offers 1 https://github.com/collab-uniba/Software-Solutions-for-Reproducible-ML-Experiments 2 https://dvc.org CLI API Interaction Mode Cloud IDE AutoML Platform Data Preparation Workflow Model Building Coverage Model Deployment General Scikit-Learn Python TensorFlow Languages R Keras Language agnostic … … FLOSS License Proprietary Hosted Yes Notebook Support On-Premise No Yes Data Visualization No Tool Analysis Local Yes Characterization Support Web Dashboard Remote No Synchronous (co-editing) Collaboration Mode Asynchronous (comments) Local Computational Resources On-premise In-house Remote Third-party Integrated git-based Code Versioning File-sharing services External (Google Drive) Local Data Access Internal Remote Third-party Reproducibility Yes Data Versioning Support No Manual Experiment Yes Automatic Logging Mixed No Script Yes Configuration file Pipeline Creation Automatic No Visual Built-in Figure 1: The taxonomy inferred from the analysis of existing tools for reproducible ML experiments. 
full code and data provenance, enabling users to track the complete history of each artifact they produce. DVC also has experiment management capabilities: experiment metrics can be tracked, and pipelines are automatically detected and represented as a directed acyclic graph (DAG). With all these features in place, experiment runs can be easily repeated and compared. Finally, additional features are the push/pull commands, which enable model sharing as well as model deployment into production servers.

2.1.2. MLflow

MLflow (https://mlflow.org) is an open-source tool that can be used through an API as well as a command-line tool. It is composed of four modules. The tracking module is intended for experiment tracking and management, including the logging of parameters, code versions, metrics, and output files. Multiple runs can be compared using a web dashboard, which can be either served locally or run remotely (for sharing results). The projects module enables experiment reproducibility. Compared to DVC, which automatically tracks data and model provenance, the support for reproducibility appears weaker, as this module relies on conventions to be followed for the experiment to be repeatable. The models module offers model deployment support through the definition of a standard format for packaging machine learning artifacts that can be used with a variety of downstream tools (e.g., real-time serving through a REST API). Lastly, registry is a module that acts as a centralized model store, featuring model lineage, versioning, and other useful model annotation functions.

2.1.3. Google Colaboratory

Google Colaboratory (https://colab.research.google.com), also known as Colab, is a cloud-based IDE mimicking the popular Jupyter Notebook environment. Users are provided with a generous amount of computing resources; moreover, Colab can be easily wired to external cloud services to afford more computational power. Being built upon Google Drive, Colab offers storage capabilities (including a versioning feature with extended support for notebook diffs) and enables asynchronous collaboration, allowing users to co-edit a notebook and leave comments on its cells. It can also be linked to other Google Cloud Platform services.

2.1.4. H2O Driverless AI

H2O Driverless AI (www.h2o.ai/products/h2o-driverless-ai) is an AutoML platform that manages the whole end-to-end ML pipeline in complete autonomy. First, users load their training data on the platform, set a few parameters, and start the training process. Then, H2O Driverless AI seeks the best model within the users' constraints and eventually enables model deployment by creating a REST endpoint or by automatically running the model as a service in the cloud. Users can upload and share extensions in the form of Python recipes, that is, machine learning models, transformers, and scorers specialized for a particular domain (e.g., NLP, time series).

Table 1: The tool sample classified according to the features of the General category.

Tool              | Interaction Mode | Workflow Coverage                 | Languages         | License
DVC               | CLI              | All                               | Language agnostic | FLOSS (Apache 2.0)
MLflow            | API, CLI         | All                               | Python, R, Java   | FLOSS (Apache 2.0)
Google Colab      | Cloud IDE        | Data Preparation + Model Building | Python            | Proprietary
H2O Driverless AI | AutoML Platform  | All                               | (Python Recipes)  | Proprietary
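To give a flavor of the data provenance support described for DVC above (Section 2.1.1), the sketch below shows how a pinned version of a dataset could be retrieved through DVC's Python API; the repository URL, file path, and tag are hypothetical, and the call assumes the file is tracked by DVC in that repository.

```python
# A minimal sketch of retrieving a pinned dataset version from a DVC-tracked
# repository via DVC's Python API (dvc.api). The repository URL, file path,
# and git tag are hypothetical.
import dvc.api

train_csv = dvc.api.read(
    path="data/train.csv",                         # file tracked by DVC (hypothetical path)
    repo="https://github.com/example/ml-project",  # hypothetical project repository
    rev="v1.0",                                    # git tag or commit pinning the exact data version
)
print(f"Loaded {len(train_csv)} characters of training data")
```

Reading data through a repository and revision, rather than from an ad hoc local copy, is what lets a collaborator repeat the experiment against exactly the same inputs.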
2.2. General

The General category captures the basic connotations for comparing the existing software solutions, including how to interact with the tool, the covered phases of ML experiments, the programming languages supported, and the type of software license. As such, this category defines the context in which every further tool feature should be analyzed. Table 1 exemplifies the classification of our tool selection according to the features covered by this category.

2.2.1. Interaction Mode

The existing tools come in different flavors, diverging especially in terms of their interaction mode. We identify four main approaches: CLI, API, Cloud IDE, and AutoML Platforms.

CLI. Command-Line Interface (CLI)-based tools require users to manually invoke commands to perform data and code versioning, experiment logging, and the tracking of script executions. Sometimes the commands represent an interface to a local installation of the tool; in other cases, they allow interaction with a remote service. One of the main benefits of a CLI-based solution is that it does not require changes to the source code, and thus it can be seamlessly applied to already existing projects. However, CLI-based solutions generally provide less support for computational notebooks and do not offer a centralized platform for team collaboration (e.g., a web dashboard to share experiment results). As such, only a few of the reviewed tools rely exclusively on text-based interaction; they often combine the CLI with one of the other interaction modes.

API. Various tools offer an API that requires data scientists to generate a token and add a couple of boilerplate lines of code to enable automatic logging, as sketched at the end of this section. Solutions like MLflow and Comet offer a web dashboard that allows easy inspection and comparison of experiment runs and often include other useful visualization capabilities. The API server (along with the web server hosting the web dashboard, when present) can be installed and executed either locally or, more often, remotely.

Cloud IDE. A few software solutions offer a cloud IDE, which generally consists of an implementation of the Jupyter notebook environment running on a cloud infrastructure. The main advantages are the ease of sharing and portability, as the whole development environment is accessible with a few clicks from any connected machine. Cloud IDE services can also support co-editing (e.g., Google Colaboratory) and offer social features, such as the ability to leave document-level and cell-level comments.

AutoML Platforms. Automated Machine Learning (AutoML) solutions like H2O and DataRobot typically cover the whole end-to-end pipeline, starting from the collection of raw data all the way through to model deployment. Experiments conducted using AutoML platforms can be considered reproducible because they are fully automated processes. Yet, AutoML platforms are, of course, very different from the other kinds of tools. Their interaction model is a 'black box' that does not require coding skills for basic usage: users upload a dataset, optionally set some basic parameters in a web-based UI, and wait for the system to search for the best performing algorithm after fully automatic hyperparameter tuning. Of course, AutoML cannot substitute human expertise, because real-world problems hardly ever fit the simplistic nature of predefined cases. Yet these tools can be useful to teams that are still building their ML expertise, and to experts for a quick assessment of algorithm performance.
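To make the API interaction mode concrete, the following sketch shows the kind of boilerplate an API-based tool typically asks for, using MLflow's tracking API as an example; the tracking server URI and experiment name are hypothetical, and other tools may additionally require an API token.

```python
# A minimal sketch of the boilerplate typically needed to enable automatic
# experiment logging with an API-based tool (MLflow here); the tracking URI
# and experiment name are hypothetical.
import mlflow
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

mlflow.set_tracking_uri("http://localhost:5000")  # local or remote tracking server (hypothetical)
mlflow.set_experiment("iris-baseline")            # hypothetical experiment name
mlflow.autolog()                                  # log parameters, metrics, and models automatically

X, y = load_iris(return_X_y=True)
with mlflow.start_run():
    RandomForestClassifier(n_estimators=100).fit(X, y)  # the fit call is captured by autologging
```

The resulting runs can then be inspected and compared from the tool's web dashboard.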
2.2.2. Workflow Coverage

With the concept of workflow coverage, we identify the major phases of an ML experiment, namely data preparation, model building, and model deployment. Most of the tools reviewed offer support for all the major phases of the ML workflow, with the notable exception of Google Colab, which does not provide any explicit support for model deployment.

2.2.3. Languages

CLI tools tend to be language-agnostic. API tools and Cloud IDEs, instead, are for the most part language-specific. API tools have to be backed by a language-specific library, whereas Cloud IDEs, almost always based on Jupyter notebooks, inherit at least the language limitations of the Jupyter ecosystem and are often even more restrictive (see Google Colab). By far the most widely supported languages are Python and R, supported by all the reviewed tools. Along with Python comes support for the major machine learning and deep learning frameworks, such as TensorFlow, PyTorch, Keras, and Scikit-Learn.

2.2.4. License

Another relevant basic aspect characterizing the tools is their license. A fine-grained license classification is out of the scope of this article. We broadly distinguish between FLOSS (Free/Libre Open-Source Software) and proprietary solutions, the latter being much more common.

2.3. Analysis Support

The Analysis Support category groups the tool features designed to help data scientists with their analytical tasks, including collaborative data analysis and data modeling tasks. In Table 2, we report a classification of our tool sample with reference to the features covered by this category.

Table 2: The tool sample classified according to the features of the Analysis Support category.

Tool              | Notebook Support | Data Visualization | Web Dashboard | Collab. Mode                 | Computational Resources
DVC               | No               | No                 | Yes (remote)  | Async (push/pull commands)   | Local
MLflow            | Yes (on-premise) | No                 | Yes (local)   | No                           | Local + On-premise
Google Colab      | Yes (hosted)     | No                 | No            | Async (co-editing, comments) | Local + Remote (in-house or third-party)
H2O Driverless AI | No               | Yes                | Yes (remote)  | No                           | Remote (in-house or third-party)

2.3.1. Notebook Support

During their everyday work, data scientists make use of many tools. Among these, computational notebooks play a leading role in enabling fast and interactive prototyping while offering a lightweight and integrated form of experiment documentation. Moreover, computational notebooks facilitate collaboration among stakeholders, allowing professionals to report analytical results to diverse kinds of audiences effortlessly. Computational notebooks enable so-called 'computational narratives,' i.e., the embedding of complex computations into easy-to-follow narratives that make notebooks clearer than bare source code, especially given the intrinsic complexity of ML scripts. Jupyter Notebook and its direct descendant, JupyterLab, are by far the most widely adopted computational notebook systems [7]. Given the enormous spread of such tools, many of the software solutions for reproducible ML experiments have been built around them or at least designed to fully support their daily use.

By notebook support, we mean the possibility to use the main features of a tool from within a computational notebook. In general, because they are operated through simple library function calls, API tools can be leveraged from scripts as well as notebooks. For the same reason, notebooks lend themselves well to being used with Cloud IDEs.
On the contrary, CLI tools do not offer native support for computational notebooks (unless notebooks are treated as scripts, leveraging external utilities such as nbconvert). When notebook support is present, we distinguish the case in which notebooks are executed on local resources (on-premise) from that in which notebook instances are hosted on cloud resources (as in the case of Google Colab).

2.3.2. Data Visualization

The concept of data visualization refers to the availability of built-in data visualization features that are accessible from the tool GUI. Features that merely rely on language-specific plotting frameworks (e.g., matplotlib) and require users to write code do not qualify. We have found this to be quite a rare feature, offered exclusively by AutoML platforms, which tend to provide users with data visualizations proactively in order to guide them in the choice of basic AutoML parameters.

2.3.3. Web Dashboard

Many solutions (especially API tools) offer an integrated web dashboard to show records of experiment runs. Often such dashboards automatically display summary plots and statistics that enable immediate evaluation and comparison of experimental results. Web dashboards can be hosted on a local or remote server. In the latter case, they allow data scientists to share results with the team in real time.

2.3.4. Collaboration Mode

Collaboration mode refers to the presence and type of collaborative features supported. When available, we distinguish between synchronous collaboration, indicating co-editing capabilities, and asynchronous collaboration, covering mainly the ability to leave comments and reactions to the work of other colleagues and the presence of push/pull commands to share content. Both synchronous and asynchronous collaboration modes appear to be quite rare.

2.3.5. Computational Resources

With the growing popularity of the SaaS delivery model, it is not surprising that many of the software solutions reviewed are powered by remote computational resources. These cloud resources can be either in-house, i.e., offered by the same company that provides the ML tool, or third-party, i.e., supplied by an external cloud-computing provider such as Amazon Web Services, Microsoft Azure, or Google Cloud Platform. In some cases, tools are designed to be executed locally on the user's machine. In other cases, they can be deployed on-premise (e.g., on company servers).

Table 3: The tool sample classified according to the features of the Reproducibility Support category.

Tool              | Code Versioning                            | Data Access                      | Data Versioning | Experiment Logging | Pipeline Creation
DVC               | Yes (external, git-based)                  | Local + Remote (third-party)     | Yes             | Yes (manual)       | Yes (automatic)
MLflow            | Yes (external, git-based)                  | Local + Remote (third-party)     | No              | Yes (hybrid)       | Yes (config. file)
Google Colab      | Yes (file-sharing services - Google Drive) | Remote (internal or third-party) | Yes             | No                 | No
H2O Driverless AI | Yes (integrated)                           | Remote (internal or third-party) | Yes             | Yes (automatic)    | Yes (built-in)

2.4. Reproducibility Support

The category of Reproducibility Support groups the features that reinforce the repeatability of ML experiments. To ensure reproducibility, it is essential to track both the operations that are performed during each phase of the ML workflow and the artifacts that are produced after each operation. Indeed, an ML workflow is usually represented in the form of a pipeline, i.e., a sequence of processing steps in which the output of one step becomes the input of the next.
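As a small-scale illustration of this notion of pipeline, the sketch below chains a data preparation step and a model building step so that the output of the first feeds the second; the steps and parameters are illustrative and use scikit-learn rather than any of the reviewed tools.

```python
# A minimal sketch of a pipeline as a sequence of processing steps in which the
# output of one step becomes the input of the next; steps and parameters are
# illustrative, using scikit-learn rather than any of the reviewed tools.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline(steps=[
    ("scale", StandardScaler()),                  # data preparation step
    ("model", LogisticRegression(max_iter=200)),  # model building step
])

X, y = load_iris(return_X_y=True)
pipeline.fit(X, y)  # running the whole chain end to end
print(f"Training accuracy: {pipeline.score(X, y):.3f}")
```

Re-running such a chain from its first step, against the same versions of data and code, is what the features in this category aim to make routine.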
To ensure end-to-end reproducibility of an ML experiment, it is necessary to have access to the specific version of the data that was used when the model was trained in the first place, along with the specific version of the code. Moreover, the frameworks used to build ML models often depend on complex development environments that are themselves hard to reproduce. Hence, for ML experiments to be fully repeatable, dedicated software should support the versioning of the entire working environment. This is a complex matter; nevertheless, during the last few years, modern containerization technologies have been making it easier to accomplish. In Table 3, we report the classification of our tool sample according to the features pertaining to this category.

2.4.1. Code Versioning

Code versioning is an established practice of software engineering. In ML tools, code versioning can be a feature integrated into the software solution for experiment reproducibility or delegated to an external tool; in the latter case, it either leverages git-related technologies (e.g., GitHub or the local git installation itself, as in the case of DVC) or is based on other services such as Google Drive, which often come with their own built-in versioning technology.

2.4.2. Data Access and Versioning

Although versioning data may seem like a natural extension of code versioning, it comes with its own peculiar challenges. Data requires much more space to be stored, and deltas can be harder to represent. Furthermore, versioning data may entail providing detailed information about the full provenance of the data, starting from the original source all the way through the pipeline processing steps. Ensuring access to the specific version of data involved in a model training session entails a two-fold problem: (1) data access, i.e., ensuring that data is somehow made available to the team, which is not always the case, especially when a dataset is stored locally on the experimenter's machine and never backed up in a cloud storage service; (2) data versioning, i.e., ensuring that the data is versioned every time it gets processed. Many software solutions offer the possibility to store data remotely, leveraging cloud storage services. Some of them support data versioning, sometimes in lightweight flavors (e.g., tracking of SHA digests), sometimes offering complex versioning systems that track the full data provenance/lineage.

2.4.3. Experiment Logging

Model building generally requires many experiment runs based on a cyclic process of data cleaning, feature engineering, and hyperparameter tuning in which plenty of decisions are made. In the case of unsatisfactory results, decisions can be revised, resulting in a new experiment run. Thus, the number of experimental iterations can quickly escalate, making it difficult to keep track of the results without proper tooling. Most of the software solutions reviewed offer experiment logging. Data scientists can log relevant information about a specific experiment run, such as dataset sources and versions, project dependencies, hyperparameters, data visualizations, and metrics. This allows retrieving the rationale behind each decision and enables comparison across experiment runs. Tools supporting this feature usually demand manual intervention by the user, requiring CLI commands to be issued or API functions to be invoked in the scripts. In some cases, logging can be automatic (e.g., Driverless AI) or hybrid (e.g., MLflow, Neptune).
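At the manual end of this spectrum, a team could keep even a very lightweight log by pairing hyperparameters and metrics with the current git commit and a SHA digest of the dataset, echoing the code and data versioning concepts above; the sketch below does this with the Python standard library only, and the file paths and values are illustrative.

```python
# A lightweight, manual experiment log: hyperparameters and metrics paired with
# the current git commit and a SHA-256 digest of the dataset. Paths and values
# are illustrative; only the Python standard library is used.
import hashlib
import json
import subprocess
from datetime import datetime, timezone

def sha256_of(path: str) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

run_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "git_commit": subprocess.check_output(            # code version used for this run
        ["git", "rev-parse", "HEAD"], text=True
    ).strip(),
    "data_digest": sha256_of("data/train.csv"),       # lightweight data versioning (hypothetical path)
    "params": {"n_estimators": 100, "max_depth": 8},  # illustrative hyperparameters
    "metrics": {"accuracy": 0.93},                    # illustrative result
}

with open("runs.jsonl", "a") as log_file:             # append-only log of experiment runs
    log_file.write(json.dumps(run_record) + "\n")
```

Dedicated tools automate exactly this bookkeeping and add dashboards to compare the resulting records across runs.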
Ultimately, experiment logging can be further enriched with the automatic recording of system performance stats.

2.4.4. Pipeline Creation

Last but not least, pipeline creation is a fundamental concept that, together with the others collected in this category, enables the actual reproducibility of ML experiments. A pipeline can be defined in many ways. Sometimes the pipeline is automatically detected (as in the case of DVC); otherwise, users need to define it imperatively via scripts (e.g., Spell.run), declaratively via configuration files (e.g., MLflow), or even visually (e.g., RapidMiner).

3. Conclusions

We reviewed a large and varied set of tools for reproducible ML experiments. The landscape for these tools is very active right now: new alternatives keep popping up, and key features missing from one solution might be added to it as one reads. Still, by highlighting the key features, the taxonomy can be used to narrow down the search space to a few candidate solutions and to draw some guidelines for the final selection.

For data science teams with good software engineering experience, consider using a CLI-based solution, such as DVC. Familiarity with git and its push and pull model may drastically flatten the learning curve; plus, CLI-based solutions are very flexible and easy to fit within a pre-existing infrastructure via shell scripting.

If the crucial feature is the graphical comparison of experiment runs, consider using an API-based solution, such as MLflow. In general, API-based solutions come with a web dashboard that makes it easier and more pleasant to check for the optimal configurations. API-based solutions are also easy to integrate and 'just work' whenever one can make their code run.

If no solution checks all the needed boxes, consider adopting more than one. While the reviewed solutions are competitors, their approaches to ML experiment reproducibility can vary drastically, to the point that they can co-exist in a work environment. For example, one can use DVC for managing the reproducible pipeline in combination with MLflow for logging and graphically comparing results across experiment runs. With so many potential tool combinations, the sky is the limit. Our GitHub repository provides environment templates to speed up the configuration process, as well as a realistic case study taken from Kaggle to start fiddling with the tools.

References

[1] D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, D. Dennison, Hidden technical debt in machine learning systems, in: Advances in Neural Information Processing Systems, 2015, pp. 2503-2511.
[2] D. Sato, A. Wider, C. Windheuser, Continuous Delivery for Machine Learning - Automating the end-to-end lifecycle of Machine Learning applications, 2019. URL: https://martinfowler.com/articles/cd4ml.html.
[3] J. Humble, D. Farley, Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.
[4] M. Hutson, Artificial intelligence faces reproducibility crisis, Science 359 (2018) 725-726. URL: https://www.sciencemag.org/lookup/doi/10.1126/science.359.6377.725. doi:10.1126/science.359.6377.725.
[5] M. Kim, T. Zimmermann, R. DeLine, A. Begel, The Emerging Role of Data Scientists on Software Development Teams, in: Proceedings of the 38th International Conference on Software Engineering, ACM, 2016, pp. 96-107.
[6] K.-J. Stol, P. Ralph, B.
Fitzgerald, Grounded theory in software engineering research: a critical review and guidelines, in: Proceedings of the 38th International Conference on Software Engineering, ACM, Austin, Texas, 2016, pp. 120-131. URL: https://dl.acm.org/doi/10.1145/2884781.2884833. doi:10.1145/2884781.2884833.
[7] J. M. Perkel, Why Jupyter is data scientists' computational notebook of choice, Nature 563 (2018) 145-147.