<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>bupaRflow: A Workflow Interface for bupaR</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Brecht Steukers</string-name>
          <email>brecht.steukers@student.uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gert Janssenswillen</string-name>
          <email>gert.janssenswillen@uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerhardus A. W. M. van Hulzen</string-name>
          <email>gerard.vanhulzen@uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Vanhoenshoven</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benoît Depaire</string-name>
          <email>benoit.depaire@uhasselt.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <addr-line>Agoralaan</addr-line>
          ,
          <addr-line>3590 Diepenbeek</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>UHasselt - Hasselt University, Faculty of Business Economics</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>In recent years, the open-source process analytics tool bupaR has seen a significant increase in usage. Among the advantages are its functional programming design - making it inherently suitable for interactive data analysis - and its reproducibility. However, writing scripts is still out of the comfort zone for many professionals who might benefit from the insights of process analysis. In order to make bupaR accessible to a wider audience, this paper presents bupaRflow, a graphical interface on top of bupaR that combines the workflow paradigm with an analytical building block architecture.</p>
      </abstract>
      <kwd-group>
        <kwd>process mining</kwd>
        <kwd>event data</kwd>
        <kwd>process analytics</kwd>
        <kwd>functional programming</kwd>
        <kwd>visual programming</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Since the publication of the first R-package for exploratory and descriptive analysis of event data
in 2016 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the ecosystem of business process analytics in R has steadily grown in functionalities
as well as user base [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. In general, the use of script-based tools for process analytics such as
bupaR and PM4Py [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has several advantages. Firstly, the product of the analysis is not just the
results, but also the script that has led to these results, thereby making sure the analyses are
perfectly reproducible. Secondly, scripts bring transparency, as the steps undertaken
in the analysis are made explicit. Finally, script-based tools provide flexibility and extensibility, as
they are embedded within the data analytics ecosystems of R and Python.
      </p>
      <p>These advantages, together with the fact that bupaR is available open-source, have contributed
to its widespread use. Since the bupaR packages are freely available, they provide a perfect
starting point for professionals to experiment with process mining and discover its value.
However, the use of a programming language is still often regarded as a considerable adoption
barrier and can lead to steep learning curves. This makes script-based process analysis tools
a viable option for professionals with programming experience, but less so for professionals
without a programming background (or even a background in data analysis) who might also
benefit from the insights delivered by process analysis.</p>
      <p>In this paper, we present bupaRflow, a prototype graphical user interface built on top of
bupaR to create process analysis workflows. By using the concept of functional building blocks,
where each block represents a function that takes an input and turns it into an output,
bupaRflow preserves the transparency provided by functional programming. The user of
bupaRflow is able to perform process analysis using the core bupaR toolset, without the need
for any programming. In addition, user sessions are saved so that analyses can be revisited and
repeated at later moments.</p>
      <p>Section 2 discusses the design principles and major features of bupaRflow. Section 3 describes
its maturity, while Section 4 points to additional materials accompanying this demo, including
a screencast, a tutorial and instructions on how to access the tool. Section 5 concludes the paper
and discusses avenues for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Features</title>
      <p>
        In the following paragraphs, we discuss the functionality (Section 2.1), conceptual design (Section 2.2)
and architecture (Section 2.3) of bupaRflow.
      </p>
      <p><bold>2.1. bupaR functionality</bold></p>
      <p>
        bupaRflow currently supports all functionalities provided by the core bupaR packages: bupaR
(for main event log handling), processmapR (for creating directly-follows graphs and other
visualizations), and edeaR (for calculating descriptive measures and event log filtering). Extensions
towards other functionalities provided by the wider bupaR ecosystem are planned for
the future. The architecture (see Section 2.3) is conceived in such a way that packages
that comply with the design philosophy of the tidyverse [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] can be integrated straightforwardly.
      </p>
      <p><bold>2.2. Conceptual Design</bold></p>
      <p>
        The starting point for the design of bupaRflow was to preserve the aforementioned unique
qualities of script-based process analysis as much as possible, i.e. reproducibility, transparency,
and flexibility. Given the functional programming paradigm used by bupaR,
it followed naturally to take a visual programming approach, where each function forms an
analytical building block. When connected, these blocks form analytical workflows.
      </p>
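      <p>The building-block idea can be illustrated with a minimal sketch. The Python fragment below is not taken from bupaRflow, which builds on R and bupaR; the toy event records and the names filter_activity, count_per_case and run_workflow are hypothetical and only show how blocks that each turn one input into one output can be chained into a workflow.</p>
      <preformat preformat-type="code">
```python
from functools import reduce

# Hypothetical toy event log: one dict per event (not a bupaR data structure).
EVENTS = [
    {"case": "p1", "activity": "Registration"},
    {"case": "p1", "activity": "Triage"},
    {"case": "p2", "activity": "Registration"},
    {"case": "p2", "activity": "Triage"},
    {"case": "p2", "activity": "X-Ray"},
]


def filter_activity(name):
    """Return a block that keeps only the events of the given activity."""
    def block(events):
        return [e for e in events if e["activity"] == name]
    return block


def count_per_case(events):
    """Block that counts the remaining events per case."""
    counts = {}
    for e in events:
        counts[e["case"]] = counts.get(e["case"], 0) + 1
    return counts


def run_workflow(blocks, data):
    """Feed the output of each block into the next one, left to right."""
    return reduce(lambda value, block: block(value), blocks, data)


# A workflow is an ordered list of connected blocks.
workflow = [filter_activity("Triage"), count_per_case]
result = run_workflow(workflow, EVENTS)
print(result)
```
      </preformat>
      <p>Because every block has the same shape (one input, one output), the ordered list of blocks is itself a readable description of the analysis, which is the kind of transparency the visual approach aims to retain.</p>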
      <p>The set of workflows illustrated in Figure 1 performs several analyses on the example patients
dataset. Two different process maps are made, with different configurations (which cannot be observed
in the screenshot). Furthermore, the data is filtered on trace frequency, after which throughput
times are calculated and plotted. In addition, the filtered traces are shown using the trace
explorer. For more information on these workflows, we refer to the tutorial and screencast.</p>
      <p>
        In the field of data science, this visual programming approach is mostly known from tools such as
KNIME [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and RapidMiner [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. It should be noted that an extension to RapidMiner for process
mining, called RapidProM [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], exists. However, as the main focus of bupaRflow is to make
process analysis more accessible to professionals without a programming or even data analysis
background, it was specifically decided to create a standalone, dedicated application rather
than an extension to one of the existing tools, as the latter might be unfamiliar and thus form
another barrier to be overcome.
      </p>
      <p>
        Using this visual programming approach preserves the transparency that comes with
script-based process analysis. User management allows the analysis to be saved and revisited later.
However, some flexibility and extensibility are sacrificed. Users cannot add new blocks
themselves, while combining bupaR functionalities with other libraries is only possible if these
are explicitly included in the application. Currently, this is only done for functionalities of
the tidyverse [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the usage of which can be seamlessly integrated with bupaR. In contrast, an
advantage that bupaRflow has over bupaR itself is that it allows parts of analysis workflows to
be reused by creating several branches after a specific block, as can be seen in Figure 1.
      </p>
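      <p>The reuse of a shared workflow part by several branches can be sketched as follows. This is an illustrative Python fragment under our own assumptions, not bupaRflow code: the trace frequencies, the block names and the helper run_branches are hypothetical. The shared upstream block is evaluated once and its output is passed to every branch.</p>
      <preformat preformat-type="code">
```python
CALLS = {"filter": 0}  # counts how often the shared block actually runs


def filter_frequent(traces):
    """Shared upstream block: keep traces that occur at least twice."""
    CALLS["filter"] += 1
    return {t: n for t, n in traces.items() if n >= 2}


def trace_lengths(traces):
    """Branch 1: number of activities in each remaining trace."""
    return {t: len(t) for t in traces}


def total_cases(traces):
    """Branch 2: number of cases covered by the remaining traces."""
    return sum(traces.values())


def run_branches(shared_block, branches, data):
    """Run the shared block once, then pass its output to every branch."""
    shared_output = shared_block(data)
    return {name: branch(shared_output) for name, branch in branches.items()}


# Hypothetical trace frequencies: trace (tuple of activities) -> case count.
TRACES = {("A", "B", "C"): 5, ("A", "C"): 2, ("A", "B", "B", "C"): 1}

results = run_branches(
    filter_frequent,
    {"lengths": trace_lengths, "cases": total_cases},
    TRACES,
)
print(results)
print(CALLS["filter"])
```
      </preformat>
      <p>In a linear script, the same effect requires storing the intermediate result in a variable and referring to it twice; on a canvas, the branching point makes that reuse visible.</p>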
      <p>
        It should be noted that this approach is different from PMTK, the web-based process mining
tool on top of PM4Py, which does not use a visual programming approach but provides an
analysis toolkit using a dashboard approach, not unlike existing commercial tools.
      </p>
      <p><bold>2.3. Architecture</bold></p>
      <p>
        bupaRflow is conceived as a web application using an API to bupaR in the back-end. A
conceptual overview of the architecture can be seen in Figure 2. The interactive web interface was
created using Vue.js [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] while the API was created using plumber [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In order to enhance
performance, the app allows users to mark a specific block as
persistent, so that it is not recomputed at each run. Firebase is used to store the data. The
back-end is designed in such a way that new functions can be added with
minimal effort, i.e. by adding them to a configuration. The app is currently hosted on Azure.
      </p>
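      <p>Two of these architectural ideas, functions registered through a configuration and persistent blocks, can be sketched in a few lines. The Python fragment below is a sketch under our own assumptions and does not reproduce the actual back-end, which is written in R with plumber: the CONFIG layout, the Block class and the functions head and select are hypothetical. The sketch also assumes that the input of a persistent block does not change between runs.</p>
      <preformat preformat-type="code">
```python
def head(rows, n):
    """Keep the first n rows."""
    return rows[:n]


def select(rows, column):
    """Extract one column from every row."""
    return [row[column] for row in rows]


# Hypothetical configuration: adding a function means adding one entry here.
CONFIG = {
    "head": {"function": head, "parameters": ["n"]},
    "select": {"function": select, "parameters": ["column"]},
}


class Block:
    """One configured block; a persistent block caches its first result."""

    def __init__(self, name, persistent=False, **params):
        entry = CONFIG[name]  # raises KeyError for unknown block names
        missing = [p for p in entry["parameters"] if p not in params]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        self.function = entry["function"]
        self.params = params
        self.persistent = persistent
        self.runs = 0  # number of real computations, for illustration
        self._cached = None
        self._has_cache = False

    def run(self, data):
        if self.persistent and self._has_cache:
            return self._cached  # skip recomputation on later runs
        self.runs += 1
        result = self.function(data, **self.params)
        if self.persistent:
            self._cached, self._has_cache = result, True
        return result


ROWS = [{"case": "p1"}, {"case": "p2"}, {"case": "p3"}]
first_two = Block("head", persistent=True, n=2)
cases = Block("select", column="case")

for _ in range(3):  # three runs of the same two-block workflow
    output = cases.run(first_two.run(ROWS))

print(output, first_two.runs, cases.runs)
```
      </preformat>
      <p>In this sketch the persistent block is computed once over three runs, while the non-persistent block is recomputed every time. A real implementation would additionally have to invalidate the cached result when the upstream data or the block parameters change.</p>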
    </sec>
    <sec id="sec-3">
      <title>3. Maturity</title>
      <p>
        The bupaRflow tool presented in this paper should be regarded as a first prototype. It has not
been made publicly available before, and as such, case studies using the tool are not available
yet. Nonetheless, it stands upon the foundation of the bupaR ecosystem. Since its conception,
the bupaR ecosystem has amassed more than 800K downloads in 158 countries across the globe,
thereby encouraging the further adoption of process mining. The user base of bupaR is highly
varied, ranging from service and product industries to governmental agencies and
NGOs. Over the years, a considerable number of research papers and case studies using bupaR
have been published [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Further materials</title>
      <p>For reviewing purposes, bupaRflow has been made available via this link: https://buparflow.azurewebsites.net/.
It can be tested anonymously by using the "Proceed without an account"
option. A 4-minute screencast is available here: https://tinyurl.com/bpmbuparflow. A tutorial
can be found here: https://gertjanssenswillen.github.io/bpmbuparflowdemo</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Work</title>
      <p>This paper presented bupaRflow, a web application that allows the use of bupaR functionalities
through visual programming. It is targeted at professionals without any background in data
analysis or programming, who want to discover how process mining can bring additional
insights to their conventional analyses.</p>
      <p>The tool as presented in this paper is a prototype, and several improvements are foreseen for
the future. While user management is in place, it currently only allows saving a single canvas.
In order to improve the user experience, the design of the interface needs further improvement,
and proper error handling needs to be provided. Besides adding further functionalities
beyond the bupaR core, functionalities to export data and save outputs need to be provided.
Additional functionalities outside of the bupaR ecosystem, for instance for data import, can be
considered as well.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Swennen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Janssenswillen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vanhoof</surname>
          </string-name>
          ,
          <article-title>Capturing process behavior with log-based process metrics</article-title>
          ,
          <source>in: Proceedings of the 5th International Symposium on Data-driven Process Discovery and Analysis, CEUR Workshop Proceedings</source>
          , RWTH Aachen University,
          <year>2015</year>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Janssenswillen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mannhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Creemers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jooken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Van Houdt</surname>
          </string-name>
          ,
          <article-title>Extensions to the bupaR ecosystem: An overview</article-title>
          ,
          <source>in: Proceedings of the ICPM Doctoral Consortium and Tool Demonstration Track, CEUR Workshop Proceedings</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Janssenswillen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Swennen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vanhoof</surname>
          </string-name>
          ,
          <article-title>bupaR: Enabling reproducible business process analysis</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>163</volume>
          (
          <year>2019</year>
          )
          <fpage>927</fpage>
          -
          <lpage>930</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>van Zelst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Process mining for Python (PM4Py): Bridging the gap between process- and data science</article-title>
          ,
          <source>arXiv preprint arXiv:1905.06169</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wickham</surname>
          </string-name>
          ,
          <article-title>The tidyverse</article-title>
          ,
          <source>R package ver 1</source>
          (
          <year>2017</year>
          )
          1
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Berthold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cebron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Gabriel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kötter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Meinl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ohl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wiswedel</surname>
          </string-name>
          ,
          <article-title>KNIME - the Konstanz Information Miner: version 2.0 and beyond</article-title>
          ,
          <source>ACM SIGKDD Explorations Newsletter</source>
          <volume>11</volume>
          (
          <year>2009</year>
          )
          <fpage>26</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Mierswa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wurst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Klinkenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scholz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Euler</surname>
          </string-name>
          ,
          <article-title>YALE: Rapid prototyping for complex data mining tasks</article-title>
          ,
          <source>in: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>935</fpage>
          -
          <lpage>940</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bolt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>van Zelst</surname>
          </string-name>
          ,
          <article-title>RapidProM: Mine your processes and not just your data</article-title>
          ,
          <source>arXiv preprint arXiv:1703.03740</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>You</surname>
          </string-name>
          , Vue.js framework,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schloerke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <source>plumber: An API Generator for R</source>
          ,
          <year>2022</year>
          . https://www.rplumber.io, https://github.com/rstudio/plumber.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Uzir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gašević</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jovanović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Matcha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-A.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fudge</surname>
          </string-name>
          ,
          <article-title>Analytics of time management and learning strategies for effective online learning in blended environments</article-title>
          ,
          <source>in: Proceedings of the tenth international conference on learning analytics &amp; knowledge</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>392</fpage>
          -
          <lpage>401</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Larsson</surname>
          </string-name>
          ,
          <article-title>Digitization or equality: When government automation covers some, but not all citizens</article-title>
          ,
          <source>Government Information Quarterly</source>
          <volume>38</volume>
          (
          <year>2021</year>
          )
          <fpage>101547</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Warschauer</surname>
          </string-name>
          ,
          <article-title>“We're looking good”: Social exchange and regulation temporality in collaborative design</article-title>
          ,
          <source>Learning and Instruction</source>
          <volume>74</volume>
          (
          <year>2021</year>
          )
          <fpage>101443</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>González-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tellería-Orriols</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Estupiñán-Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bernal-Delgado</surname>
          </string-name>
          ,
          <article-title>Construction of empirical care pathways process models from multiple real-world datasets</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          <volume>24</volume>
          (
          <year>2020</year>
          )
          <fpage>2671</fpage>
          -
          <lpage>2680</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>