<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MPVIS - A Python Library for Multi-Perspective Visualization in Process Mining</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicolás Abarca-Quiroga</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ignacio Velásquez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcos Sepúlveda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, School of Engineering, Pontificia Universidad Católica de Chile</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>MPVIS 1.0.3; GNU Affero General Public License v3.0; Python 3, PM4Py, Graphviz; Microsoft Windows, GNU/Linux, macOS; https://bit.ly/mpvis-tutorial; https://bit.ly/mpvis-doc; https://github.com/nicoabarca/mpvis; https://bit.ly/mpvis-tutorial-video</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Process models are commonly used to conduct exploratory analyses during the initial stages of BPM projects. Models depicting the process behavior can be discovered through process mining. These models allow visualizing the control-flow or performance perspectives (usually through the time dimension). However, current implementations typically allow only one perspective/dimension to be displayed at the same time. This work introduces MPVIS, a Python library that allows generating process models, particularly directly-follows graphs and directed rooted trees, that visualize multiple perspectives and performance dimensions simultaneously. The library is expected to be useful for conducting exploratory analyses during the initial phases of the BPM lifecycle by enabling the identification of multidimensional insights and improvement opportunities.</p>
      </abstract>
      <kwd-group>
        <kwd>Process mining</kwd>
        <kwd>Process discovery</kwd>
        <kwd>Performance analysis</kwd>
        <kwd>Multi-perspective visualization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction &amp; Motivation</title>
      <p>
        Process mining (PM) is a discipline that enables the analysis of processes based on event logs
recording their executions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. One of its main tasks is process discovery, in which models
depicting the observed process behavior are derived from event data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These models typically
allow the visualization of the control-flow perspective, revealing alternate execution paths and
rework cycles, and can be extended to incorporate additional performance information such as
activity frequency, duration [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], or cost [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Although multiple process performance dimensions, like time, cost, quality, and flexibility,
are well established in the literature [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], most existing tools and methods allow only a single
perspective or performance dimension to be displayed at a time. Some systems enable secondary
metrics to be included in nodes and arcs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], but the visualization still emphasizes a primary
metric, with color encoding and visual emphasis determined solely by it. This limitation impedes
the ability to identify trade-offs and interactions between different perspectives.
      </p>
      <p>
        To address this gap, we present MPVIS, a Python library for the simultaneous visualization
of multiple perspectives and performance dimensions within a single process model. MPVIS
implements both Directly-Follows Graphs (DFGs) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Directed Rooted Trees (DRTs) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
extending them to allow multi-perspective and multidimensional analyses. By enabling this
form of integrated visualization, MPVIS supports richer exploratory analyses during the early
phases of the BPM lifecycle, potentially revealing improvement opportunities that might remain
hidden if perspectives are analyzed separately.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Multi-perspective process discovery has been addressed in several studies, including clustering
approaches that integrate the control-flow, data, and time perspectives [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and dashboards
that summarize multiple perspectives based on process stages [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In process model
visualization, some works have proposed models incorporating both the control-flow and data
perspectives [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], annotated with performance and frequency information [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], or enriched
with context-aware operators [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Despite these advances, existing process mining tools such as PM4Py [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Celonis [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
Apromore [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and bupaR [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] either restrict the visualization to a single primary metric or
limit secondary metrics to auxiliary textual annotations. Moreover, while DFGs are widely
implemented, DRTs remain a recently proposed model and are not supported in most tools,
with the only known implementation focusing solely on cost metrics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Library Overview</title>
      <p>MPVIS is implemented as a Python library compatible with PM4Py. Its architecture is structured
around three main modules: preprocessing, model discovery, and visualization. The overall
architecture is illustrated in Figure 1.</p>
      <p>Event logs can first be preprocessed using MPVIS’s own grouping, filtering and pruning
functions or using PM4Py’s native filters, which reduce the complexity of the resulting models
by aggregating activities or reducing variants.</p>
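      <p>As a rough illustration of what variant-based pruning does, the sketch below keeps only the cases that follow the k most frequent variants. It uses plain Python rather than MPVIS’s own functions; the helper name keep_top_variants and the (case_id, activity) tuple layout are assumptions made for this example.</p>
      <preformat>
```python
from collections import Counter, defaultdict


def keep_top_variants(events, k):
    """Keep only events of cases that follow one of the k most frequent variants.

    `events` is a list of (case_id, activity) tuples already sorted by time
    within each case. Hypothetical helper, not part of the MPVIS API.
    """
    # Rebuild each case's trace (its ordered sequence of activities).
    traces = defaultdict(list)
    for case_id, activity in events:
        traces[case_id].append(activity)

    # A variant is a distinct trace; count how many cases follow each one.
    variant_counts = Counter(tuple(trace) for trace in traces.values())
    kept_variants = {variant for variant, _ in variant_counts.most_common(k)}

    return [
        (case_id, activity)
        for case_id, activity in events
        if tuple(traces[case_id]) in kept_variants
    ]


events = [
    ("c1", "A"), ("c1", "B"), ("c1", "C"),
    ("c2", "A"), ("c2", "B"), ("c2", "C"),
    ("c3", "A"), ("c3", "C"),
]
pruned = keep_top_variants(events, k=1)
```
      </preformat>
      <p>With k=1, only cases c1 and c2 remain, since they share the most frequent variant; the model discovered from the pruned log is therefore smaller.</p>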
      <p>From these event logs, MPVIS can generate two types of visualizations. The Multi-Perspective
DFG (MP-DFG) extends the traditional DFG by splitting each node into multiple sections, each
encoded with a color range representing a different perspective or performance dimension, such
as control-flow frequency, activity duration, and activity cost. Similarly, arcs display numerical
annotations corresponding to the frequency and waiting times of arcs. Figure 2 (b) visualizes
a discovered Multi-Perspective DFG, containing control-flow, time, and cost information of
activities for the event log in Figure 2 (a).</p>
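      <p>The arc annotations described above can be approximated with a few lines of plain Python. The following sketch is not MPVIS’s implementation: it assumes numeric timestamps and a hypothetical discover_dfg helper, and it computes, for every directly-follows pair, its frequency and mean waiting time (the gap between the completion of one activity and the start of the next).</p>
      <preformat>
```python
from collections import defaultdict
from statistics import mean


def discover_dfg(events):
    """Return {(a, b): {"frequency": n, "mean_waiting": w}} per directly-follows pair.

    `events` holds (case_id, activity, start, complete) tuples with numeric
    timestamps (for example, minutes). Illustrative only, not MPVIS internals.
    """
    by_case = defaultdict(list)
    for case_id, activity, start, complete in events:
        by_case[case_id].append((start, complete, activity))

    waits = defaultdict(list)
    for case_events in by_case.values():
        case_events.sort()  # order each case by start time
        for (_, prev_complete, a), (next_start, _, b) in zip(case_events, case_events[1:]):
            # Waiting time: gap between the end of `a` and the start of `b`.
            waits[(a, b)].append(next_start - prev_complete)

    return {
        arc: {"frequency": len(values), "mean_waiting": mean(values)}
        for arc, values in waits.items()
    }


events = [
    ("c1", "Register", 0, 5), ("c1", "Review", 10, 20), ("c1", "Close", 25, 30),
    ("c2", "Register", 0, 4), ("c2", "Review", 10, 15), ("c2", "Review", 18, 22),
    ("c2", "Close", 30, 31),
]
dfg = discover_dfg(events)
```
      </preformat>
      <p>In this toy log, the arc from Register to Review occurs twice with a mean waiting time of 5.5, and the Review self-loop reveals one rework cycle.</p>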
      <p>The Multi-Dimensional DRT (MD-DRT) applies a similar strategy but focuses on representing
all process variants within a single acyclic graph, with state nodes segmented into sections
representing time, cost, quality (measured as the number of rework activities in cases), and
flexibility (measured as the number of optional activities in cases). This approach enables the
joint analysis of variant-level behavior across dimensions. MPVIS also includes functionality
to aggregate non-bifurcating paths and prune the tree’s depth to manage complexity, which
is particularly valuable for large or highly variable logs. Figure 2 (c) visualizes a discovered
MD-DRT containing information for the time, cost, quality, and flexibility dimensions for the
event log in Figure 2 (a).</p>
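      <p>To make the tree structure concrete, the sketch below builds a prefix tree of variants in which every node accumulates the lead time and cost of the cases passing through it. The node layout and the build_drt name are assumptions for illustration only; MPVIS’s actual data structure may differ.</p>
      <preformat>
```python
def build_drt(cases):
    """Build a prefix tree of variants.

    `cases` maps case_id to (trace, lead_time, total_cost), where `trace` is a
    list of activity names. Each node accumulates the values of the cases that
    pass through it. Illustrative sketch; the node layout is an assumption.
    """
    def new_node():
        return {"children": {}, "cases": 0, "lead_time": 0.0, "cost": 0.0}

    root = new_node()
    for trace, lead_time, total_cost in cases.values():
        node = root
        for activity in trace:
            # Reuse the child for this activity if the prefix was seen before.
            node = node["children"].setdefault(activity, new_node())
            node["cases"] += 1
            node["lead_time"] += lead_time  # summed; divide by cases for a mean
            node["cost"] += total_cost
    return root


cases = {
    "c1": (["A", "B", "C"], 10.0, 100.0),
    "c2": (["A", "B", "D"], 20.0, 50.0),
    "c3": (["A", "C"], 5.0, 30.0),
}
tree = build_drt(cases)
node_a = tree["children"]["A"]
```
      </preformat>
      <p>All three cases share the prefix A, so that node summarizes the whole log, while its children B and C split the variants without introducing any cycle.</p>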
      <p>To facilitate their use, both the MP-DFG and MD-DRT discovery and visualization functions
were modeled on PM4Py’s functions, as outlined in Listing 1.</p>
      <preformat>import pandas as pd
import mpvis

# Read event log using pandas utilities, then format it using MPVIS
event_log = pd.read_csv('event_log.csv', sep=';')
log_format = {'case:concept:name': 'case_id', 'concept:name': 'activity',
              'time:timestamp': 'complete', 'start_timestamp': 'start', 'cost:total': 'cost'}
event_log = mpvis.log_formatter(event_log, log_format)

# Discover and then view MP-DFG (sa = start activities, ea = end activities)
dfg, sa, ea = mpvis.mpdfg.discover_multi_perspective_dfg(event_log)
mpvis.mpdfg.view_multi_perspective_dfg(dfg, sa, ea)

# Discover and then view MD-DRT
drt = mpvis.mddrt.discover_multi_dimensional_drt(event_log)
mpvis.mddrt.view_multi_dimensional_drt(drt)</preformat>
      <p>Listing 1: Minimal code for process model discovery and visualization</p>
    </sec>
    <sec id="sec-5">
      <title>4. Features &amp; Innovations</title>
      <p>MPVIS’s novelty lies in its ability to visualize multiple process perspectives and performance
dimensions simultaneously, while maintaining the interpretability of single-perspective process
mining visualizations. For the MP-DFG, this is achieved through a multi-section coloring
scheme for nodes and arcs. Nodes are split into three sections, each represented by a different
color range: blue is used for representing the control-flow perspective (activities’ frequency),
red is used for representing the time dimension (activities’ duration), and green is used for
representing the cost dimension (activities’ cost). Arcs contain blue numbers representing the
control-flow perspective (frequency of directly-follows activity relations) and red numbers
representing the time dimension (activities’ waiting times). Distinct aggregation metrics can be
configured for every perspective: absolute and relative frequencies at the activity or case level
for the control-flow perspective, and mean, median, total, maximum, minimum, or standard
deviation for the time and cost dimensions.</p>
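      <p>One way to picture how an aggregated metric becomes a shade within a color range is linear interpolation between a light and a dark tone. The sketch below is only an assumption about how such a mapping could work, not a description of MPVIS’s rendering code; the names AGGREGATIONS and color_for, the RGB endpoints, and the choice of population standard deviation are all illustrative.</p>
      <preformat>
```python
from statistics import mean, median, pstdev

# Aggregation metrics named in the text; pstdev is an assumed choice,
# since the source does not say which standard deviation is used.
AGGREGATIONS = {
    "mean": mean, "median": median, "total": sum,
    "max": max, "min": min, "stdev": pstdev,
}


def color_for(value, low, high, light=(255, 235, 235), dark=(180, 0, 0)):
    """Interpolate linearly between a light and a dark RGB shade."""
    ratio = 0.0 if high == low else (value - low) / (high - low)
    channels = (round(l + ratio * (d - l)) for l, d in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Activity durations (for example, in minutes) collected per activity.
durations = {"Register": [5, 4], "Review": [10, 5, 4], "Close": [5, 1]}
mean_duration = {a: AGGREGATIONS["mean"](v) for a, v in durations.items()}

low, high = min(mean_duration.values()), max(mean_duration.values())
colors = {a: color_for(v, low, high) for a, v in mean_duration.items()}
```
      </preformat>
      <p>Here the activity with the shortest mean duration (Close) receives the lightest red and the one with the longest (Review) the darkest; swapping the aggregation key changes which activities stand out.</p>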
      <p>
        For the MD-DRT, a multi-section coloring scheme is also used for nodes. Each section of
the nodes aggregates the information (total, accumulated, remaining values) of cases that flow
through it and is represented by a different color range: red is used for representing the time
dimension (lead time of cases), green is used for representing the cost dimension (total cost
of cases), blue is used for the quality dimension (number of rework activities of cases), and
purple is used for the flexibility dimension (number of optional activities of cases). An activity
execution is considered as rework if it has been previously executed during a case [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. An
activity is optional if it does not occur in at least one case [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Arcs contain numbers indicating
the information of activities flowing through them: their frequency, their service time, their
cost, and whether they are considered rework and/or optional activities.
      </p>
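      <p>The two definitions above translate directly into code. The following sketch, with hypothetical helper names, counts rework executions in a trace and identifies the optional activities of a log; it illustrates the definitions and is not taken from the library.</p>
      <preformat>
```python
def rework_count(trace):
    """Number of activity executions that repeat an earlier activity in the case."""
    seen = set()
    repeats = 0
    for activity in trace:
        if activity in seen:
            repeats += 1
        seen.add(activity)
    return repeats


def optional_activities(traces):
    """Activities that are missing from at least one case (needs a non-empty log)."""
    all_activities = set().union(*traces)
    in_every_case = set.intersection(*(set(trace) for trace in traces))
    return all_activities - in_every_case


traces = [
    ["A", "B", "B", "C"],
    ["A", "C"],
    ["A", "B", "A", "B", "C"],
]
rework = [rework_count(trace) for trace in traces]
optional = optional_activities(traces)
```
      </preformat>
      <p>In this example, B is optional because the second case skips it, and the third case contains two rework executions (the second A and the second B).</p>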
      <p>Compatibility with the PM4Py library was taken into consideration so that event logs loaded
and filtered using PM4Py can be visualized using MPVIS. This allows incorporating the
functionalities of MPVIS into existing PM4Py workflows to enhance process model visualization.</p>
      <p>MPVIS also implements grouping and pruning functions that reduce the complexity of the
resulting visualizations. The activity grouping functions help deal with parallelism and shorten
a DRT’s non-bifurcating paths, while the pruning functions limit a visualization to a given
number of execution variants or to a given DRT depth.</p>
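      <p>The two tree simplifications mentioned here, merging non-bifurcating paths and limiting depth, can be sketched on a nested dictionary. The representation and function names are assumptions for illustration, and the sketch ignores cases that end partway along a chain, which a real implementation would need to preserve.</p>
      <preformat>
```python
def collapse_chains(tree):
    """Merge chains of single-child nodes into one node labelled 'A, B, C'.

    `tree` is a nested dict mapping an activity label to its subtree.
    Simplified: it does not track cases that end in the middle of a chain.
    """
    collapsed = {}
    for label, subtree in tree.items():
        # Follow the path for as long as it does not bifurcate.
        while len(subtree) == 1:
            (child_label, child_subtree), = subtree.items()
            label = f"{label}, {child_label}"
            subtree = child_subtree
        collapsed[label] = collapse_chains(subtree)
    return collapsed


def prune_depth(tree, max_depth):
    """Drop every node deeper than `max_depth` (children of the root are depth 1)."""
    if max_depth == 0:
        return {}
    return {label: prune_depth(sub, max_depth - 1) for label, sub in tree.items()}


tree = {"A": {"B": {"C": {"D": {}, "E": {"F": {}}}}}}
grouped = collapse_chains(tree)
shallow = prune_depth(tree, 2)
```
      </preformat>
      <p>Collapsing turns the six-node tree into three nodes (A, B, C followed by D and by E, F), while pruning to depth 2 keeps only A and B.</p>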
    </sec>
    <sec id="sec-6">
      <title>5. Library Maturity &amp; Limitations</title>
      <p>The library is currently in a stable version and no major bugs are known. Its functionalities
have been tested by applying it to several synthetic and publicly available real-life
event logs. The resulting process models can be found at https://bit.ly/mpvis-examples, and a
table summarizing the considered real-life event logs and statistics related to their discovery
and visualization times can be found at https://bit.ly/mpvis-statistics.</p>
      <p>The maturity of the library can also be discussed in terms of its current limitations. An
inherent limitation of DFGs and DRTs is that they have difficulty representing parallelism.
Another inherent limitation of process models is that they become hard to read for processes
with many variants. The grouping and pruning functions of MPVIS alleviate
these limitations by reducing the resulting process model’s complexity or by grouping activities
that might occur in parallel. Future work may extend the multi-perspective
visualization of processes to notations that more adequately support complex behavior, such as
BPMN. Another limitation is that only a fixed set of process perspectives and performance dimensions
is supported for both the MP-DFG and the MD-DRT. Future versions of the library will consider the
definition of custom perspectives through user-selected aggregation metrics and color ranges.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>This work presents a library for visualizing multiple process perspectives and performance
dimensions in a single process model. Specifically, it provides an MP-DFG implementation where
nodes and arcs are decorated with multiple colors to denote the control-flow perspective and
the time and cost dimensions of a process, and an MD-DRT implementation where states and
transitions are decorated with multiple colors to denote process performance for the time,
cost, quality, and flexibility dimensions. These multidimensional visualizations allow analyzing
process behavior while considering multiple perspectives in conjunction, without requiring the
generation of a separate process model for every perspective or performance dimension. This facilitates
the side-by-side comparison of perspectives for identifying trade-offs and other multidimensional insights.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work was funded by the Agencia Nacional de Investigación y Desarrollo de Chile [grant
numbers ANID FONDECYT 1230697, and ANID-Subdirección de Capital Humano/Doctorado
Nacional/2021-21210022].</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT for grammar and spelling
checks, improving writing style, and paraphrasing and rewording. After its use, the authors reviewed
and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W.</given-names>
            <surname>Van der Aalst</surname>
          </string-name>
          ,
          <article-title>Process mining: data science in action</article-title>
          , Springer,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
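          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Poppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Wynn</surname>
          </string-name>
          ,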
          <article-title>Directly follows-based process mining: Exploration &amp; a case study</article-title>
          ,
          <source>in: 2019 International Conference on Process Mining</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <article-title>Remaining time prediction for business processes with concurrency based on log representation</article-title>
          ,
          <source>China Communications</source>
          <volume>18</volume>
          (
          <year>2021</year>
          )
          <fpage>76</fpage>
          -
          <lpage>91</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wynn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Z.</given-names>
            <surname>Low</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>ter Hofstede</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Nauta</surname>
          </string-name>
          ,
          <article-title>A framework for cost-aware process management: cost reporting and cost prediction</article-title>
          ,
          <source>Journal of Universal Computer Science</source>
          <volume>20</volume>
          (
          <year>2014</year>
          )
          <fpage>406</fpage>
          -
          <lpage>430</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Reijers</surname>
          </string-name>
          ,
          <article-title>Fundamentals of business process management</article-title>
          , Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Janssenswillen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Depaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Swennen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vanhoof</surname>
          </string-name>
          ,
          <article-title>bupaR: Enabling reproducible business process analysis</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>163</volume>
          (
          <year>2019</year>
          )
          <fpage>927</fpage>
          -
          <lpage>930</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ullrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Geyer-Klingeberg</surname>
          </string-name>
          ,
          <article-title>Celonis studio-a low-code development platform for citizen developers</article-title>
          ,
          <source>in: BPM (PhD/Demos)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>102</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Velásquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sepúlveda</surname>
          </string-name>
          ,
          <article-title>A tool for visualizing costs of process variants through directed rooted trees</article-title>
          ,
          <source>in: BPM (Demos/Resources Forum)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>72</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bertrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De Weerdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Serral</surname>
          </string-name>
          ,
          <article-title>A novel multi-perspective trace clustering technique for iot-enhanced processes: A case study in smart manufacturing</article-title>
          ,
          <source>in: International Conference on Business Process Management</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>395</fpage>
          -
          <lpage>412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>Stage-aware business process mining</article-title>
          ,
          <source>Thesis</source>
          , Queensland University of Technology,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Maggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>García-Bañuelos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montali</surname>
          </string-name>
          ,
          <article-title>Discovering data-aware declarative process models from event logs</article-title>
          ,
          <source>in: Business Process Management: 11th International Conference</source>
          , Beijing, China. Proceedings, Springer,
          <year>2013</year>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Extracting multiple viewpoint models from relational databases</article-title>
          ,
          <source>in: International Symposium on Data-Driven Process Discovery and Analysis</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Shraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schumacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senderovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Weidlich</surname>
          </string-name>
          ,
          <article-title>Process discovery with context-aware process trees</article-title>
          ,
          <source>Information Systems</source>
          <volume>106</volume>
          (
          <year>2022</year>
          )
          <fpage>101533</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>van Zelst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <article-title>Pm4py: a process mining library for python</article-title>
          ,
          <source>Software Impacts</source>
          <volume>17</volume>
          (
          <year>2023</year>
          )
          <fpage>100556</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Reijers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Dijkman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>García-Bañuelos</surname>
          </string-name>
          ,
          <article-title>Apromore: An advanced process model repository</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>38</volume>
          (
          <year>2011</year>
          )
          <fpage>7029</fpage>
          -
          <lpage>7040</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>