<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>BPMN-Redrawer: From Images to BPMN Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alessandro Antinori</string-name>
          <email>alessandro.antinori@unicam.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Coltrinari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flavio Corradini</string-name>
          <email>flavio.corradini@unicam.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabrizio Fornari</string-name>
          <email>fabrizio.fornari@unicam.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Barbara Re</string-name>
          <email>barbara.re@unicam.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Scarpetta</string-name>
          <email>marco.scarpetta@unicam.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Camerino, School of Science and Technology, Computer Science Department</institution>
          ,
          <addr-line>Via Madonna delle Carceri, 7, Camerino, Italy</addr-line>
        </aff>
      </contrib-group>
      <fpage>107</fpage>
      <lpage>111</lpage>
      <abstract>
        <p>BPMN models are often used by researchers to illustrate and validate new approaches that operate over such models. However, models are not always distributed in their source format; frequently they are distributed as images within a document (e.g., a scientific contribution). To actually reuse those models, one has to manually redraw them, which is a time-consuming and error-prone activity. In this work, we present BPMN-Redrawer, a tool that uses machine learning techniques to support the redrawing of BPMN models from images to actual models in .bpmn format. The tool is made available open source and is open to contributions from the community.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Images</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>BPMN</kwd>
        <kwd>Process Model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>[Fig. 1 overview of the redrawing pipeline: Upload Model Image (.png); Detect BPMN Node Elements; Detect Connecting Objects; Detect Labels; Connect BPMN Nodes; Assign Labels to BPMN Nodes and Connecting Objects; Generate the BPMN Model; Adjust the BPMN Model; Download the BPMN Model (.bpmn).]</p>
      <p>In this paper we present BPMN-Redrawer, an approach, and the corresponding open source
tool, to support the activity of “redrawing” BPMN models from images. BPMN-Redrawer
comes in the form of a web application that provides a user interface to request the automatic
redrawing of a BPMN model. The conversion of images into actual BPMN models can be seen
as an object detection problem in which the to-be-detected objects are BPMN elements. Our
approach makes use of supervised machine learning algorithms and tools to train
models capable of detecting BPMN elements. This approach could foster the reuse of BPMN
models reported within documents, converting them from images to .bpmn files that can be
used for conducting further activities.</p>
      <p>
        Recently, a tool that shares our objective of speeding up the redrawing of BPMN models
has been developed [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. That tool focuses on the digitalization of hand-drawn BPMN models.
Although its source code is not made available, which hinders the possibility for the community
to contribute to its improvement, the tool achieves great results over models manually designed
by university students. In our case we focus instead on images of BPMN models originally
designed with BPMN editors. We believe that in a realistic scenario no one would choose to
manually design a BPMN model when there are more than 70 BPMN modeling tools2 that could
be used. In addition, we open BPMN-Redrawer to the research community so that anyone can
access it, take inspiration from it, and apply changes and enhancements.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. BPMN-Redrawer Main Functionalities</title>
      <p>The functionalities of BPMN-Redrawer are made available through a web application that allows
users to upload .png images and request their conversion into actual BPMN models stored in
.bpmn format. We refer to such a process with the term “model redrawing”. A schematization
of such a process, together with the technology involved, is reported in Fig. 1. The process of
redrawing a BPMN model can be divided into three main phases: BPMN element detection,
BPMN element linking, and BPMN model generation. In the following we report a detailed
description of each phase and step.</p>
      <sec id="sec-2-1">
        <p>2 BPMN Modeling Tools: https://bpmnmatrix.github.io/</p>
        <p>
          Detection Phase. This is the main phase of the approach, and it starts after a user uploads an
image. It is composed of three steps: BPMN node detection, BPMN connecting object detection,
and BPMN label detection. For detecting BPMN nodes and connecting objects we use the
well-established framework Detectron23, which provides state-of-the-art detection algorithms
with already trained baselines. BPMN nodes can be detected by means of their bounding boxes.
To do so, we fine-tuned a pre-trained Faster Region-Based Convolutional Neural Network
(Faster R-CNN) using Stochastic Gradient Descent with a batch size of 4 and a decreasing learning
rate starting from 0.0025. As a Convolutional Neural Network (CNN) backbone, we chose
ResNet-50 with a Feature Pyramid Network (FPN). The detection of a BPMN connecting object
implies not only the discovery of its bounding box but also the prediction of two keypoints:
one for the head and one for the tail. For this reason, we fine-tuned a pre-trained Keypoint
R-CNN network using Stochastic Gradient Descent with a batch size of 4 and a learning rate
fixed at 0.00025. In this case, as a backbone, we use ResNet-101 with FPN. For detecting BPMN
labels, we used the Optical Character Recognition (OCR) engine Tesseract [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which analyses
the image from left to right and from top to bottom. BPMN elements and labels are stored in
dedicated data structures that are the input for the next steps.
        </p>
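        <p>As an illustration, the node-detection training setup described above can be sketched with Detectron2's configuration API. This is a hedged configuration sketch, not the tool's actual code: the dataset name is hypothetical, and only the hyperparameters stated in the text (Faster R-CNN with a ResNet-50 FPN backbone, batch size 4, base learning rate 0.0025, 32 node classes) come from the paper.</p>

```python
# Configuration sketch of the fine-tuning described above (hedged: the
# dataset name "bpmn_redrawer_train" is hypothetical, not from the paper).
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# Pre-trained Faster R-CNN with a ResNet-50 + FPN backbone, as in the text.
cfg.merge_from_file(
    model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("bpmn_redrawer_train",)  # hypothetical dataset name
cfg.SOLVER.IMS_PER_BATCH = 4          # batch size of 4
cfg.SOLVER.BASE_LR = 0.0025           # decreasing learning rate from 0.0025
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 32  # the thirty-two BPMN node classes
```

        <p>The Keypoint R-CNN used for connecting objects would be configured analogously, starting from a keypoint-detection baseline with a ResNet-101 FPN backbone and a learning rate fixed at 0.00025.</p>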
        <p>Linking Phase. This is the second phase of the redrawing process and it comprises two
steps: Connect BPMN Nodes and Assign Labels to BPMN Elements. For such a connection we
rely on the Euclidean distance between elements: the target and the source of a connecting
object are associated with the elements that are respectively closest to its head and tail, while
each label is assigned to the closest element.</p>
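        <p>The nearest-element assignment can be sketched as follows. This is a minimal illustration, not the tool's actual code; the element and keypoint data structures shown here are assumptions.</p>

```python
import math

def center(box):
    """Center point of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def nearest(point, nodes):
    """Node whose bounding-box center is closest (Euclidean) to point."""
    return min(nodes, key=lambda n: math.dist(point, center(n["box"])))

def link(flow, nodes):
    """Associate a connecting object's tail/head keypoints with its
    source/target nodes, as in the Linking Phase above."""
    return {"source": nearest(flow["tail"], nodes)["id"],
            "target": nearest(flow["head"], nodes)["id"]}

# Hypothetical detection output: two nodes and one sequence-flow arrow.
nodes = [{"id": "StartEvent_1", "box": (0, 0, 36, 36)},
         {"id": "Task_1", "box": (100, 0, 200, 80)}]
flow = {"tail": (40, 18), "head": (95, 40)}  # arrow tail and head keypoints
link(flow, nodes)  # {'source': 'StartEvent_1', 'target': 'Task_1'}
```

        <p>Labels would be assigned with the same <code>nearest</code> helper, using the center of the label's bounding box as the query point.</p>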
        <p>BPMN Model Generation Phase. This is the third and last phase of the approach. It is composed
of two steps: BPMN model generation and BPMN model adjustment. The first step consists in
parsing the results of the previous phases into a BPMN model (in .bpmn format). The resulting
file is obtained by populating a BPMN template that we created with the Jinja templating
engine.4 Once the BPMN model is obtained, the user can either download it or adjust it using
the integrated bpmn-js editor. This operation is facilitated by the display of the original image
beside the editor, as reported in Fig. 2.</p>
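        <p>The actual tool populates a Jinja template; purely as an illustration, the following sketch uses Python's standard-library ElementTree to serialise detected elements and links into a minimal .bpmn document (the diagram-interchange layout is omitted, and the element dictionaries are assumptions).</p>

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

def to_bpmn(elements, flows):
    """Serialise detected elements into a minimal .bpmn document.
    Sketch only: the tool itself fills a Jinja template, and a complete
    file would also carry the BPMN DI (layout) section."""
    ET.register_namespace("", BPMN_NS)
    defs = ET.Element(f"{{{BPMN_NS}}}definitions")
    proc = ET.SubElement(defs, f"{{{BPMN_NS}}}process", id="Process_1")
    for el in elements:  # el: {"id": ..., "type": ..., "name": ...}
        ET.SubElement(proc, f"{{{BPMN_NS}}}{el['type']}",
                      id=el["id"], name=el.get("name", ""))
    for fl in flows:     # fl: output of the Linking Phase
        ET.SubElement(proc, f"{{{BPMN_NS}}}sequenceFlow", id=fl["id"],
                      sourceRef=fl["source"], targetRef=fl["target"])
    return ET.tostring(defs, encoding="unicode")

bpmn_xml = to_bpmn(
    [{"id": "StartEvent_1", "type": "startEvent"},
     {"id": "Task_1", "type": "task", "name": "Check order"}],
    [{"id": "Flow_1", "source": "StartEvent_1", "target": "Task_1"}])
```
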
      </sec>
      <sec id="sec-2-2">
        <p>3 Detectron2: https://github.com/facebookresearch/detectron2</p>
        <p>4 Jinja: https://jinja.palletsprojects.com/en/3.1.x/</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Maturity of the Tool</title>
      <p>
        BPMN-Redrawer is able to recognise up to thirty-two BPMN nodes and three BPMN connecting
objects (i.e., Sequence Flow, Message Flow, Data Association). All the elements are reported in
Fig. 3 together with their name and the average precision (AP) with which BPMN-Redrawer
recognises them. For conducting the training activities we started from a dataset of 663 BPMN
model images derived from models stored on the RePROSitory platform.5 However, considering
that the usage of BPMN elements varies between models [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], not all the BPMN elements
were present in a sufficient number in such a dataset. Therefore, we extended the dataset with
165 additional images of models designed ad hoc to increase the number of instances
of those BPMN elements. The resulting dataset is composed of 828 BPMN model images; the
images and the dataset in COCO format are available online.6 With a larger dataset we could
reach a higher precision and also enable the redrawing of elements that have not been
considered in this version of the tool, such as lanes and complex gateways.
      </p>
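      <p>Since the dataset is published in COCO format, under-represented elements can be spotted by counting annotation instances per category. The following sketch assumes only the standard COCO structure (the <code>categories</code> and <code>annotations</code> arrays); the example data is hypothetical, not taken from the actual dataset.</p>

```python
from collections import Counter

def category_counts(coco):
    """Count annotation instances per category in a COCO-format dataset dict
    (e.g., loaded with json.load from the published annotation file)."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco["annotations"])

# Tiny in-memory example following the COCO annotation structure:
coco = {"categories": [{"id": 1, "name": "task"},
                       {"id": 2, "name": "startEvent"}],
        "annotations": [{"category_id": 1}, {"category_id": 1},
                        {"category_id": 2}]}
category_counts(coco)  # Counter({'task': 2, 'startEvent': 1})
```
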
      <p>
        Some BPMN elements may present graphical markers (e.g., send/receive tasks, manual tasks,
script tasks, etc.). In the current version of the tool, elements are redrawn without markers, leaving
the user the possibility to adjust the model by means of the available editor and original
image (see Fig. 2). Some BPMN models may also report customised elements due to BPMN
extensions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; for those elements to be recognised, they must be included in the training dataset.
      </p>
      <p>Different BPMN editors can have different ways of graphically representing the same BPMN
element (e.g., adding a coloured background, using a different size for the element, etc.). In this
first version of BPMN-Redrawer we started from images of BPMN models designed with the
bpmn-js editor; therefore, more accurate results are obtained when we ask BPMN-Redrawer to redraw
a model originally designed with bpmn-js. We plan to improve our tool by extending the dataset
we used for the training activities, also including images of BPMN models designed with
other editors such as Signavio, Eclipse BPMN Modeler, etc.</p>
      <p>5 Models Collection used: https://pros.unicam.it:4200/guest/collection/bpmn_redrawer</p>
      <p>6 BPMN-Redrawer Dataset: https://huggingface.co/PROSLab/BPMN-Redrawer-Dataset</p>
      <p>Since BPMN-Redrawer operates starting from images of BPMN models, the quality of the
image affects the capability of properly recognising and redrawing the elements. We are working
to define possible indicators of the image quality that can impact the redrawing of the model.
As future work we also plan to add functionalities to the platform so as to provide,
together with the redrawn model, information about the quality of the obtained result,
reporting whether the model is a valid BPMN model and how many elements have
been redrawn. In addition, we plan to improve the results of the connecting object and label
recognition steps by investigating additional techniques.</p>
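      <p>Such a result-quality report could, for instance, check that the generated file is well-formed XML and count the redrawn flow elements. The following is a hypothetical sketch of this planned functionality, not the tool's validator; full BPMN validity would additionally require checking against the BPMN 2.0 XML schema.</p>

```python
import xml.etree.ElementTree as ET

def redraw_report(bpmn_xml):
    """Hypothetical quality report for a generated .bpmn file: is it
    well-formed XML, and how many flow elements were redrawn?"""
    try:
        root = ET.fromstring(bpmn_xml)
    except ET.ParseError:
        return {"well_formed": False, "elements": 0}
    ns = "{http://www.omg.org/spec/BPMN/20100524/MODEL}"
    process = root.find(f"{ns}process")
    count = 0 if process is None else len(list(process))
    return {"well_formed": True, "elements": count}

sample = ('<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL">'
          '<process id="P1"><startEvent id="S1"/><task id="T1"/></process>'
          '</definitions>')
redraw_report(sample)  # {'well_formed': True, 'elements': 2}
```
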
      <p>BPMN-Redrawer is open source; this enables anyone to access the code, apply changes, train
new machine learning models starting from different datasets of BPMN images, and easily
deploy them. The same approach can be used to automate the redrawing of other types of
models, such as Petri Nets, Event-driven Process Chains, UML Diagrams, etc.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Screencast and Website</title>
      <p>The BPMN-Redrawer tool is accessible at http://pros.unicam.it/bpmn-redrawer-tool. The screencast
available at https://youtu.be/0e2qnbSp9XY shows a typical user experience. The source code is
available at https://github.com/PROSLab/BPMN-Redrawer. A Docker image to easily deploy the
tool is also made available at https://hub.docker.com/repository/docker/proslab/bpmn-redrawer.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I.</given-names>
            <surname>Compagnucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fornari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Re</surname>
          </string-name>
          ,
          <article-title>Trends on the Usage of BPMN 2.0 from Publicly Available Repositories</article-title>
          ,
          <source>in: BIR</source>
          <year>2021</year>
          , Vienna, Austria,
          <source>September 22-24</source>
          ,
          <year>2021</year>
          , Proceedings, volume
          <volume>430</volume>
          <source>of LNBIP</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fornari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gnesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. O.</given-names>
            <surname>Spagnolo</surname>
          </string-name>
          ,
          <article-title>A Guidelines framework for understandable BPMN models, Data Knowl</article-title>
          .
          <source>Eng</source>
          .
          <volume>113</volume>
          (
          <year>2018</year>
          )
          <fpage>129</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fornari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tiezzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vandin</surname>
          </string-name>
          ,
          <article-title>A formal approach for the analysis of BPMN collaboration models</article-title>
          ,
          <source>J. Syst. Softw</source>
          .
          <volume>180</volume>
          (
          <year>2021</year>
          )
          <fpage>111007</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fornari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tiezzi</surname>
          </string-name>
          ,
          <article-title>RePROSitory: a Repository Platform for Sharing Business PROcess modelS</article-title>
          , volume
          <volume>2420</volume>
          <source>of CEUR Workshop Proceedings</source>
          , CEURWS.org,
          <year>2019</year>
          , pp.
          <fpage>149</fpage>
          -
          <lpage>153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          , H. van der Aa, H. Leopold, H. Stuckenschmidt,
          <article-title>Sketch2BPMN: Automatic Recognition of Hand-Drawn BPMN Models</article-title>
          , volume
          <volume>12751</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>344</fpage>
          -
          <lpage>360</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>An overview of the tesseract OCR engine</article-title>
          ,
          <source>in: 9th International Conference on Document Analysis and Recognition (ICDAR</source>
          <year>2007</year>
          ),
          <fpage>23</fpage>
          -
          <lpage>26</lpage>
          September, Curitiba, Paraná, Brazil, IEEE Computer Society,
          <year>2007</year>
          , pp.
          <fpage>629</fpage>
          -
          <lpage>633</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Compagnucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Corradini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fornari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tiezzi</surname>
          </string-name>
          ,
          <article-title>Modelling Notations for IoT-Aware Business Processes: A Systematic Literature Review</article-title>
          , in: BPM 2020 International Workshops
          , Seville, Spain,
          <source>September 13-18</source>
          ,
          <year>2020</year>
          , volume
          <volume>397</volume>
          <source>of LNBIP</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>108</fpage>
          -
          <lpage>121</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>