BPMN-Redrawer: From Images to BPMN Models

Alessandro Antinori1,†, Riccardo Coltrinari1,†, Flavio Corradini1,†, Fabrizio Fornari1,*,†, Barbara Re1,† and Marco Scarpetta1,†

1 University of Camerino, School of Science and Technology, Computer Science Department, Via Madonna delle Carceri 7, Camerino, Italy

Abstract
BPMN models are often used by researchers to illustrate and validate new approaches that operate over such models. However, models are not always distributed in their source format; frequently they are only available as images within a document (e.g., a scientific contribution). To actually reuse those models, one has to manually redraw them, and manually redrawing a BPMN model is a time-consuming and error-prone activity. In this work, we present BPMN-Redrawer, a tool that uses machine learning techniques to support the redrawing of BPMN models from images into actual models in .bpmn format. The tool is released as open source and is open to contributions from the community.

Keywords
Process Images, Machine Learning, BPMN, Process Model

1. Introduction

BPMN is the de facto standard for modeling business processes (see https://www.bpmn.org/), and models designed with this notation are often used to conduct research activities in the BPM field. Their usage ranges from the most common scenario, in which a BPMN model illustrates the proposal of a new approach, to more systematic and complex activities such as studies on modeling practices [1, 2] and the validation of techniques and tools related to the various phases of the BPM life cycle [3]. Those BPMN models are then reported, as images, in scientific works (e.g., BPM conference proceedings). However, models reported in scientific works are rarely made available in their source format (i.e., .bpmn). This forces those who would like to experiment with those specific models to "redraw" them from scratch; this is the procedure that the authors of [4] followed to harvest BPMN models from BPM conference proceedings. Redrawing a BPMN model from an image is a manual and error-prone activity that requires a considerable amount of time and effort.

Proceedings of the Demonstration Resources Track, Best BPM Dissertation Award, and Doctoral Consortium at BPM 2022, Münster, Germany, September 11-16, 2022
* Corresponding author.
† These authors contributed equally.
Email: alessandro.antinori@unicam.it (A. Antinori); riccardo.antinori@unicam.it (R. Coltrinari); flavio.corradini@unicam.it (F. Corradini); fabrizio.fornari@unicam.it (F. Fornari); barbara.re@unicam.it (B. Re); marco.scarpetta@unicam.it (M. Scarpetta)
ORCID: 0000-0003-3670-0415 (A. Antinori); 0000-0002-2137-6731 (R. Coltrinari); 0000-0001-6767-2184 (F. Corradini); 0000-0002-3620-1723 (F. Fornari); 0000-0001-5374-2364 (B. Re); 0000-0002-9659-7264 (M. Scarpetta)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Figure 1: BPMN-Redrawer Approach Schematization (Upload Image; Detect BPMN Nodes; Detect Connecting Objects; Detect Labels; Connect Nodes and Connecting Objects; Assign Labels to BPMN Elements; Generate the BPMN Model; Adjust the BPMN Model; Download the BPMN Model).

In this paper we present BPMN-Redrawer, an approach and the corresponding open source tool to support the activity of "redrawing" BPMN models from images.
BPMN-Redrawer comes in the form of a web application that provides a user interface to request the automatic redrawing of a BPMN model. The conversion of images into actual BPMN models can be seen as an object detection problem in which the objects to be detected are BPMN elements. Our approach uses supervised machine learning algorithms and tools to train models capable of detecting BPMN elements. This approach can foster the reuse of BPMN models reported within documents, converting them from images into .bpmn files that can be used for further activities.

Recently, a tool that shares our objective of speeding up the redrawing of BPMN models has been developed [5]. That tool focuses on the digitalization of hand-drawn BPMN models. Although its source code is not made available, which hinders the possibility for the community to contribute to its improvement, the tool provides very good results on models manually drawn by university students. We focus instead on images of BPMN models originally designed with BPMN editors: we believe that, in a realistic scenario, hardly anyone would choose to hand-draw a BPMN model when more than 70 BPMN modeling tools (see https://bpmnmatrix.github.io/) could be used. In addition, we open BPMN-Redrawer to the research community so that anyone can access it, take inspiration from it, and apply changes and enhancements.

2. BPMN-Redrawer Main Functionalities

The functionalities of BPMN-Redrawer are made available through a web application that allows users to upload .png images and request their conversion into actual BPMN models stored in .bpmn format. We refer to this process with the term "model redrawing". A schematization of the process, together with the technologies involved, is reported in Fig. 1. The redrawing of a BPMN model can be divided into three main phases: BPMN element detection, BPMN element linking, and BPMN model generation. In the following we report a detailed description of each phase and its steps.

Figure 2: BPMN-Redrawer User Interface After Model Redrawing.

Detection Phase. It is the main phase of the approach and starts after a user uploads an image. It is composed of three steps: BPMN node detection, BPMN connecting object detection, and BPMN label detection. For detecting BPMN nodes and connecting objects we use the well-established framework Detectron2 (https://github.com/facebookresearch/detectron2), which provides state-of-the-art detection algorithms with already trained baselines. BPMN nodes are detected by means of their bounding boxes. To do so, we fine-tuned a pre-trained Faster Region-Based Convolutional Neural Network (Faster R-CNN) using Stochastic Gradient Descent with a batch size of 4 and a decreasing learning rate starting from 0.0025; as Convolutional Neural Network (CNN) backbone, we chose ResNet-50 with a Feature Pyramid Network (FPN). The detection of a BPMN connecting object requires not only the discovery of its bounding box but also the prediction of two keypoints: one for the head and one for the tail. For this reason, we fine-tuned a pre-trained Keypoint R-CNN using Stochastic Gradient Descent with a batch size of 4 and a learning rate fixed at 0.00025; in this case, as backbone, we use ResNet-101 with FPN. For detecting BPMN labels, we use the Optical Character Recognition (OCR) engine Tesseract [6], which analyses the image from left to right and from top to bottom. The detected BPMN elements and labels are stored in dedicated data structures that are the input for the next steps.
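As an illustration, the following minimal sketch shows how a fine-tuning setup with the hyperparameters reported above can be expressed in Detectron2. It is not the tool's actual training script: the dataset name, number of classes, and decay milestones are hypothetical placeholders.

```python
# Minimal sketch: fine-tuning a Faster R-CNN (ResNet-50 FPN baseline) for BPMN
# node detection with Detectron2, using the hyperparameters reported in the text.
# The dataset name "bpmn_nodes_train" is hypothetical and must already be
# registered in Detectron2's DatasetCatalog; NUM_CLASSES, STEPS and MAX_ITER
# are illustrative values, not taken from the released code.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(
    model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")  # already trained baseline
cfg.DATASETS.TRAIN = ("bpmn_nodes_train",)          # hypothetical dataset name
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 32                # one class per recognised BPMN node type
cfg.SOLVER.IMS_PER_BATCH = 4                        # batch size of 4 (SGD is Detectron2's default solver)
cfg.SOLVER.BASE_LR = 0.0025                         # decreasing learning rate starting from 0.0025
cfg.SOLVER.STEPS = (3000, 4000)                     # illustrative decay milestones
cfg.SOLVER.MAX_ITER = 5000                          # illustrative training length

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```

The Keypoint R-CNN used for connecting objects can be configured analogously, e.g., starting from the "COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml" baseline with two keypoints per instance (head and tail) and the fixed learning rate of 0.00025 reported above.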
Linking Phase. It is the second phase of the redrawing process and comprises two steps: Connect BPMN Nodes and Assign Labels to BPMN Elements. Both steps rely on the Euclidean distance between elements: the target and the source of a connecting object are associated with the elements that are closest to its head and tail, respectively, while each label is assigned to the closest element.

BPMN Model Generation Phase. It is the third and last phase of the approach. It is composed of two steps: BPMN model generation and BPMN model adjustment. The first step consists in translating the results of the previous phases into a BPMN model (in .bpmn format). The resulting file is obtained by populating a BPMN template that we created with the Jinja templating engine (https://jinja.palletsprojects.com/en/3.1.x/). Once the BPMN model is obtained, the user can either download it or adjust it using the integrated bpmn-js editor. This operation is facilitated by the display of the original image beside the editor, as shown in Fig. 2.

Figure 3: BPMN elements recognized by BPMN-Redrawer and Average Precision (AP).

3. Maturity of the Tool

BPMN-Redrawer is able to recognise up to thirty-two BPMN nodes and three BPMN connecting objects (i.e., Sequence Flow, Message Flow, Data Association). All the elements are reported in Fig. 3 together with their names and the average precision (AP) with which BPMN-Redrawer recognises them.

For the training activities we started from a dataset of 663 images of BPMN models derived from models stored on the RePROSitory platform (https://pros.unicam.it:4200/guest/collection/bpmn_redrawer). However, considering that the usage of BPMN elements varies between models [1], not all BPMN elements were present in sufficient numbers in such a dataset. Therefore, we extended the dataset with 165 additional images of models designed ad hoc to increase the number of instances of those BPMN elements. The resulting dataset is composed of 828 images of BPMN models; the images and the dataset in COCO format are available online (https://huggingface.co/PROSLab/BPMN-Redrawer-Dataset). With a larger dataset we could reach higher precision and also enable the redrawing of elements that have not been considered in this version of the tool, such as lanes and the complex gateway.

Some BPMN elements may present graphical markers (e.g., send/receive tasks, manual tasks, script tasks). In the current version of the tool, such elements are redrawn without markers, leaving to the user the possibility to adjust the model by means of the available editor and the original image (see Fig. 2). Some BPMN models may also include customised elements due to BPMN extensions [7]; to be recognised, those elements must be included in the training dataset.

Different BPMN editors can represent the same BPMN element in different graphical ways (e.g., with a coloured background or a different element size). In this first version of BPMN-Redrawer we started from images of BPMN models designed with the bpmn-js editor; therefore, more accurate results are obtained when BPMN-Redrawer is asked to redraw a model originally designed with bpmn-js. We plan to improve our tool by extending the dataset used for the training activities with images of BPMN models designed with other editors, such as Signavio and the Eclipse BPMN Modeler.
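For readers who want to retrain the detectors on such an extended dataset, the following hedged sketch shows how a COCO-format dataset like the one published above could be registered with Detectron2 before fine-tuning. The dataset names and file paths are hypothetical placeholders, not taken from the released code.

```python
# Minimal sketch: registering a COCO-format BPMN image dataset with Detectron2
# so that it can be referenced by a training configuration such as the one above.
# Dataset names ("bpmn_train", "bpmn_val") and paths are illustrative only.
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog

register_coco_instances(
    "bpmn_train", {},                        # extra metadata (empty here)
    "datasets/bpmn/annotations_train.json",  # COCO annotation file
    "datasets/bpmn/images_train")            # folder with the .png images
register_coco_instances(
    "bpmn_val", {},
    "datasets/bpmn/annotations_val.json",
    "datasets/bpmn/images_val")

# The registered names can then be used in the training configuration,
# e.g. cfg.DATASETS.TRAIN = ("bpmn_train",); the element classes and the
# number of training images can be inspected as follows:
print(MetadataCatalog.get("bpmn_train"))
print(len(DatasetCatalog.get("bpmn_train")), "training images")
```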
Since BPMN-Redrawer operates on images of BPMN models, the quality of the image affects its capability to properly recognise and redraw the elements. We are working to define possible indicators of the image quality that can impact the redrawing of the model. As future work, we also plan to add functionalities to the platform so that, together with the redrawn model, it provides information about the quality of the obtained result, reporting whether the model is a valid BPMN model and how many elements have been redrawn. In addition, we plan to improve the results of the connecting object and label recognition steps by investigating additional techniques.

BPMN-Redrawer is open source; this enables anyone to access the code, apply changes, train new machine learning models starting from different datasets of BPMN images, and easily deploy them. The same approach can be used to automate the redrawing of other types of models, such as Petri nets, Event-driven Process Chains, and UML diagrams.

4. Screencast and Website

The BPMN-Redrawer tool is accessible at http://pros.unicam.it/bpmn-redrawer-tool. The screencast available at https://youtu.be/0e2qnbSp9XY shows a typical user experience. The source code is available at https://github.com/PROSLab/BPMN-Redrawer. A Docker image to easily deploy the tool is also available at https://hub.docker.com/repository/docker/proslab/bpmn-redrawer.

References

[1] I. Compagnucci, F. Corradini, F. Fornari, B. Re, Trends on the Usage of BPMN 2.0 from Publicly Available Repositories, in: BIR 2021, Vienna, Austria, September 22-24, 2021, Proceedings, volume 430 of LNBIP, Springer, 2021, pp. 84–99.
[2] F. Corradini, A. Ferrari, F. Fornari, S. Gnesi, A. Polini, B. Re, G. O. Spagnolo, A Guidelines Framework for Understandable BPMN Models, Data Knowl. Eng. 113 (2018) 129–154.
[3] F. Corradini, F. Fornari, A. Polini, B. Re, F. Tiezzi, A. Vandin, A Formal Approach for the Analysis of BPMN Collaboration Models, J. Syst. Softw. 180 (2021) 111007.
[4] F. Corradini, F. Fornari, A. Polini, B. Re, F. Tiezzi, RePROSitory: a Repository Platform for Sharing Business PROcess modelS, volume 2420 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 149–153.
[5] B. Schäfer, H. van der Aa, H. Leopold, H. Stuckenschmidt, Sketch2BPMN: Automatic Recognition of Hand-Drawn BPMN Models, volume 12751 of LNCS, Springer, 2021, pp. 344–360.
[6] R. Smith, An Overview of the Tesseract OCR Engine, in: 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 23-26 September, Curitiba, Paraná, Brazil, IEEE Computer Society, 2007, pp. 629–633.
[7] I. Compagnucci, F. Corradini, F. Fornari, A. Polini, B. Re, F. Tiezzi, Modelling Notations for IoT-Aware Business Processes: A Systematic Literature Review, in: BPM 2020 International Workshops, Seville, Spain, September 13-18, 2020, volume 397 of LNBIP, Springer, 2020, pp. 108–121.