Model Reader Preferences for Semantically Duplicate
Elements in BPMN
Daniel Lübke1,2,∗, Volker Stiehl3
1 Digital Solution Architecture GmbH, Hannover, Germany
2 Leibniz Universität Hannover, FG Software Engineering, Hannover, Germany
3 TH Ingolstadt, Ingolstadt, Germany


Abstract
BPMN, which is the underlying modeling notation of many BPM endeavours and business information
system development projects, is a rich modeling language, which also offers redundant constructs, i.e.,
different syntax can express the same semantics. We want to investigate which syntactical constructs
are preferred by model readers if different ways to model message exchanges are offered by BPMN.
In an empirical study we asked 77 participants which BPMN model they prefer for expressing eight
situations. We found that send tasks and intermediate message catch events are significantly preferred.
Also, event-based gateways are preferred over boundary events for many variants of the Deferred Choice
pattern.

Keywords
BPMN, Empirical Study, Gateway, Boundary Event, Message, Subjective Preference, Event-based Gateway


1. Motivation
BPMN is the standard for modeling business processes. Nowadays, business-critical applications
based on BPMN and modern architectures [1, 2] are developed to digitize important business
processes. Consequently, BPMN is used for communication between a variety of stakeholders,
e.g., developers and business analysts, and thus understandability is very important.
While BPMN offers a wide set of modeling options for expressing many process details, it
contains redundant constructs. For example, time-outs for message arrival can be modeled in
different ways, as explained in this paper. Ambiguity about how to model a certain situation
invites confusion and misunderstandings. Consequently, clarifying the usage of redundant
syntax could standardize the current use of BPMN, streamline future versions of BPMN, and
thus make the notation easier to learn and understand. This paper presents a first step in this
direction by investigating the subjective preferences for a) modeling message-based
communication and b) representations of the Deferred Choice workflow pattern [3] when messages
are involved. This paper is structured as follows. In the next section we present related
work. In Section 3 the design of our empirical study is presented. The results are presented in
Section 4, and an interpretation of them is given in Section 5. Finally, we conclude and give an
outlook.

ZEUS’2023: 15th Central European Workshop on Services and their Composition, February 16–17, 2023, Hannover,
Germany
∗ Corresponding author.
daniel.luebke@digital-solution-architecture.com (D. Lübke); volker.stehl@thi.de (V. Stiehl)
https://www.digital-solution-architecture.com (D. Lübke)
ORCID: 0000-0002-1557-8804 (D. Lübke)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

S. Böhm and D. Lübke (Eds.): 15th ZEUS Workshop, ZEUS 2023, Hannover, Germany,
16–17 February 2023, published at http://ceur-ws.org
Lübke and Stiehl: Model Reader Preferences for Semantically Duplicate Elements in BPMN


2. Related Work
Quality of business process models is multi-faceted. Lindland et al. [4] specified a framework
that can be used to categorize different quality aspects of models, in which they distinguish
between syntactic, semantic, and pragmatic qualities. This paper is concerned with the subjective
preference for certain model constructs. Because “[i]n general, researchers associate aesthetics
with readability, and readability with understanding” [5], subjective preference is a part of
understandability and thus a pragmatic quality. Or as Lindland et al. put it: Understandability is
the main concern of pragmatic model quality, which “affects how to choose from among the
many ways to express a single meaning” [4]. Comprehension of BPMN models is a vast research
area: For example, there are studies concerning the influence of layout on understandability.
Figl provides a good overview [6]. Scholz & Lübke [7] investigated subjective layout preferences
and used the same research design as we do: Using a quiz-like study in which participants
choose one of the presented options, they analyzed subjective preferences for different
BPMN layout choices. Moody [8] has critiqued BPMN in general for failing to adhere to his
“Physics of Notations” [9], especially because BPMN has considerable semantic redundancy, e.g.,
the Exclusive OR Gateway has two visual representations. Genon et al. [10] found the same. The
eCH-0158 modeling guidelines for BPMN [11] recognize the redundancy between send/receive
tasks and message catching/throwing events. They standardize on send tasks and message catch
events.


3. Study Design
3.1. Goals, Hypothesis & Variables
Following the Goal-Question-Metric (GQM) approach [12], we define our goal as

                             Understand the Subjective Preference
                 with regard to Semantically Equivalent Elements in BPMN 2.0
                            from the viewpoint of a Model Reader.
  This goal is refined into (research) questions. While BPMN has many redundancies, we
concentrate on the ones below. We want to determine which construct of each of the following
pairs of semantically equivalent BPMN constructs is preferred:

RQ1 Send Task vs. Intermediate Message Throw Event: BPMN oﬀers two elements for
    sending messages: The send task and the intermediate message throw event both send a
    message.

RQ2 Receive Task vs. Intermediate Message Catch Event: Similarly to sending a message,
    BPMN also offers a receive task and an intermediate message catch event for receiving
    a message.

RQ3 Send Task vs. End Message Throw Event: For modeling the sending of a message at
    the end of a process execution, a send task followed by a none end event can be used.
    Alternatively, a message throw end event can be used.

RQ4 Deferred Choice between two messages (diff. prob.): A Deferred Choice [3] between
    two incoming messages can be modeled via an event-based gateway or a receive
    task with an interrupting message boundary event. Because one participant in [7] indicated
    that they would model splits and joins differently depending on the probability of
    the branch taken, we differentiate between the probabilities of events. This question is
    concerned with messages that have different probabilities, i.e., the message caught by the
    top event after the event-based gateway, or by the receive task, is more likely to occur
    than the more exceptional message caught by the bottom event after the event-based
    gateway, or by the boundary event.

RQ5 Deferred Choice between two messages (same prob.): This question is similar to
    RQ4. However, the incoming messages have the same probability, i.e., both events
    following the event-based gateway and both messages occur equally often.

RQ6 Deferred Choice between message and timer (diﬀ. prob.): This question is similar
    to RQ4 but this time the Deferred Choice is not between two messages but instead
    resembles a deadline situation with a message event and a timer event. It is more probable
    to receive the message than to time-out. This pattern is presented as an event-based
    gateway with two following events or with a receive task with an interrupting timer
    boundary event.

RQ7 Deferred Choice between message and timer (same prob.): This question is simi-
    lar to RQ6. However, the incoming message and the time-out have the same probability.

RQ8 Deferred Choice between two messages and a timer: The last question is con-
    cerned with a Deferred Choice between two messages and a timer, i.e., a scenario in
    which one of two messages must be received within a certain time. This can – again – be
    modeled as an event-based gateway followed by two message events and one timer event,
    or by a receive task with two boundary events.

3.1.1. Measurements & Hypothesis
We measure the subjective preferences of study participants as the only metric for all research
questions. For all research questions the null hypothesis H0 is that there is no preference for
one of the two alternatives. Accordingly, H1 is that one of the two alternatives is preferred.

3.2. Objects
The study setup is similar to a previous study by Scholz & Lübke [7]: Participants take part
in an online survey in which two diagrams modeling the same process are shown, which differ
in only one point. In this study, different but semantically equivalent BPMN diagrams were
used as shown in Appendix A. Both options were shown side by side and participants had to
choose the preferred one by clicking it. Descriptive text was shown to convey the probability
of some branches. Since branching probabilities cannot be modeled in BPMN directly, it was
necessary to convey this information textually.

Table 1
Description of the Participant Groups of our Study
           Group     Experience      Description                               Count
           LUH1        Students      MSc./CS, Software Architecture Lecture        20
           LUH2        Students      BSc./CS, Software Engineering Seminar          3
           LUH3        Students      MSc./CS, Software Methodologies Lecture        4
           THI1        Students      BSc./IS, 4th semester                         11
           THI2        Students      BSc./IS, 6th semester                          6
           THI3        Students      MSc./IE, 2nd semester                          9
           THI4        Students      BSc./IE, 6th semester                         11
           Prof.     Professionals   recruited from different organizations        13
           Total                                                                   77

3.3. Participants
Participants were a) students recruited from lectures of the authors and b) professionals who
were asked to participate. We tracked the group to which a participant belongs by using different
invitation links. Participation was voluntary and no incentives were given. The number of
participants per group and a more detailed description are shown in Table 1. In total, we had
87 participants. After removing those who did not complete the quiz or changed their answers in
between, 77 participants remained.

3.4. Validity Procedure
As a first step we performed a power analysis: A two-sided hypothesis test with a significance
level of α = 0.05 and a power of 1 − β = 0.95 for a medium effect of h = 0.5 requires at least 52
participants. As described above, we recruited more participants than required. To eliminate
extraneous variables we took the following measures: We randomized the order in which questions
(i.e., diagram pairs) were shown. Thereby, we tried to eliminate learning and fatigue effects. We
also randomized the order in which the diagrams of each pair were shown.
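
The required sample size stated above can be reproduced with a short calculation. The following
Python sketch assumes the common normal-approximation formula for a two-sided one-sample
proportion test on Cohen's effect size h, with α = 0.05, the value 0.95 interpreted as statistical
power, and h = 0.5. The paper does not state which formula or tool was used, so the formula and
the function name required_sample_size are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(h: float, alpha: float = 0.05, power: float = 0.95) -> int:
    """Participants needed for a two-sided one-sample proportion test.

    Assumption: normal approximation on Cohen's h,
    n = ((z_(1 - alpha/2) + z_power) / h) ** 2, rounded up.
    """
    z = NormalDist().inv_cdf  # quantile function of the standard normal distribution
    return ceil(((z(1 - alpha / 2) + z(power)) / h) ** 2)


if __name__ == "__main__":
    # Medium effect (h = 0.5) as used in the study design.
    print(required_sample_size(0.5))
```

Under these assumptions the calculation yields 52, which matches the minimum number of
participants given in the text.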


4. Analysis
The statistical evaluation of the gathered data is shown in Table 2. The statistical significance
indicated by the p-values is marked by asterisks (*: p ≤ 0.05, **: p ≤ 0.01, ***: p ≤ 0.001).
Similarly, the effect size is denoted by pluses (+: h ≥ 0.2, ++: h ≥ 0.5, +++: h ≥ 0.8).




Table 2
Results and Hypothesis Test Results for all Questions
(#A, #B: number of participants preferring the first and the second option, respectively)
       Question                                                             #A   #B      p       Sig.    h     Effect
 Q1    Send Task vs. Message Throw Event                                    53   24    0.0013    **     0.39   +
 Q2    Receive Task vs. Message Catch Event                                 29   48    0.0395    *      0.25   +
 Q3    Send Task vs. Message End Event                                      39   38    1.0000           0.01
 Q4    Deferred Choice, 2 messages, diff. prob.: Gateway vs. Boundary Event 53   24    0.0013    **     0.39   +
 Q5    Deferred Choice, 2 messages, same prob.: Gateway vs. Boundary Event  69    8   <0.0001    ***    0.91   +++
 Q6    Deferred Choice, message+timer, diff. prob.: Gateway vs. Boundary Event 36 41   0.6488           0.06
 Q7    Deferred Choice, message+timer, same prob.: Gateway vs. Boundary Event 43 34    0.3620           0.12
 Q8    Deferred Choice, 2 messages+timer: Gateway vs. Boundary Events       58   19   <0.0001    ***    0.53   ++
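
The p-values and effect sizes in Table 2 can be recomputed from the raw counts. The paper does not
name the statistical test, so the following Python sketch assumes an exact two-sided binomial test
against an indifference probability of 0.5 and Cohen's h as the effect size. The function names
are hypothetical and only serve this illustration.

```python
from math import asin, comb, sqrt


def binomial_two_sided_p(count_a: int, count_b: int) -> float:
    """Exact two-sided binomial p-value for H0: both options are equally likely."""
    n = count_a + count_b
    extreme = max(count_a, count_b)
    # Under H0 (p = 0.5) the distribution is symmetric, so the upper tail is doubled.
    upper_tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


def cohens_h(count_a: int, count_b: int) -> float:
    """Cohen's h of the observed proportion against 0.5 (absolute value)."""
    p_observed = count_a / (count_a + count_b)
    return abs(2 * asin(sqrt(p_observed)) - 2 * asin(sqrt(0.5)))


if __name__ == "__main__":
    # Counts (#A, #B) for Q1, Q3, and Q5 taken from Table 2.
    for label, a, b in [("Q1", 53, 24), ("Q3", 39, 38), ("Q5", 69, 8)]:
        print(label, f"p={binomial_two_sided_p(a, b):.4f}", f"h={cohens_h(a, b):.2f}")
```

For the Q1 counts (53 vs. 24), for example, this gives h ≈ 0.39, in line with the table.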


5. Interpretation
5.1. Evaluation of Results & Implications
The send task is significantly preferred over a message throw event (RQ1). It seems that
participants see the sending of a message more as a task, i.e., an active action, and therefore
prefer the task to an event.
   In contrast to RQ1, participants significantly prefer a message catch event for waiting for
the receipt of a message (RQ2). Interestingly, this means that the preferred syntax for sending
and receiving messages is inconsistent. Perhaps participants differentiate between active and
passive/waiting elements.
   There is no significant difference for sending a message at the process end (RQ3). In contrast
to the significant preference for a send task during the process, there is no clear preference
between a send task with an end event and a message end event. It seems that the additional
penalty of a second symbol and its associated space requirements is not worth paying to keep up
the semantic difference experienced in RQ1.
   When modeling a Deferred Choice between two messages which arrive with different
probabilities, participants prefer the use of an event-based gateway (RQ4). It may be that the
visual of two white envelopes, one in the receive task and one in the boundary event, is not
attractive. Participants have an even stronger preference for the gateway if the probabilities
of the messages are the same (RQ5).
   When modeling a time-out, i.e., a Deferred Choice between a message and a timer, neither
the gateway nor the boundary event is preferred, regardless of whether the timer is only
triggered as an exception (RQ6) or is as likely to occur as the message (RQ7). This contrasts
with the results from RQ4/5, which are structurally the same but use a different second event.
While more participants liked the gateway for same probabilities of events and more participants
liked the boundary event for exceptional cases, these differences were not significant. More
research is needed to clarify whether there is a difference with a small effect or not.
   If the Deferred Choice is between two messages and a timer event (RQ8), there is a strong,
significant preference for the event-based gateway. However, we cannot determine why this is:
While in our study planning we wanted to examine the effect of a larger number of boundary
events, another possible explanation is that a receive task with a message boundary event is
disliked, as RQ4 and RQ5 have shown.

5.2. Limitations of Study
Because we only measured subjective preferences, no quantitative data on model comprehension
could be gathered. This study still gives insights into model perception, especially with regard
to different variants of the Deferred Choice pattern. Like all studies which include students,
the question of generalizability arises. However, we have seen that no differences between our
groups exist; this also means that the group of professionals does not behave significantly
differently from the students. While we had a considerable number of participants, some research
questions gave non-significant results with a small effect size in the range of 0.1 ≤ h ≤ 0.2.
To have adequate power in the statistical tests, more participants (approx. 350) are required.


6. Implications for Practitioners
Following from these results, practitioners should amend existing modeling guidelines with the
following rules: 1) Use Send Tasks for sending messages during process execution, 2) use Message
Catch Events for receiving messages during process execution, and 3) use Event-based Gateways
when implementing a Deferred Choice between multiple incoming messages. Modelers
should keep in mind that this is the first study to examine these constructs. Hopefully, future
studies will strengthen or refute these results and thus these proposed modeling guidelines.


7. Conclusions & Outlook
Within this paper we presented our empirical study with students from two universities and
professionals on the subjective preference for syntactically redundant, message-related constructs
in BPMN. We found a significant subjective preference for send tasks over message throw events
within the process flow and for message catch events over receive tasks. We also found that
event-based gateways are preferred over boundary events for Deferred Choices in the case of
two message events or three events. We could find no significant preference for Deferred
Choices with a message and a timer (“time-outs”) or for the sending of a message on process
completion. While the results are interesting in themselves, this study lays the foundation
for further empirical inquiries: Follow-up studies, especially experiments, can investigate
and compare the understandability of redundant BPMN message-related constructs. In particular,
eye-tracking experiments can be used to gather quantitative data to evaluate whether
the subjective preferences match differences in objective understandability, and thus to
further develop modeling guidelines for BPMN.



Acknowledgments
We would like to thank all participants who took part in our study. Additionally, we would like
to thank Kurt Schneider, Dieter Kähny, and Barbara Ulrich for distributing the quiz within their
classes and organizations.


References
 [1] V. Stiehl, Implementing the Basic Architecture of Process-Driven Applications, Springer,
     2014. URL: http://link.springer.com/chapter/10.1007/978-3-319-07218-0_4.
 [2] B. Rücker, 3 common pitfalls in microservice integration and how to avoid them, WWW:
     https://berndruecker.io/3-pitfalls-in-microservice-integration/, last access: 2021-02-18,
     2018.
 [3] W. M. van der Aalst, A. H. ter Hofstede, B. Kiepuszewski, A. P. Barros, Workflow patterns,
     Distributed and Parallel Databases 14 (2003) 5–51.
 [4] O. I. Lindland, G. Sindre, A. Solvberg, Understanding quality in conceptual modeling, IEEE
     Software 11 (1994) 42–49. doi:10.1109/52.268955.
 [5] C. Bennett, J. Ryall, L. Spalteholz, A. Gooch, The aesthetics of graph visualization, Proceed-
     ings of Computational Aesthetics in Graphics, Visualization, and Imaging (2007) 57–64.
     doi:10.2312/COMPAESTH/COMPAESTH07/057-064.
 [6] K. Figl, J. Recker, Exploring cognitive style and task-specific preferences for process
     representations, Requirements Engineering 21 (2016) 63–85.
     doi:10.1007/s00766-014-0210-2.
 [7] T. Scholz, D. Lübke, Improving automatic BPMN layouting by experimentally evaluating
     user preferences, in: Á. Rocha, H. Adeli, L. P. Reis, S. Costanzo (Eds.), New Knowledge in
     Information Systems and Technologies, Springer International Publishing, Cham, 2019, pp.
     748–757.
 [8] D. L. Moody, Why a Diagram is Only Sometimes Worth a Thousand Words: An Analysis
     of the BPMN 2.0 Visual Notation, 2011.
 [9] D. L. Moody, The physics of notations: toward a scientiﬁc basis for constructing visual
     notations in software engineering, Software Engineering, IEEE Transactions on 35 (2009)
     756–779.
[10] N. Genon, P. Heymans, D. Amyot, Analysing the cognitive effectiveness of the BPMN 2.0
     visual notation, in: B. Malloy, S. Staab, M. van den Brand (Eds.), Software Language
     Engineering, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 377–396.
[11] A. Birchler, E. Bosshart, M. Märki, P. Opitz, J. Pauli, B. Rigert, Y. Sandoz,
     M. Schaffroth, N. Spöcker, C. Tanner, K. Walser, T. Widmer, eCH-0158
     BPMN-Modellierungskonventionen für die öffentliche Verwaltung, 2014. URL:
     https://www.ech.ch/dokument/fb5725cb-813f-47dc-8283-c04f9311a5b8.
[12] V. R. Basili, Applying the goal/question/metric paradigm in the experience factory, Software
     Quality Assurance and Measurement: A Worldwide Perspective (1993) 21–44.

