=Paper= {{Paper |id=Vol-1805/Elahmar2016HuFaMo |storemode=property |title=Empirical Activity: Assessing the Perceptual Properties of the Size Visual Variation in UML Sequence Diagram |pdfUrl=https://ceur-ws.org/Vol-1805/Elahmar2016HuFaMo.pdf |volume=Vol-1805 |authors=Yosser El Ahmar,Xavier Le Pallec,Sébastien Gérard |dblpUrl=https://dblp.org/rec/conf/models/AhmarPG16 }} ==Empirical Activity: Assessing the Perceptual Properties of the Size Visual Variation in UML Sequence Diagram== https://ceur-ws.org/Vol-1805/Elahmar2016HuFaMo.pdf

Empirical Activity: Assessing the Perceptual Properties of
the Size Visual Variation in UML Sequence Diagram

Yosser El Ahmar
CEA, LIST, Laboratory of Model Driven Engineering for Embedded Systems
P.C. 174, Gif-sur-Yvette, 91191, France
yosser.ELAHMAR@cea.fr

Xavier Le Pallec
University of Lille, CRIStAL Lab UMR 9189
59650 Villeneuve d'Ascq, France
xavier.le-pallec@univ-lille1.fr

Sébastien Gérard
CEA, LIST, Laboratory of Model Driven Engineering for Embedded Systems
P.C. 174, Gif-sur-Yvette, 91191, France
Sebastien.GERARD@cea.fr

ABSTRACT
Recent empirical studies about UML showed that software practitioners often use it to communicate. When they use diagrams during a meeting with clients or users, or during an informal discussion with their architect, they may want to highlight some elements to synchronise the visual support with their discourse. To that end, they are free to use color, size, brightness, grain and/or orientation. This freedom is due to the lack of formal specifications of their use in the UML standard and corresponds to what the Cognitive Dimensions framework calls the secondary notation. According to the Semiology of Graphics (SoG), one of the main references in cartography, each means of visual annotation is characterized by its perceptual properties.
Being under the modeler's control, the 5 means of visual annotation can be applied in different ways to UML graphic components: to the border, the text, the background and the related other graphic nodes. In that context, the goal of this research is to study the effective implementations, which maintain the perceptual properties of, especially, the size visual variation. The latter has been chosen because it is considered the "strongest" of the visual means, having all the perceptual properties.
The present proposal consists of a quantitative methodology using an experiment as strategy of inquiry. The participants will be the approximately 20 attendees of the HuFaMo workshop. They must be experts in modeling who know UML. The treatment is the reading and the visual extraction of information from a set of UML sequence diagrams, provided via a web application. The dependent variables we study are the responses and the response times of participants, which will be validated based on the SoG principles.

CCS Concepts
•Software Engineering → UML modeling; •Software visualization → Semiology of Graphics;

Keywords
UML, Secondary notation, Size visual variable, Empirical activity.

1. INTRODUCTION
The Unified Modeling Language (UML) is the visual language for specifying, constructing and documenting software intensive systems. Recent empirical studies about UML in practice [14] [4] showed that UML artefacts are mostly used for communication. Stakeholders of these communications might be familiar with UML (e.g. members of the technical team) or not (e.g. clients) [9]. In such situations, modelers may need to highlight information that they deem relevant for the discussion (e.g. the main class in a class diagram; model, view and controller elements; a modeler's own subsystem; distribution of tasks between technical members; project progression). This is to synchronize the visual support with their discourse. In that context, while the UML specification exhaustively describes its primary notation and its semantics, it lacks highlighting abilities for such contextual information. The secondary notation, defined by the Cognitive Dimensions framework [11], may deal with such concerns. It refers to the free use and change of the possible means of visual annotation: size, brightness, grain, color and orientation. These five means of visual annotation are relatively rapidly perceived because the reader's eye can detect their variation without moving the visual brush. According to the Semiology of Graphics (SoG), one of the main references in cartography, each means of visual annotation is characterized by its perceptual properties. It can be selective: it allows readers to distinguish groupings (e.g. all green marks); ordered: it allows readers to perceptually order marks (e.g. from dark to light or from light to dark, but never in another order); and/or quantitative: it allows readers to visually quantify the ratio between marks (e.g. three times larger).
In UML, as the means of visual annotation are under the modeler's control, there exist different ways to vary their values in a UML graphic component (graphic node or graphic path).
The combinatorial explosion of the possible implementations is due to four reasons. First, UML graphic nodes mostly include a border, a text and a background. Second, some UML graphic nodes are composed of multiple shapes (e.g. a lifeline is composed of 3 components: a rectangle, a dashed line and sometimes an execution specification). Third, graphic nodes might be related to other nodes via graphic paths, forming the diagram. Finally, a UML graphic component might contain or be contained in other graphic nodes (e.g. a fragment in a sequence diagram can contain one or more messages).
It may seem obvious that some implementations of variations are more effective in highlighting elements than others.
But what we can gain in effectiveness might be anecdotal. To be sure whether there exist implementations that are more effective, we have to draw up an exhaustive list of implementations and test them. This means that we have to rigorously decompose all UML graphic components and see, for each sub-element, whether the value of a means of visual annotation can vary and how. Consequently, the purpose of this research is to study the effective implementations, which allow viewers to fully benefit from the performances of a means of visual annotation. This is a purpose for which the number of related works is small. As the field of study is wide, we propose to focus here only on the variation of the size means of visual annotation and on one type of diagram: the UML sequence diagram. The size visual variation has been selected in this study because it is the only means of visual annotation, belonging to the UML secondary notation, which has all the perceptual properties. In addition, we propose to target the UML sequence diagram in particular because it is among the three most used UML diagrams in practice.
For each type of graphic component, whether composed of multiple shapes or a single component, whether a container of other graphic nodes or contained in other graphic nodes, this research aims at finding patterns of effective implementations of the size visual variation. In this study, we assume that these patterns depend on the information to be highlighted. It can concern only one graphic component (e.g. a lifeline) or more than one (e.g. two or more lifelines). In the first case, the size variation will surely highlight the concerned graphic component [5]. But we aim at finding the effective implementations, which allow viewers to relatively rapidly perceive all significant details about the concerned graphic component. In the second case, the size visual means is selective, ordered and quantitative [5]. Here, we want to find the effective implementations which keep valid the selective, ordered and quantitative perceptive attitudes of its variation.
To that end, we want to study the impact of the possible implementations on the perceptual properties of the size visual means. This impact will be controlled by the size of the UML diagram containing the implementation. It can be small, medium or large. The studied impact will also be controlled by the layout of the diagram. We will especially focus on the horizontal and vertical distance between the related graphic components.
This paper presents a proposal of a quantitative methodology using an experiment as strategy of inquiry. The participants will be the approximately 20 attendees of the HuFaMo workshop. They must be experts in modeling who know UML. The treatment is the reading and the visual extraction of information from a set of UML sequence diagrams provided via a web application. The outcome variables we study are the responses and response times of participants, which will be validated based on the SoG principles.

2. EXPERIMENT DEFINITION
This section reports on the delimitation of the study, the research questions that it attempts to answer and its hypotheses.

2.1 Delimitation of the study
Figure 1 summarizes the delimitation of the experiment. The justifications for each choice follow.

Figure 1: Delimitation of the study.

Size, brightness, grain, color and orientation represent powerful means to highlight information, to make it relatively rapidly perceived in a third dimension [5]. Each means of visual annotation is characterized by its perceptual properties. The SoG distinguishes three perceptive attitudes that viewers can take in front of a means of visual annotation.
Selectivity: the reader can perceive groupings (e.g. all red colors, all marks having the same size).
Order: the human eye can perceive order (e.g. from dark to light, from the smallest mark to the biggest).
Quantity: the viewer can perceive the ratio between marks (e.g. this mark is 5 times bigger than another).
Size is the only means of visual annotation allowing the three perceptive attitudes. To benefit from its interesting performances, we chose to begin by studying its effective implementation in UML. We chose to limit the study to three categories of size. This number can be extended to more than three categories in a future empirical study. We argue that exceeding three categories of size in UML diagrams will overload the diagram, especially if it contains a lot of graphic components (i.e. a large diagram). In addition, we note that the sizes of graphic nodes depend on the contained text (e.g. the width of a UML class varies depending on the length of its name, its attributes or its methods). Therefore, we will assume that all graphic nodes in a diagram have the same initial size (i.e. the size of the biggest node, containing the largest text).
According to [9][8], the sequence diagram is ranked among the three most frequently used UML diagrams in practice. It is mostly used for clarifying understanding among technical members of the project team [8]. In such informal meetings, highlighting information might be promising to ease the communication [10]. In addition, contrary to class diagrams, we note a lack of works in the literature studying the effective visualization of sequence diagrams. Those are the main reasons behind the specific choice to begin with the UML sequence diagram.
In practice, the graphic nodes will be connected to each other, forming the diagram. The resulting diagram can be small, medium or large. We chose to cover all 3 alternatives in the present study.
The graphic notation of the sequence diagram is described by 11 graphic nodes and 4 graphic paths [1] (p. 594-596). As we chose to exhaustively study the UML sequence diagram, we will take all of them into account in the present experiment.
We observe that the information to highlight might concern only one graphic component (e.g. one message, one lifeline, one coregion). It can also concern more than one graphic component (e.g. multiple lifelines, multiple coregions, multiple execution specifications). The present study will cover both alternatives.
Finally, we observed that the distance between related graphic
components can vary in two directions, for instance horizontally and vertically. We will experiment with both possibilities.

2.2 Research questions
After delimiting the study, we define the research questions for the resulting scope. We observe that, for a single graphic component of the sequence diagram, there are different possible implementations of the size means of visual annotation. This is due to the following facts.
UML graphic nodes are mostly made of a border, a background and a text. Changing only the area of a node can be seen as obvious, but we also want to explore the effectiveness of varying the size of its border and text. Moreover, some graphic components include multiple shapes. Lifelines include a rectangle and a dashed line. LostMessages and FoundMessages include an edge and a black point at the extremity. Varying the size of such graphic components might consist of changing the size of all their elementary shapes or only some of them. We wonder about the most effective implementation.
In addition, some graphic nodes can be embedded in other graphic nodes. An execution specification, a coregion, a DurationConstraint, a DurationObservation and a StateInvariant are always embedded in a lifeline. Continuations might be embedded in more than one lifeline. Changing their size can affect the size of the graphic nodes in which they are embedded. We want to infer the most effective implementation.
Furthermore, graphic components are semantically linked to each other. Lifelines are linked via graphic paths, and graphic paths have source and destination graphic nodes. Highlighting a component with the size variation might mean also highlighting its semantically related graphic components.
Finally, some graphic nodes may contain other graphic components. A Frame, an InteractionUse, a CombinedFragment and a coregion can contain executionSpecifications and messages. They may also contain each other. Applying the size variation to such graphic nodes might concern the contained graphic nodes and vice versa.
As a result, the following research questions arise.
RQ1: What are the effective implementations of the size visual variation for all types of graphic components of the UML sequence diagram (i.e. container, contained, embedded in a graphic node, complex graphic node (composed of multiple shapes))?
Here, effectiveness can be measured by the capability of each implementation to preserve all the perceptual properties of the size, allowing viewers to relatively rapidly detect the accurate information that they are searching for.
RQ2: How can the effectiveness of each implementation be controlled by the type of information to highlight (i.e. concerning only one graphic component or more than one component)?
RQ3: How can the effectiveness of each implementation be controlled by the size of the diagram containing the implementation and by its layout?

2.3 Hypothesis formulation

2.3.1 Variables
The experiment has 4 independent variables and two dependent variables.
Independent variables
Implementation I (alternatives: Effective Implementation I, Other Implementation I').
Size of the sequence diagram S (alternatives: small, medium, large).
Its layout L (alternatives: Horizontal distance HD, Vertical distance VD).
Type of information to highlight TI (alternatives: concerns only one graphic component TI1, more than one graphic component TIn).
Dependent variables
Responses of participants R (alternatives: true, false, complete, incomplete).
Response time of participants T.

2.3.2 Hypothesis

Table 1: Hypothesis
Dependent variable | Null hypothesis | Alternative hypothesis
Response time T | ∀ (S, TI, L); H0: T(I) > T(I') | ∀ (S, TI, L); H1: T(I) < T(I')
Response R | ∀ (S, TI, L); H0: T(I) > T(I') | ∀ (S, TI, L); H1: T(I) < T(I')

The hypotheses for assessing the effectiveness of the size variations I with the independent variables are given in Table 1. The alternative hypothesis H1 states that the proposed effective implementations take less time to let participants give the right and complete answer to a given question. The experimented effective implementation I is proposed for each possible combination of (S, TI, L). Figures in the appendices illustrate the different implementations that we deem effective and that the experiment aims at validating. They also illustrate an example of a question that concerns one graphic path (a message) with different implementations.

3. EXPERIMENT DESIGN

3.1 Population, sample, and participants
The sampling method used in this study is convenience sampling [6]. The target population of this study is the community of UML users: practitioners, researchers and students. The HuFaMo attendees are a naturally formed group and might be a representative sample of the target population. They include students, researchers, UML practitioners and maybe some tool vendors. They are a part of the MoDELS community, interested in modeling and/or contributing to MDE. We assume that we will have approximately 20 participants, considered as experts in UML.

3.2 Data collection and materials
A web application will be used in the present experiment. This choice takes into account the complexity of modeling tools (i.e. not all participants are familiar with the same modeling tool). Moreover, installing the same modeling tool for all participants would be time consuming, especially in the workshop (same timeslot as a presentation). If accepted, the web application will be developed between the acceptance notification and the workshop date. It will be coded by the first author and tested before its use in the experiment. The web application will first ask participants about
their gender, their level of experience and whether they have any visual deficiency. Then, it will display a question on a white page. After reading and understanding it, the participant can click a button to switch to the next page. A sequence diagram, visually annotated with an implementation of the size variation, will appear, along with its corresponding question at the bottom. In parallel, the application will start a timer. When the participant has found the response to the question, they can click a button to navigate to another white page (without the sequence diagram), where they will be able to enter their response. At the same time, the application will stop the timer and save the time spent to answer. It will also save the corresponding response entered by the participant. The sequence diagrams that will be used in the experiment will be extracted from a models repository [12] [2]. Visual annotations using implementations of the size variations, and the questions, will be prepared manually.

3.2.1 Method
One day before the experiment, the HuFaMo participants will receive an e-mail requesting them to bring their laptops. The first author will ensure the availability of an internet connection on the day of the experiment. The experiment will begin with an introduction phase and a training session related to the experimental task. The first author will present the web application that will be used in the experiment, the details of which are given in the previous subsection. Then, the link to the web application will be sent to the HuFaMo attendees via the workshop mailing list. The second step consists of the experiment's task. It will be performed individually by each participant. The main treatment will consist of the reading and the visual extraction of information from a visually annotated sequence diagram. The estimated time for the whole experiment is 30 minutes.

3.3 Data analysis procedures
In the analysis procedure, we will report on the number of HuFaMo attendees who did not participate in the study. We also plan to give a descriptive analysis of the data for all independent and dependent variables of the study. At the end of the experiment, we want to analyse the relationship between the independent and dependent variables. This is to find patterns of effective implementations, depending on a combination of (S, TI, L). For each combination of the three independent variables, we will determine the effective implementation I, which has a minimum T and complete and true values of R. Therefore, we select correlation/regression statistical tests.

4. ANTICIPATED ETHICAL ISSUES IN THE STUDY
This section reports the internal and external threats to validity.

4.1 Internal validity
The first internal threat to validity is the possible gain of maturity by the participants during the study. That may happen because only one type of UML diagram (the sequence diagram) and only one visual variation are studied. Therefore, we will ensure that diagrams are proposed in random order so that questions concerning the same graphic component are not successive.
In addition, as mentioned before, participants might have some visual deficiencies. This additional input will be recorded before the task begins, so that we can take its influence on the results into account. We also note that each participant will have a different screen with different characteristics. We will ensure that at least the same luminosity value is set and that the same web browser is used to open the web application. Finally, one of the outcomes of the study is the response time of participants. It is automatically saved when the participant signals that they have found the response by clicking a button. Clicking the button late will bias the results. We will stress the importance of this step to participants in the introduction phase. We will also try to add a voice recorder, so that participants can speak out loud when finding the response. We will then have to find mechanisms to manage the simultaneous voices of participants placed in the same setting.

4.2 External validity
The HuFaMo participants are not only experts in modeling but also interested in Human Factors in Modeling. So, they may know about the scope of this research, especially the perceptual properties of the means of visual annotation, which can bias the study. To limit this threat to validity, we will not inform them about the research questions of the study nor its hypotheses. Moreover, the participants are not in a natural setting, using their own modeling tool and moving naturally through their UML sequence diagrams. As a result, we will perform a further empirical study (e.g. a case study in a natural setting) in order to be sure that the obtained result can be generalized to the whole population.

5. LITERATURE REVIEW
The free use of additional means of visual annotation in software engineering has been recognized as theoretically advantageous, through the secondary notation of the Cognitive Dimensions framework [11]. A few empirical studies aiming at assessing its benefits in UML visual notation have been conducted. However, while they acknowledged the need for empirical validation, they focus only on two axes: layouts and colors. The other means of visual annotation (i.e. size, brightness, grain and orientation) have not yet been discussed, despite their great performances in highlighting information, known in cartography [5] and psychology [13] [18].
Concerning layouts, there exist several empirical studies aiming at finding effective layouts in UML diagrams. [19] [16] [15] use experiments to assess effective layouts for diagram comprehension, user preferences, program understanding, etc. [20] uses eye tracking in an experiment involving 12 participants to identify the impact of layout, color and stereotypes on the comprehension of UML diagrams. Most of the mentioned research [16] [15] [20] focuses on the UML class diagram. [17] focuses further on the UML activity diagram and use case diagram. While the sequence diagram is among the three most used UML artefacts in practice, we note few works on it [7] [3].

6. REFERENCES

[1] Object management group. http://www.omg.org/.
[2] UML repository. www.models-db.com.
[3] S. Abrahao, C. Gravino, E. Insfran, G. Scanniello, and G. Tortora. Assessing the effectiveness of sequence diagrams in the comprehension of functional requirements: Results from a family of five experiments. IEEE Transactions on Software Engineering, 39(3):327–342, 2013.
[4] S. Baltes and S. Diehl. Sketches and diagrams in practice. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pages 530–541, New York, NY, USA, 2014. ACM.
[5] J. Bertin. Semiology of graphics: diagrams, networks, maps. 1983.
[6] J. W. Creswell. Research design: Qualitative, quantitative, and mixed methods approaches. Sage publications, 2013.
[7] J. A. Cruz-Lemus, M. Genero, D. Caivano, S. Abrahão, E. Insfrán, and J. A. Carsí. Assessing the influence of stereotypes on the comprehension of UML sequence diagrams: A family of experiments. Information and Software Technology, 53(12):1391–1403, 2011.
[8] B. Dobing and J. Parsons. How UML is used. Commun. ACM, 49(5):109–113, May 2006.
[9] W. J. Dzidek, E. Arisholm, and L. C. Briand. A realistic empirical evaluation of the costs and benefits of UML in software maintenance. Software Engineering, IEEE Transactions on, 34(3):407–432, 2008.
[10] Y. El Ahmar, S. Gérard, C. Dumoulin, and X. Le Pallec. Enhancing the communication value of UML models with graphical layers. In Model Driven Engineering Languages and Systems (MODELS), 2015 ACM/IEEE 18th International Conference on, pages 64–69. IEEE, 2015.
[11] T. R. G. Green and M. Petre. Usability analysis of visual programming environments: a cognitive dimensions framework. Journal of Visual Languages & Computing, 7(2):131–174, 1996.
[12] B. Karasneh and M. R. Chaudron. Online img2UML repository: An online repository for UML models. In EESSMOD@ MoDELS, pages 61–66, 2013.
[13] K. Koffka. Principles of Gestalt psychology, volume 44. Routledge, 2013.
[14] M. Petre. UML in practice. In Proceedings of the 2013 International Conference on Software Engineering, ICSE '13, pages 722–731, Piscataway, NJ, USA, 2013. IEEE Press.
[15] H. C. Purchase, L. Colpoys, D. Carrington, and M. McGill. UML class diagrams: an empirical study of comprehension. In Software Visualization, pages 149–178. Springer, 2003.
[16] B. Sharif and J. I. Maletic. An empirical study on the comprehension of stereotyped UML class diagram layouts. In Program Comprehension, 2009. ICPC'09. IEEE 17th International Conference on, pages 268–272. IEEE, 2009.
[17] H. Störrle, N. Baltsen, H. Christoffersen, and A. Maier. On the impact of diagram layout: How are models actually read? In International Conference on Model Driven Engineering Languages and Systems (MoDELS) 2014, pages 31–35, 2014.
[18] A. Treisman. Preattentive processing in vision. Computer vision, graphics, and image processing, 31(2):156–177, 1985.
[19] K. Wong and D. Sun. On evaluating the layout of UML diagrams for program comprehension. Software Quality Journal, 14(3):233–259, 2006.
[20] S. Yusuf, H. Kagdi, and J. I. Maletic. Assessing the comprehension of UML class diagrams via eye tracking. In 15th IEEE International Conference on Program Comprehension (ICPC'07), pages 113–122. IEEE, 2007.

APPENDIX

Considering all independent variables, 12 sequence diagrams are required for each implementation of the size variation. We argue that at least two diagrams are needed for each implementation. Therefore, for all 14 graphic components of the UML sequence diagram, at least 336 diagrams are required for this study.

A. EFFECTIVE IMPLEMENTATIONS AND AN EXAMPLE OF A QUESTION WITH DIFFERENT IMPLEMENTATIONS
Figure 2: Effective implementation of a "lifeline" I, (TI=TI1, T=S)

Figure 3: Effective implementations of "message" I, (TI=TI1, T=S)

Figure 4: Effective implementations of a "fragment" I, (TI=TI1, T=S)

Figure 5: Response to the question: What happens if the controller sends first hello? with the implementation I

Figure 6: Response to the question: What happens if the controller sends first hello? with an implementation I'

Figure 7: Response to the question: What happens if the controller sends first hello? with an implementation I'

Figure 8: Response to the question: What happens if the controller sends first hello? with an implementation I'
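
The diagram count given at the start of the appendix, and the selection rule of Section 3.3, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: we assume that the 12 diagrams per implementation arise from crossing the 3 diagram sizes, the 2 types of information and the 2 layouts; the figure of 14 graphic components is taken from the appendix as written; and "minimum T" is read here as the lowest mean response time among true and complete responses. All names (`design_cells`, `required_diagrams`, `most_effective`, the trial record keys) are hypothetical and not part of the planned web application.

```python
from itertools import product
from statistics import mean

# Levels of the independent variables listed in Section 2.3.1.
SIZES = ("small", "medium", "large")   # S
INFO_TYPES = ("TI1", "TIn")            # TI
LAYOUTS = ("HD", "VD")                 # L

DIAGRAMS_PER_CELL = 2     # "at least two diagrams" (appendix)
GRAPHIC_COMPONENTS = 14   # figure used in the appendix


def design_cells():
    """Return every (S, TI, L) combination (assumed source of the 12)."""
    return list(product(SIZES, INFO_TYPES, LAYOUTS))


def required_diagrams():
    """Lower bound on the number of diagrams needed for the study."""
    return len(design_cells()) * DIAGRAMS_PER_CELL * GRAPHIC_COMPONENTS


def most_effective(trials):
    """Pick, per (S, TI, L) cell, the implementation with the lowest
    mean response time among responses that are both true and complete.

    `trials` is an iterable of dicts with the hypothetical keys
    "cell", "implementation", "time_s", "true" and "complete".
    """
    times = {}
    for trial in trials:
        if trial["true"] and trial["complete"]:
            per_impl = times.setdefault(trial["cell"], {})
            per_impl.setdefault(trial["implementation"], []).append(trial["time_s"])
    return {
        cell: min(per_impl, key=lambda impl: mean(per_impl[impl]))
        for cell, per_impl in times.items()
    }


if __name__ == "__main__":
    print(len(design_cells()))   # 12 combinations of (S, TI, L)
    print(required_diagrams())   # 12 * 2 * 14 = 336
```

Using the mean is a design choice of this sketch; the correlation/regression tests mentioned in Section 3.3 would still be needed to assess whether the observed differences are significant.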