=Paper= {{Paper |id=Vol-2019/mdetools_7 |storemode=property |title=Visual Variables in UML: a First Empirical Assessment |pdfUrl=https://ceur-ws.org/Vol-2019/mdetools_7.pdf |volume=Vol-2019 |authors=Yosser El Ahmar,Xavier Le Pallec,Sébastien Gérard,Truong Ho-Quang |dblpUrl=https://dblp.org/rec/conf/models/AhmarPGH17 }} ==Visual Variables in UML: a First Empirical Assessment== https://ceur-ws.org/Vol-2019/mdetools_7.pdf
Visual Variables in UML: a First Empirical Assessment

Yosser El Ahmar∗§, Xavier Le Pallec§, Sébastien Gérard∗ and Truong Ho-Quang¶
∗ CEA, LIST, Laboratory of Model Driven Engineering for Embedded Systems, P.C. 174, Gif-sur-Yvette, 91191, France
{yosser.ELAHMAR, Sebastien.GERARD}@cea.fr
§ University of Lille, CRIStAL Lab UMR 9189, 59650 Villeneuve d’Ascq, France
Email: xavier.le-pallec@univ-lille1.fr
¶ Chalmers, Göteborg University, Göteborg, Sweden
Email: truongh@chalmers.se

Abstract—This paper presents the results of an empirical study of the use of the Unified Modeling Language (UML) in practice. We employed a selective range of research methods, including in-depth semi-structured interviews and a quantitative analysis of > 3500 UML diagrams related to open source projects in GitHub. The aim of the study is to provide a greater understanding of the use of UML and, in particular, to shed light on the use of the visual variables (i.e., color, size, brightness, texture/grain, shape and orientation) in practice. The theoretical perspective of the study is to explore the usefulness of the visual variables in UML. When effectively employed, these variables are highly significant in reducing the cognitive load of human beings. As with all qualitative studies, the findings should be interpreted carefully: they should be seen as providing a better understanding of the aforementioned scopes. We conclude with a discussion of the obtained results and the lessons learned for future research.

I. INTRODUCTION

By the 90s, numerous graphical modeling languages were used by the software engineering community. This was due to the increased acceptance of modeling and the emergence of object-oriented systems. Each graphical modeling language used its own graphic signs and meanings. That helped decrease ambiguities within each community but led to interoperability problems between tool vendors. Consequently, the Object Management Group (OMG) [1] standardized the Unified Modeling Language (UML) [2] in an attempt to resolve this interoperability issue. It took advantage of graphical representations and defined UML as a visual language for specifying, constructing and documenting software-intensive systems. The OMG exhaustively describes the UML graphic signs via a concrete syntax and their meanings (i.e., semantics) via an abstract syntax.

UML takes advantage of the high performance of the graphical system. This two-dimensional system has a major advantage over linear ones such as the audio system or the textual one. In the audio system, there are two variations: the sound and the time. In a given time unit, the human ear hears only one variation: one sound. In the graphical system, by contrast, there are three possible variations: the X and Y planar dimensions and the visual aspect of a graphic sign, such as its color or its shape. In the same time unit, the human eye might perceive all the relationships between the three variations. In [3], the variations of the visual aspect are subdivided into six variables called retinal variables: size, brightness, texture/grain, color, shape and orientation. The X, Y planar axes (position) and the retinal variables together are called visual variables [3]. The retinal variables are very significant in highlighting information. They are rapidly perceived because the reader’s eye can detect their variation without moving the visual brush: the signals received on the retina are sufficient. The use of the retinal variables allows the human eye to perceive in a third dimension (i.e., depth perception). Depth perception does not require cognitive processing in either working memory or long-term memory (pre-attentive perception). Hence, it considerably reduces the cognitive load of human beings.

UML mainly uses the shape visual variable to visually encode semantics (e.g., ellipses, rectangles, circles). The other visual variables are less employed, despite their aforementioned high performance. That leaves an opportunity to explore the usefulness of the other visual variables as a possible means of enhancing the use of UML in practice. This possibility has already been recognized as advantageous in software engineering via the Cognitive Dimensions framework [4] and the Physics of Notations framework [5].

The exploration of the visual variables in UML requires understanding (i) the situations in which UML is used in practice and (ii) the actual state of use (or not) of the visual variables in UML. While numerous empirical studies address the first scope (i) by investigating the use of UML in practice, fewer studies investigate the use of the visual variables (ii). The latter focus mainly on the position visual variable to find effective layouts [6] [7] [8], with occasional studies on colors [9]. Scopes (i) and (ii) are strongly related to each other and we deem that they have to be treated together. In fact, the use of the visual variables might depend on the way each practitioner uses UML. To fill this gap, we conducted a qualitative exploratory empirical study using interviews as the strategy of inquiry. The purpose of the present study is to create more and better understanding of the situations of UML use in practice. A situation refers to the activities, the stakeholders involved in each activity, the practices of UML users in employing UML and the purposes of such usages. In the captured situations,
the study aims at discovering the need for the visual variables in practice. If such a need exists, we want to gain a thorough understanding of the kinds of visual annotation that UML practitioners perform, their purposes and the ways they do so. As a triangulation method, we analyzed the use of the visual variables in > 3500 diagrams related to open source projects in GitHub [10][11]. This aims at finding quantitative data that might reinforce the results of the interviews. The obtained results might help ongoing research exploring the benefits of the visual system as a means of resolving problems that the present study might reveal. They can also help in studying the usefulness of the visual variables in enhancing the effectiveness of UML in the captured situations of use. Finally, they might help tool vendors enhance the usability of their tools by providing more ergonomic visual automation.

This paper presents results from the qualitative study that we conducted with 8 experts and practitioners of UML. Then, it describes the results of the analysis of the use of the visual variables in > 3500 UML diagrams from the models repository [11]. The study takes a deliberately broad interpretation of results from both methods, as it is meant to be exploratory.

II. DESIGN METHODOLOGY

We used a selective range of research techniques to gather data for our study. We used both a qualitative study via in-depth interviews and a quantitative study via the analysis of UML models related to open source projects in GitHub [11]. Such use of a variety of types of data helps us ensure better coverage and a greater understanding of the following two research questions: (i) What are the details of the situations of UML use in practice and, in particular, what information do practitioners need to visualize? (ii) What are the details of the actual state of use (or not) of the visual variables in the previously captured situations? The qualitative in-depth interviews with 8 experts and practitioners of UML help us gain an understanding of the use of UML and the use of the visual variables in practice. They allow us to understand the relationships between both kinds of use. The analysis of the UML models related to open source projects helps us gather quantitative data, particularly about the use of the visual variables. It mainly allows us to answer the second research question (ii). Conclusions about the amount of use of the visual variables were calculated on a sample of > 3500 UML diagrams. That also enables us to draw conclusions about the effectiveness (or not) of such usages based on existing theories [3]. The present study is conducted as an attempt to help us achieve our mid-term goal of exploring the high performance of the visual variables to enhance the effectiveness of UML in practice.

A. Interpretation of results

We need to be particularly careful about how we analyze the results of our study and the conclusions that will be drawn from it. In fact, the first intent of this work is to create a better understanding of the use of UML and the use of the visual variables in practice. It is not meant to verify hypotheses or generalize findings. It mainly serves as an exploratory study to help ongoing research around UML. The analysis of our interview data was carried out using the ‘grounded theory’ approach [12]. For that, we began by manually transcribing the interviews from audio to textual form. We read through the data and identified themes and descriptions. We tried to interrelate them using the grounded theory approach, then we interpreted the findings. The analysis of the UML models related to open source projects involved some basic enumerations and simple statistical calculations to get an overall sense of the use of the visual variables in UML. The major effort was spent on the manual classification of the different diagrams based on the different usages (or not) of each visual variable. That helped us report on the state of practice of UML modelers in using the visual variables.

B. Data collection procedures

1) Interviews: We conducted a series of semi-structured in-depth interviews with 8 participants (6 from industry and 2 researchers). 7 interviews were carried out by phone and one face to face. As the first intent of the present work is to understand in depth the use of UML in practice, we were particularly interested in practitioners of UML. They come from a variety of backgrounds and have a range of expertise in UML. The interviews lasted approximately 30-60 minutes and began with a brief announcement of the goal of the study. We also explained that the interviews would be anonymous and asked permission to record them. Then, we asked participants about their current position and level of experience with modeling using UML. We followed with questions about the situations of UML use in practice to answer our first research question. That included the purposes of the use of UML, the activities done with UML diagrams, the employed diagrams, the reasons for using a particular diagram, the sought information and the ways of using UML in a project from the beginning until the final steps. Then, we asked questions about their current use (or not) of the visual variables in practice. That concerned the identification of the utility of the visual variables in practice, the most used visual variables and the manner of their use in practice. As for any semi-structured interview, we identified a number of topics that had to be covered in each interview. However, we strongly encouraged participants to explain the details of their claims by pointing out that even minor details are very important for our study. All the interviews were conducted in the form of a discussion where the interviewer followed the logic and the reasoning of the participants. Finally, the interviews were recorded and transcribed with the permission of the participants.

2) Models database analysis: We manually analyzed the use of the visual variables in > 3500 UML models related to open source projects in GitHub [11] [10]. Most of these diagrams are class diagrams (exactly 3328), and 392 are sequence diagrams (the models repository is already biased towards structural (class) models [11]). This aims at gathering quantitative data that might reinforce the interviews
results. To that end, we began by identifying the visual variables that we would study, namely size, brightness, color, texture and orientation. Then, we manually classified the UML diagrams according to the visual variable they contain (when a diagram contained two or more visual variables, we created a new dedicated folder). For each visual variable (i.e., for each folder), we classified the diagrams based on the nature of its use. In fact, we observed that each visual variable might be applied differently to UML elements: on the border, text, background, edges, heads and/or compartments. We also differentiated significant visual variable variations from non-significant ones. A visual variable variation is considered significant if different categories of the variable appear in the same diagram (e.g., blue, green and red are different categories of the color visual variable). This kind of variation is very important for our study: it means that the authors of the corresponding diagrams wanted to highlight information using a particular visual variation. We will concentrate further on their analysis to understand their use in depth and answer our second research question. Non-significant variations refer to the use of a single category of a visual variable (e.g., all the classes are yellow).

III. INTERVIEWS

We identified interviewees by focusing on their level of experience in modeling with UML and by ensuring that they practice UML. We asked all our contacts in order to identify industrial practitioners who might be willing to be interviewed. We first made an announcement on mailing lists containing potential practitioners of UML: the Papyrus tool developers and users community. We received two answers to that announcement. The first one was discarded because the corresponding profile did not match our target population. The second one was retained because he had the adequate target profile: practitioner and UML expert. Then, we sent direct mails to experienced industrial practitioners in the MDE community. We asked them to participate in our study or to ask other persons who might be interested in, and interesting for, our study. We contacted 11 persons: 6 persons accepted our request, one person suggested another one whom he deemed more interesting for our study and who was retained, two persons did not answer our mails and, finally, two indirect contacts declined to participate in our study because they are not experts and practitioners of UML. In total, we carried out 8 interviews with 8 participants, all experts and practitioners of UML. The roles of the interviewees include requirement manager, software architect, software designer, consultant and software engineer. They work in different domains: transportation, aerospace engineering and defense, avionics, telecommunication, e-commerce, insurance, banking, etc. 5.5 hours of interviews were recorded and manually transcribed.

A. Analysis

1) Situations of UML use in practice:

a) Purposes of the use of UML: The results of the present qualitative study revealed that communication is the first purpose of using UML in practice. All 8 participants confirmed their use of UML diagrams as a communication vehicle. Communications might be held internally within the project teams or with customers. This finding is also confirmed by previous empirical studies in this field [13][14]. The next paragraph focuses further on our results about UML use in communications. The second purpose of using UML is code generation. This finding contradicts previous empirical studies [15], where code generation generally appears in the last ranks. However, it seems logical in our case because most of our interviewees are practitioners of MDE approaches. They use models from early design steps until maintenance tasks. The third purpose of using UML is to draw the participants’ own understanding in an informal way, where UML diagrams are considered as a “map of the system”. That might be done using pen and paper or on a whiteboard. In this kind of use, participants do not care about the conformity of their diagrams to the UML standard. Their goal mainly concerns the comprehension of the system to be built and its conformity to the clients’ needs. Finally, UML diagrams are less employed for model execution and model analysis.

b) UML and communications: We asked our practitioners about their practices of using UML diagrams for communications. We distinguished two types of audiences: persons who are familiar with UML (e.g., technical team) and persons who are not (e.g., customers). We found that none of our practitioners modify (i.e., contextualize) their diagrams for communications with persons who are familiar with UML (Figure 1). They argue that all the stakeholders already know and understand the language.

[Figure 1: two bar charts. (a) Communication with familiars with UML: Do modify diagrams / Don’t modify diagrams. (b) Communications with non-familiars with UML: Filter info. / Adapt the speech / Include textual info / Don’t use UML.]
Fig. 1: The need to contextualize UML diagrams for communications

However, when it comes to discussing with customers, they react differently. Most of the practitioners don’t modify their diagrams but try to adapt their speech to the audience. Following are two claims from our practitioners:

“. . . We kind of read the diagram to them then we say our interpretation and they just hear what we say and they agree or not with that. . . ” (Transcript 3)

“I didn’t ask him to learn all of UML but like for the class diagram I would explain the class you know what the class is,
the attributes and relationships that takes only a few minutes and then... the subject matter he is really familiar when he sees that these boxes as you know class called solution or column or pump and types of things that are easier to work with...” (Transcript 4)

Other interviewees prefer to filter some information out of their diagrams to keep only what is relevant for their communications. To that end, they omit technical details that don’t really matter to their customers. They try to keep diagrams simple to communicate better.

“. . . we actually try to simplify as much as possible in our ... UML model because they aren’t UML experts so we try to filter out all. . . We try not to overload our diagrams with labels everywhere that non UML experts will not understand” (Transcript 7)

Finally, one practitioner prefers not to use UML when discussing with persons who are not familiar with UML.

Generally, all our practitioners were aware that UML is not suitable for all types of communications. They try to find different ways to facilitate such use. Few of the practitioners mentioned the recurrent use of the visual variables to adapt the diagrams to communications. This fact is mainly due to problems with tools (see Section 6).

c) Used UML diagrams: Interviews showed that class diagrams and sequence diagrams are the most used ones in practice (Figure 2). Different reasons are given to justify the choice of these particular diagrams. A software engineer argues that the class diagram is the most expressive notation in UML for modeling data. A software designer uses the class diagram to design the database. A software architect pointed out that the class diagram is used to divide the work among the different teams involved in the same project. Class diagrams are also employed to draw the business entities of the systems and to represent the functional relationships between them. Sequence diagrams are mostly used to define the interactions between classes and the interactions between users and the solution (i.e., the common definition of a sequence diagram). Sequence diagrams are also used in the definition of the white box part of a solution and to realize specific use cases. Use case diagrams and state machine diagrams are ranked second among the most used UML diagrams in practice. The purpose of creating use case diagrams is to enumerate the functions to develop and to specify actors and the interactions between them. Ultimately, this corresponds to the very definition of a use case diagram. Our requirement manager justifies his use of use case diagrams by the fact that such use is recommended by the safety requirement standard. Use cases are also used to drive our software engineer’s thinking; they then become part of the documentation. State machines are mostly used to design the behavior of the systems to be built or as an executable model. Activity and structure diagrams are the fourth most used diagrams by our practitioners. Activity diagrams are mostly seen as an elaboration of the use cases and a representation of the system’s features. They are also used for business process modeling and as a communication vehicle with customers. Then come the interface, component and interaction diagrams as the less used ones. These findings are consistent with previous empirical studies in this area [15][14].

[Figure 2: horizontal bar chart, horizontal axis ticks 1 to 5. Diagram types listed: Class, Sequence, Use cases, State machines, Activities, Structure, Interface, Components, Interaction.]
Fig. 2: UML diagrams used in practice

d) Pattern of UML use in practice: We asked the interviewees to describe in detail their practices in using UML to build a system or a project. We analyzed the answers to this question and were able to identify a pattern in the use of UML by our interviewees. All our practitioners begin by gathering the requirements from the customer. That might be in textual form or via a modeling session.

“The three people working on the project for example, we interview users who want the system and we understand from them what the requirements are, then we translate these requirements. It is like we have a modeling session, we sit with them the three of us and we interview that, what do you imagine blablabla. And then we capture the use cases and we start populating a use case diagram...” (Transcript 3)

At this level, the models serve as a support for communication with the customer and among the technical team members. This step allows our participants to draw the big picture of the systems to be built. One interviewee mentioned the advantages of representing the system in a visual form instead of text.

“Drawing the system instead of writing is a good tool to communicate and share mind viewpoint. The vision goes more quickly, we can decide more quickly about the architecture, the architecting stuff.” (Transcript 6)

Then, participants move to an understanding session where they review and check the requirements of the customer to ensure they correspond to the customer’s needs.

“We want to represent the system as it is and we want to understand the needs may be to understand the way to go to the system to be. So we used different diagrams offered, provided by UML to draw the big picture of the – context to deeply understand what is the need.” (Transcript 6)

To that end, they might need to go back to the customer and review the requirements in another modeling session. Once they have ensured that their models match the requirements of the customer, they split the work among the persons involved in that project. To that end, UML might be employed as a discussion vehicle via the class and use case diagrams. Finally, each participant continues using UML for his particular needs: model simulation and execution, where the models represent the code. They might generate code from them or continue coding the system and keep the created models in the documentation. In all cases, the models will populate
the documentation that describes each project or system. To           “We have discussed and said that we should avoid coloring. At
these ends, most of our practitioners use a modeling tool in          least if the colors have a specific semantic I mean you should
their practice. One interviewee pointed out that the use of a         be able to understand the diagram without the colors we can’t
modeling tool depends on his needs. If it is about gaining his        put any semantic meaning into the colors because if you lose
own understanding, he settles for a pen and paper. Otherwise,         the colors when you print into black and white printers I mean
if it is about automation, he does use a modeling tool.               it is pretty fundamental to still have the same semantic of the
      e) Searched information: We asked our practitioners             diagram”. (Transcript 5)
about information that they need to visualize in practice.            Furthermore, we observe that most of the examples of high-
We distinguish two types of information. First, we find the           lighted information using colors are “selective” information:
semantic information (i.e., what is modeled in a diagram).            Practitioners want to highlight UML nodes belonging to a
Following are examples of semantic information mentioned              same group (e.g., MVC elements, elements that have the same
by our interviewees: Input and output statements for the              semantics) together. They also need to highlight “ordered”
requirements, to see the communication in a sequence diagram          information (e.g., progress of implementation, important fea-
to understand the logic, to see which system does what,               tures).
to search the across functions, to see the interactions of a
practitioner’s own system, to search for references for specific
signals or events in the model, specific signal in a specific
protocol or interface that triggers a state machine. Second, we
find what we called extra-semantic information. It consists
of non-semantic information that can nevertheless be extracted
from a UML model. Examples of extra-semantic information are:         [Figure 3: bar chart; answer categories: Yes / Yes but problems with tools / No]
level of implementation of the classes, bugs in the model in          Fig. 3: Utility of colors in practice (Were colors helpful?)
the case of model execution.
We observe that practitioners need to visualize information                 b) Utility of colors in practice: We asked our practition-
on their diagrams. Before going to the documentation, UML             ers if the previously mentioned use of colors has been helpful.
diagrams are subjects of many visualizations where practition-        We found out that most of them agree on the added value of
ers need to search for important information to accomplish            colors and that their use was helpful in practice. Figure 3
their tasks. If we link this finding to the previous results about    details the answers of the participants. 3 interviewees totally
the purposes of using UML in practice, most of the searched           agree on the utility of colors in practice. The same number
information belong to the “drawing of understanding” purpose.         of interviewees confirm that colors are helpful in practice but
Practitioner visually navigate in their diagrams to find accurate     there are problems with modeling tools that deteriorate such
information to build the mental map of their systems or               use. Besides, they express their need for an automatic and
projects.                                                             efficient tool and propose some recommendations that will be
   2) Visual variables in practice:                                   discussed in Section 6. One interviewee stresses that colors
      a) Color: We asked our practitioners about their need           are helpful but only for communications.
for colors in practice and about examples of information they               c) How to use colors? : In case of the use of colors,
needed to highlight using them. Again, we distinguish two             we wanted to understand how practitioners do chose colors.
types of information: semantic information and extra-semantic         We found out that only two practitioners use some internal
information. Only two semantic information were mentioned             conventions of their companies. Following are examples of
by one single practitioner: Important features like inheritance       conventions used within two different companies:
or interface and elements that have the same semantic. Most           “Non-tested functions: Blue; safety functions: yellow. . . ”
of the interviewees used color to highlight extra-semantic            (Transcript 1)
information. The progress of the implementation of classes has        “To communicate the green means we have it, yellow means
been mentioned by three practitioners. They want to visualize         in progress, red means we –“ (Transcript 3)
the progress of the development of their classes directly on          The majority of interviewees do not have internal conventions,
the diagrams. Examples of extra-information mentioned are:            they follow their own tastes.
   • Role in the design (criticality, parts of patterns (especially   “In my domain which are in general embedded systems we can
       MVC), parts of layers, levels of security).                    use blue for that software functional related, I use orange for
   • Status in development (progress in implementation, test-         everything that software platform related to framework system,
       ing, execution).                                               drivers, etc and red for everything that is material, hardware
   • Distribution of tasks between the stakeholders (ownership        related.” (Transcript 7)
       of each class).                                                “I avoid red because red means mistake and green is nice
Besides, one practitioner mentioned that colors must not be           because it means correct.” (Transcript 1)
used to highlight semantic information. He argues that the            One practitioner says that they have internal conventions but
diagram should be understood without coloring which might             are used in an ad-hoc manner.
disappear in case of the use of black and white printers.             “Unfortunately, this (internal reference documents) is used in
an ad-hoc manner. We have just documents to follow but no body follows them in a formal manner.” (Transcript 6)
Then, we wanted to know if practitioners add legends/keys when they use colors. We found out that the majority of practitioners do add legends or would like to do so: two practitioners confirm that if they use color, they add keys. Two other practitioners would like to add keys but are limited by the modeling tools they use (Figure 4).

Fig. 4: The need for keys in practice. [Bar chart; answer categories: No; Yes/would like to add them.]

At this level, we observe that most of the practitioners neither follow internal conventions nor add keys when they use color. Such behavior is not effective because keys are essential as soon as at least one visual variation exists [3]. Their absence might make the diagrams in question ambiguous (e.g., for the author himself after a long time, or for another team member who needs to read them during maintenance tasks).

d) The other visual variables: We asked an open-ended question about the utility of the other visual variables (i.e., size, brightness, texture/grain and orientation) in practice. The majority of our practitioners confirm that the use of the visual variables might help when using UML in practice (Figure 5).

Fig. 5: The need for the visual variables in practice. [Bar chart; answer categories: Problems; Yes only for communication; Yes but problem with tools.]

In parallel, they stressed the effectiveness and usability of the tools employed for that purpose. The utility of the visual variables directly depends on the efficiency and usability of the tools. One interviewee argues that these visual variables might be helpful only for communications. If it is about understanding or using his own diagrams (i.e., those that he creates), he will not use them.
“If I have to model a function, I AM the designer, I am modeling this function, so I don’t see how I should use visual annotations.” (Transcript 1)
Another interviewee thinks that the use of the visual variables might create a mess. The size visual variable might make big diagrams less readable. Texture might also create readability and printing problems.
“Size: No because most of the time, the models are so complex. So having classes bigger than others make the diagram less readable.” (Transcript 8)
“Texture: The diagrams are printed and stuck in the wall so using texture... to me it is making the model less readable... it could be more beautiful for business people. For technical people I don’t think it will be added value.” (Transcript 8)
The use of the visual variables also depends on the size of the working teams. In large organizations, the use of different visual variables might create a mess.
“In smaller teams, probably they are perfectly well where you can align and decide the coloring rules and so on but as long as get a little bit bigger, then going and using different visual variables... just creates a mess.” (Transcript 5)
Ongoing research on the use of the visual variables in UML should take these claims into account and provide effective material (i.e., theories and convenient tools) to handle the aforementioned problems.

                 IV. MODELS REPOSITORY ANALYSIS

Fig. 6: Analysis of the visual variations in the models repository. [Figure with two parts, referred to in the text as Figure 6.1 and Figure 6.2.]

   As mentioned in Section III, we distinguish significant variations of a particular visual variable from non-significant ones. In that context, results of the analysis of the models database show that 22% of the diagrams present a significant visual variation (Figure 6.1). That means that modelers did need to highlight information and used the visual variables to that end. As depicted in Figure 6.2, color, brightness and size are the three most used visual variables. We found out that only one diagram uses the texture visual variable and that orientation is never used. 67% of the diagrams present non-significant variations (Figure 6.1). Such non-significant variations come from the default configurations of the modeling tools used (e.g., by default, all the classes might be yellow, blue, green, gray, etc.). 11% of the diagrams are purely black and white and do not present visual variations.

A. Color
   Color is the most used visual variable for expressing significant visual variations, at 80% (Figure 6.2). We analyzed details about the use of colors and observed that they are applied to different parts of UML elements: background, borders, edges, text, heads and compartments. We found out that colors are applied to the background of UML elements (i.e., classes or lifelines) in 57% of the diagrams that present significant color variations (Figure 7). 10% of the diagrams present a color variation of the text contained in a UML element: class name,
attributes, methods or even text related to comments. 9% of the diagrams present color variations of the borders of UML elements. Modelers vary only the border of packages and classes. Finally, we observed that modelers add information in their diagrams using colored text or arrows.

Fig. 7: How are colors applied to UML elements? [Pie chart: Background 63%; Text 10%; Borders 9%; Edges 7%; Head 6%; Annotations 3%; Combinations 2%.]

We looked further into detail to find out the information that modelers wanted to highlight in the UML diagrams. However, it was difficult to identify. This difficulty is due to the lack of keys or of any information that designates the meaning of each color variation. In fact, only 4% of the diagrams that present a color variation contain keys or simply the meanings of the applied visual variations (Figure 8). 14% of these keys are not up to date with the corresponding diagram. That might occur because the tool used does not automatically update the keys. This finding is consistent with the interview results, where practitioners raised their need to add keys and pointed out the limitations of tools in adding them.

Fig. 8: Do modelers add keys/legends? [Pie chart: Present keys 4%; No keys 96%.]

We analyzed the highlighted information when keys are available. 28% of the diagrams present the Model, View and Controller elements as highlighted information. They use the following sets of colors: (pink, yellow and mauve), (green, yellow and mauve), (blue, orange and green) and (yellow, green and red). All of the highlighted information is selective: different groups of elements can be visually grouped together.

B. Brightness and size
   As mentioned above, brightness is the second most used visual variable. Like color, brightness is mostly employed to highlight selective information. Modelers choose different levels of brightness of a particular color, or ranges of white and gray. Brightness is always applied to the background of UML elements. For the size variations, significant ones are mostly applied to text. Modelers change the thickness of the text that they want to highlight (e.g., attributes, methods or just a part of them). In one case, different text sizes were used.

                      V. STATE OF THE ART

A. UML use in practice
   Several empirical studies investigating the use of UML in practice exist in the literature. [16] evaluates the costs and benefits of modeling in practice via discussions with 38 professionals at a developer-community meeting. The authors found out that the three main advantages of UML are: the ability to handle the growing complexity of software development by working at higher levels of abstraction, traceability from requirements to low-level design, and more efficient communications. They pointed out problems that should be addressed. In fact, participants claimed the need to communicate efficiently using UML diagrams. They emphasized the importance of keeping diagrams focused and as simple as possible. They also pointed out the difficulty of searching for relevant occurrences, spatial layout problems and other issues. [13] presents results of a survey of 113 software practitioners that studied the motivations for using code-centric versus modeling-centric approaches. They found out that UML is the most used notation in practice and that the quality of generated code is one of the biggest problems with modeling tools. [14] Dzidek et al. conducted a controlled experiment to assess the benefits and costs of using UML, particularly during maintenance tasks. They showed that UML diagrams helped participants carry out changes but increased the development time due to the overhead of updating the UML diagrams. [15] discusses results of a web survey with analysts that are familiar with UML. It investigates the purposes of using UML in practice, the diagrams used for each purpose and the degree of success UML has in facilitating communications within development teams. All of these works gather different kinds of data (i.e., quantitative and/or qualitative) to explore the use of UML in practice. They try to answer diverse research questions about UML that were fixed in advance, and they analyze data from different angles. However, no work has investigated the situations where practitioners need to visualize information, the information they look for and, ultimately, the use of the visual variables in practice. We therefore deem it necessary to conduct the present empirical research to further study the usefulness of the visual variables in UML.

B. The visual variables in practice
   Although some works study the impact of using the visual variables in UML, they focus only on the following two visual variables: position via layouts, and colors. The impact of the other visual variables has not been studied, despite their strong performance in reducing the cognitive load of human beings. Much research aiming at finding effective layouts has been conducted. Finding effective layouts was and still is an important topic in the software engineering field. [6] [7] [8] aim at finding effective layouts based on diagram comprehension (i.e., ask comprehension questions about diagrams with different layouts) and user preferences (i.e., ask participants to
mention their preferences on diagrams with different layouts). The use of colors has been less investigated. [17] evaluates effective layouts based on class diagram comprehension via an experiment. It uses colors to highlight information on the diagrams. The authors found out that color helped participants to answer questions. [9] uses eye tracking to evaluate the use of colors, layouts and stereotypes in the comprehension of UML class diagrams. The latter experiments showed that colors were helpful for participants. However, all of these works are quantitative studies. No qualitative research has been conducted to give a better understanding of the actual state of use of the visual variables in practice. They are all controlled experiments; they do not reflect the practices of UML users or their opinions about such usages. No exploratory field study exists in this area.

         VI. DISCUSSION AND LESSONS LEARNED
   Results of the interviews show that UML diagrams are employed in several situations (e.g., communication, drawing of understanding, analysis) using different diagrams (e.g., classes, activities, state machines). These situations involve many visualization tasks where practitioners need to search for information that is important to accomplish their work. This information might be semantic or extra-semantic. Interviews also show that colors are sometimes used in practice. Such use has been recognized as helpful by the practitioners who use them. Concerning the other visual variables (i.e., size, brightness, texture/grain and orientation), practitioners do not actually use them but deem that they might be helpful and useful in practice. However, such usefulness directly depends on the usability of modeling tools. To reinforce their claims, practitioners mention recommendations about what makes such tools effective. First, they express the need for an automatic tool that updates the visual variations when a highlighted piece of information changes or evolves. For the extra-semantic information about the progress of the implementation, for example, a class should be updated automatically when its status moves from in progress to implemented. Practitioners have also raised the need to add keys when they use colors. They pointed out that not all modeling tools offer such a feature. In that context, they suggest having an interactive legend that enables, for instance, updating the visual variations in the UML diagram directly from the keys and vice versa. They also recommended the possibility to define rules that map the information to highlight to the corresponding visual variables. Furthermore, practitioners stress the subtlety of the visual variations used. The visual variables have to be associated with particular meanings. They also stress the necessity of considering large organizations where a large number of people collaborate on the same models: the tool should handle the conflicts that might appear.
As with the interviews, results of the quantitative analysis of the UML models show that color is the most used visual variable. But, concerning the other visual variables, it shows that brightness and size are also used to highlight information. In the models repository, only 4% of the diagrams present keys and, sometimes, these are not up to date with the corresponding diagrams. The latter finding confirms the interview results about the need for an automatic tool. Existing theories like [3] state that keys are mandatory when at least one visual variation exists in a graphical representation. Keys help in reading and understanding the meanings of such variations. Indeed, when we tried to analyze the models in the repository [11], we had difficulty understanding the meanings of the visual variations applied by modelers. That might be problematic in practice when modelers want to understand diagrams containing significant visual variations. In addition, we observed that the visual variables used are applied to different parts of UML elements: border, background, text, etc. In that context, some implementations are certainly more effective than others. Ongoing research should better explore the effective ones [18].
Via both research methods, we observed that colors are mostly employed to express selective information. Based on [3], such use is effective. Selectivity is one of the perceptual properties of colors: the human eye can rapidly select groups of elements having the same color. However, we also noticed that practitioners use colors to express ordered information (e.g., the progress of the implementation of a project). Such use is not effective [3]. In fact, the human eye cannot order colors but it can spontaneously and rapidly order different levels of brightness (i.e., from dark to light and vice versa).

                        VII. CONCLUSION
   The present empirical study provides understanding about the use of UML and the visual variables in practice. Eight interviews were carried out with experts and practitioners of UML. In addition, more than 3500 UML diagrams were analyzed to discover the visual variables employed and to discuss the ways they are used. Interviews show that UML diagrams are used in different situations where practitioners need to visualize information. Results from both research methods show that color is the most used visual variable. It is applied to different parts of UML elements (borders, text, edges, compartment, etc.) to express selective information. But it is also employed to express ordered information, which is not effective according to [3]. Furthermore, keys are essential when at least one visual variation exists [3]. However, our practitioners and the analysis of the UML models show that they are not often added. This is mainly due to problems with modeling tools. First, these results might help ongoing research provide theories to effectively employ the visual variables (e.g., effective implementations, rules of efficiency to map information to the most adequate visual variable). Second, effective tools, which respect the practitioners' recommendations for instance, must be provided. We have begun developing such a tool in Papyrus [19].
In the future, other empirical studies should be held to reinforce the findings of the present research. That might be done via surveys, experiments or discussion panels. The contexts of the use of the visual variables in the models repository might be collected to further link the performed visual variations to the real situations of use.
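As an illustration of the observation that ordered information is better carried by brightness than by hue [3], and of the practitioners' request for keys that stay in step with the diagram, the following minimal Python sketch derives one fill color per ordered status by varying only the lightness of a single hue, and produces the matching key from the same mapping. The status names, the function names (`brightness_scale`, `lightness_of`, `build_key`) and the chosen hue and lightness range are assumptions made for this example; they come neither from the interviews nor from any modeling tool.

```python
import colorsys


def brightness_scale(levels, hue=0.6, saturation=0.5, lightest=0.90, darkest=0.35):
    """Return {level: "#rrggbb"} using one hue and decreasing lightness.

    `levels` must be listed from lowest to highest rank; the first level
    receives the lightest fill and the last level the darkest one.
    """
    palette = {}
    steps = max(len(levels) - 1, 1)  # avoid dividing by zero for a single level
    for rank, level in enumerate(levels):
        lightness = lightest + (darkest - lightest) * rank / steps
        red, green, blue = colorsys.hls_to_rgb(hue, lightness, saturation)
        palette[level] = "#{:02x}{:02x}{:02x}".format(
            round(red * 255), round(green * 255), round(blue * 255)
        )
    return palette


def lightness_of(hex_color):
    """Recover the HLS lightness (0.0 to 1.0) of a "#rrggbb" color."""
    red, green, blue = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return colorsys.rgb_to_hls(red, green, blue)[1]


def build_key(palette):
    """Produce the lines of a key (legend), one per level, in rank order."""
    return ["{}  {}".format(color, level) for level, color in palette.items()]


# Hypothetical ordered statuses, from least to most advanced.
STATUSES = ["not started", "in progress", "implemented"]

palette = brightness_scale(STATUSES)
for line in build_key(palette):
    print(line)
```

Because the key is generated from the same mapping as the fills, it cannot drift out of date with the diagram, and a black and white printout still preserves the order of the statuses since only lightness varies.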
                        ACKNOWLEDGEMENT
  We gratefully thank all the practitioners who agreed to
participate in the interviews.
                             REFERENCES
 [1] “Object Management Group,” http://www.omg.org/.
 [2] “UML specification,” http://www.omg.org/spec/uml/.
 [3] J. Bertin, “Semiology of graphics: diagrams, networks, maps,” 1983.
 [4] T. R. G. Green and M. Petre, “Usability analysis of visual programming
     environments: a cognitive dimensions framework,” Journal of Visual
     Languages & Computing, vol. 7, no. 2, pp. 131–174, 1996.
 [5] D. L. Moody, “The physics of notations: toward a scientific basis
     for constructing visual notations in software engineering,” Software
     Engineering, IEEE Transactions on, vol. 35, no. 6, pp. 756–779, 2009.
 [6] K. Wong and D. Sun, “On evaluating the layout of UML diagrams for
     program comprehension,” Software Quality Journal, vol. 14, no. 3, pp.
     233–259, 2006.
 [7] B. Sharif and J. I. Maletic, “An empirical study on the comprehension of
     stereotyped UML class diagram layouts,” in Program Comprehension,
     2009. ICPC’09. IEEE 17th International Conference on. IEEE, 2009,
     pp. 268–272.
 [8] H. C. Purchase, L. Colpoys, D. Carrington, and M. McGill, “UML
     class diagrams: an empirical study of comprehension,” in Software
     Visualization. Springer, 2003, pp. 149–178.
 [9] S. Yusuf, H. Kagdi, and J. I. Maletic, “Assessing the comprehension
     of UML class diagrams via eye tracking,” in 15th IEEE International
     Conference on Program Comprehension (ICPC’07). IEEE, 2007, pp.
     113–122.
[10] R. Hebig, T. Ho-Quang, G. Robles, M. Fernandez, and M. R. V.
     Chaudron, “The quest for open source projects that use UML: mining
     GitHub,” in Proceedings of the ACM/IEEE 19th International Conference
     on Model Driven Engineering Languages and Systems. ACM, 2016,
     pp. 173–183.
[11] “UML repository,” http://oss.models-db.com/.
[12] F. Shull, J. Singer, and D. I. Sjøberg, Guide to advanced empirical
     software engineering. Springer, 2007.
[13] A. Forward, T. C. Lethbridge, and O. Badreddin, “Perceptions of
     Software Modeling: A Survey of Software Practitioners,” University of
     Ottawa, Tech. Rep., 2010.
[14] W. J. Dzidek, E. Arisholm, and L. C. Briand, “A realistic empirical
     evaluation of the costs and benefits of UML in software maintenance,”
     Software Engineering, IEEE Transactions on, vol. 34, no. 3, pp. 407–
     432, 2008.
[15] B. Dobing and J. Parsons, “How UML is used,” Commun. ACM,
     vol. 49, no. 5, pp. 109–113, May 2006. [Online]. Available:
     http://doi.acm.org/10.1145/1125944.1125949
[16] M. R. V. Chaudron, W. Heijstek, and A. Nugroho, “How effective is UML
     modeling?” Software & Systems Modeling, vol. 11, no. 4, pp. 571–580,
     2012.
[17] O. Andriyevska, N. Dragan, B. Simoes, and J. I. Maletic, “Evaluating
     UML class diagram layout based on architectural importance,” in
     Visualizing Software for Understanding and Analysis, 2005. VISSOFT 2005.
     3rd IEEE International Workshop on. IEEE, 2005, pp. 1–6.
[18] Y. El Ahmar, X. Le Pallec, and S. Gérard, “Empirical activity: Assessing
     the perceptual properties of the size visual variation in UML sequence
     diagram.”
[19] S. Gérard, C. Dumoulin, P. Tessier, and B. Selic, “Papyrus: A
     UML2 tool for domain-specific language modeling,” in Model-Based
     Engineering of Embedded Real-Time Systems. Springer, 2010, pp. 361–
     368.