=Paper=
{{Paper
|id=Vol-1805/Elahmar2016HuFaMo
|storemode=property
|title=Empirical Activity: Assessing the Perceptual Properties of the Size Visual Variation in UML Sequence Diagram
|pdfUrl=https://ceur-ws.org/Vol-1805/Elahmar2016HuFaMo.pdf
|volume=Vol-1805
|authors=Yosser El Ahmar,Xavier Le Pallec,Sébastien Gérard
|dblpUrl=https://dblp.org/rec/conf/models/AhmarPG16
}}
==Empirical Activity: Assessing the Perceptual Properties of the Size Visual Variation in UML Sequence Diagram==
* Yosser El Ahmar, CEA, LIST, Laboratory of Model Driven Engineering for Embedded Systems, P.C. 174, Gif-sur-Yvette, 91191, France, yosser.ELAHMAR@cea.fr
* Xavier Le Pallec, University of Lille, CRIStAL Lab UMR 9189, 59650 Villeneuve d’Ascq, France, xavier.le-pallec@univ-lille1.fr
* Sébastien Gérard, CEA, LIST, Laboratory of Model Driven Engineering for Embedded Systems, P.C. 174, Gif-sur-Yvette, 91191, France, Sebastien.GERARD@cea.fr

===ABSTRACT===
Recent empirical studies about UML showed that software practitioners often use it to communicate. When they use diagram(s) during a meeting with clients/users or during an informal discussion with their architect, they may want to highlight some elements to synchronise the visual support with their discourse. To that end, they are free to use color, size, brightness, grain and/or orientation. This freedom is due to the lack of formal specifications of their use in the UML standard and refers to what the Cognitive Dimensions framework calls the secondary notation. According to the Semiology of Graphics (SoG), one of the main references in cartography, each means of visual annotation is characterized by its perceptual properties.

Being under the modeler’s control, the 5 means of visual annotation can be applied in different ways to UML graphic components: to the border, text, background and to the other related graphic nodes. In that context, the goal of this research is to study the effective implementations, which maintain the perceptual properties of, especially, the size visual variation. The latter has been chosen because it is considered the "strongest" among the visual means, having all the perceptual properties.

The present proposal consists of a quantitative methodology using an experiment as strategy of inquiry. The participants will be the ~20 attendees of the HuFaMo workshop. They must be experts on modeling and know UML. The treatment is the reading and the visual extraction of information from a set of UML sequence diagrams, provided via a web application. The dependent variables we study are the responses and the response times of participants, which will be validated based on the SoG principles.

===CCS Concepts===
• Software Engineering → UML modeling; • Software visualization → Semiology of Graphics;

===Keywords===
UML, Secondary notation, Size visual variable, Empirical activity.

===1. INTRODUCTION===
The Unified Modeling Language (UML) is the visual language for specifying, constructing and documenting software-intensive systems. Recent empirical studies about UML in practice [14] [4] showed that UML artefacts are mostly used for communication. Stakeholders of these communications might be familiar with UML (e.g. members of the technical team) or not (e.g. clients) [9]. In such situations, modelers may need to highlight information that they deem relevant for the discussion (e.g. the main class in a class diagram; model, view and controller elements; a modeler’s own subsystem; distribution of tasks between technical members; project progression). This is to synchronize the visual support with their discourse.

In that context, while the UML specification describes exhaustively its primary notation and its semantics, it lacks highlighting abilities for such contextual information. The secondary notation, defined by the Cognitive Dimensions framework [11], may deal with such concerns. It refers to the free use and change of the possible means of visual annotation: size, brightness, grain, color and orientation. These five means of visual annotation are relatively rapidly perceived because the reader’s eye can detect their variation without moving the visual brush. According to the Semiology of Graphics (SoG), one of the main references in cartography, each means of visual annotation is characterized by its perceptual properties. It can be selective: it allows readers to distinguish groupings (e.g. all green marks); ordered: it allows readers to perceptually order marks (e.g. from dark to light or from light to dark, but never in another order); and/or quantitative: it allows readers to visually quantify the ratio between marks (e.g. three times larger).

In UML, as the means of visual annotation are under the modeler’s control, there exist different ways to vary their values in a UML graphic component (graphic node or graphic path). The combinatorial explosion of the possible implementations is due to four reasons. First, UML graphic nodes mostly include a border, a text and a background. Second, some UML graphic nodes are composed of multiple shapes (e.g. a lifeline is composed of 3 components: a rectangle, a dashed line and sometimes an execution specification). Then, graphic nodes might be related to other nodes via graphic paths, forming the diagram. Finally, a UML graphic component might contain or be contained in other graphic nodes (e.g. a fragment in a sequence diagram can contain one or more messages).

It may seem obvious that some implementations of variations are more effective in highlighting elements than others. But what we can gain in effectiveness might be anecdotal. To be sure whether or not there exist implementations that are more effective, we have to draw up an exhaustive list of implementations and test them.
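To make the idea of an exhaustive list concrete, here is a minimal sketch for a simple graphic node. It assumes, purely for illustration, that a candidate implementation is any non-empty subset of the node's parts (border, text, background) whose size is varied; the names `NODE_PARTS` and `candidate_implementations` are hypothetical and this is not the authors' own enumeration.

```python
from itertools import combinations

# Parts named in the text for a simple UML graphic node.
NODE_PARTS = ("border", "text", "background")


def candidate_implementations(parts):
    """Return every non-empty subset of `parts` whose size could be varied.

    Assumption: one candidate implementation = one subset of parts to resize.
    """
    subsets = []
    for k in range(1, len(parts) + 1):
        subsets.extend(combinations(parts, k))
    return subsets


if __name__ == "__main__":
    for subset in candidate_implementations(NODE_PARTS):
        print(" + ".join(subset))
```

Under this assumption a three-part node already yields seven candidates, and the same function can be called on the three shapes of a lifeline (rectangle, dashed line, execution specification). Relations between nodes and containment multiply these numbers further, which is the combinatorial explosion described above.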
Drawing up such a list means that we have to rigorously decompose all UML graphic components and see, for each sub-element, whether the value of a means of visual annotation can vary and how. Consequently, the purpose of this research is to study the effective implementations, which allow viewers to fully benefit from the performance of a means of visual annotation. This is a purpose for which the number of related works is small. As the field of study is wide, we propose to focus here only on the variation of the size means of visual annotation and on one type of diagram: the UML sequence diagram. The size visual variation has been selected in this study because it is the only means of visual annotation, belonging to the UML secondary notation, which has all the perceptual properties. In addition, we propose to target the UML sequence diagram in particular because it belongs to the three most used UML diagrams in practice.

For each type of graphic component, whether composed of multiple shapes or a component itself, container of other graphic nodes or contained in other graphic nodes, this research aims at finding patterns of effective implementations of the size visual variation. In this study, we assume that these patterns depend on the information to be highlighted. It can concern only one graphic component (e.g. a lifeline) or more than one (e.g. two or more lifelines). In the first case, the size variation will surely highlight the concerned graphic component [5]. But we aim at finding the effective implementations, which allow viewers to relatively rapidly perceive all significant details about the concerned graphic component. In the second case, the size visual means is selective, ordered and quantitative [5]. Here, we want to find the effective implementations which keep the selective, ordered and quantitative perceptive attitudes of its variation valid.

To that end, we want to study the impact of the possible implementations on the perceptual properties of the size visual means. This impact will be controlled by the size of the UML diagram containing the implementation. It can be small, medium or large. The studied impact will also be controlled by the layout of the diagram. We will especially focus on the horizontal and vertical distance between the related graphic components.

This paper presents a proposal of a quantitative methodology using an experiment as strategy of inquiry. The participants will be the ~20 attendees of the HuFaMo workshop. They must be experts on modeling and know UML. The treatment is the reading and the visual extraction of information from a set of provided UML sequence diagrams, via a web application. The outcome variables we study are responses and response times of participants, which will be validated based on the SoG principles.

===2. EXPERIMENT DEFINITION===
This section reports on the delimitation of the study, the research questions that it attempts to answer and its hypotheses.

====2.1 Delimitation of the study====
Figure 1 summarizes the delimitation of the experiment. Following are justifications for each choice.

Figure 1: Delimitation of the study.

Size, brightness, grain, color and orientation represent powerful means to highlight information, to make it relatively rapidly perceived in a third dimension [5]. Each means of visual annotation is characterized by its perceptual properties. The SoG distinguishes three perceptive attitudes that viewers can take in front of a means of visual annotation.

* Selectivity: the reader can perceive groupings (e.g. all red colors, all marks having the same size).
* Order: the human eye can perceive order (e.g. from dark to light, from the smallest mark to the biggest).
* Quantity: the viewer can perceive the ratio between marks (e.g. this mark is 5 times bigger than another).

Size is the only means of visual annotation allowing the three perceptive attitudes. To benefit from its interesting performance, we made the choice to begin by studying its effective implementation in UML. We chose to be limited to three categories of size. This number can be extended to more than three categories in a future empirical study. We argue that exceeding three categories of size in UML diagrams will overload the diagram, especially if it contains a lot of graphic components (i.e. a large diagram). In addition, we note that sizes of graphic nodes depend on the contained text (e.g. the width of a UML class varies depending on the length of its name, its attributes or its methods). Therefore, we will assume that all graphic nodes in a diagram have the same initial size (i.e. the size of the biggest node, containing the largest text).

According to [9][8], the sequence diagram is ranked among the three most frequently used UML diagrams in practice. It is mostly used for clarifying understanding among technical members of the project team [8]. In such informal meetings, highlighting information might be promising to ease the communication [10]. In addition, contrary to class diagrams, we note a lack of works in the literature studying the effective visualization of sequence diagrams. Those are the main reasons behind the specific choice to begin with the UML sequence diagram.

In practice, the graphic nodes will be connected to each other, forming the diagram. The resulting diagram can be small, medium or large. We chose to cover all 3 alternatives in the present study.

The graphic notation of the sequence diagram is described by 11 graphic nodes and 4 graphic paths [1] (p. 594-596). As we chose to exhaustively study the UML sequence diagram, we will take into account all of them in the present experiment.

We observe that information to highlight might concern only one graphic component (e.g. one message, one lifeline, one coregion). It can also concern more than one graphic component (e.g. multiple lifelines, multiple coregions, multiple execution specifications). The present study will cover both alternatives.

Finally, we observed that the distance between related graphic components can vary in two directions, horizontally and vertically for instance. We will experiment with both possibilities.

====2.2 Research questions====
After delimiting the study, we define the research questions for the resulting scope. We observe that, for a single graphic component of the sequence diagram, there are different possible implementations of the size means of visual annotation. This is due to the following facts.

UML graphic nodes are mostly made of a border, a background and a text. Changing only the area of a node can be seen as obvious, but we want to explore the effectiveness of varying the size of its border and text also. Moreover, some graphic components include multiple shapes. Lifelines include a rectangle and a dashed line. LostMessages and FoundMessages include an edge and a black point at the extremity. Varying the size of such graphic components might consist of changing the size of all their elementary shapes or some of them. We wonder about the most effective implementation.

In addition, some graphic nodes can be embedded in other graphic nodes. An execution specification, a coregion, a DurationConstraint, a DurationObservation and a StateInvariant are always embedded in a lifeline. Continuations might be embedded in more than one lifeline. Changing their size can affect the size of the graphic nodes in which they are embedded. We want to infer the most effective implementation.

Furthermore, graphic components are semantically linked to each other. Lifelines are linked via graphic paths, graphic paths having source and destination graphic nodes. Highlighting them with the size variation might mean highlighting their semantically related graphic components also.

Finally, some graphic nodes may contain other graphic components. A Frame, an InteractionUse, a CombinedFragment and a coregion can contain execution specifications and messages. They may also contain each other. Applying the size to such graphic nodes might concern the other contained graphic nodes and vice versa. As a result, the following research questions arise.

RQ1: What are the effective implementations of the size visual variation for all types of graphic components of the UML sequence diagram (i.e. container, contained, embedded in a graphic node, complex graphic node (composed of multiple shapes))?

Here, effectiveness can be measured by the capability of each implementation to preserve all the perceptual properties of the size, allowing viewers to relatively rapidly detect the accurate information that they are searching for.

RQ2: How can the effectiveness of each implementation be controlled by the type of information to highlight (i.e. concerns only one graphic component, more than one component)?

RQ3: How can the effectiveness of each implementation be controlled by the size of the diagram containing the implementation and its layout?

====2.3 Hypothesis formulation====

=====2.3.1 Variables=====
The experiment has 4 independent variables and two dependent variables.

Independent variables
* Implementation I (alternatives: Effective Implementation I, Other Implementation I’).
* Size of the sequence diagram S (alternatives: small, medium, large).
* Its layout L (alternatives: Horizontal distance HD, Vertical distance VD).
* Type of information to highlight TI (alternatives: concerns only one graphic component TI1, more than one graphic component TIn).

Dependent variables
* Responses of participants R (alternatives: true, false, complete, incomplete).
* Response time of participants T.

=====2.3.2 Hypothesis=====
{| class="wikitable"
|+ Table 1: Hypothesis
! Dependent variables !! Null hypothesis !! Alternative hypothesis
|-
| Response time T || ∀ (S, TI, L); H0: T(I) > T(I’) || ∀ (S, TI, L); H1: T(I) < T(I’)
|-
| Response R || ∀ (S, TI, L); H0: T(I) > T(I’) || ∀ (S, TI, L); H1: T(I) < T(I’)
|}

The hypotheses for assessing the effectiveness of the size variations I with the independent variables are given in Table 1. The alternative hypothesis H1 states that the proposed effective implementations take less time to let participants give the right and complete answer to a given question. The experimented effective implementation I is proposed for each possible combination of (S, TI, L). Figures in the appendices illustrate the different implementations that we deem effective and that the experiment aims at validating. They also illustrate an example of a question that concerns one graphic path (a message) with different implementations.

===3. EXPERIMENT DESIGN===

====3.1 Population, sample, and participants====
The sampling method used in this study is convenience sampling [6]. The target population of this study is the community of UML users: practitioners, researchers, students. The HuFaMo attendees are a naturally formed group and might be a representative sample of the target population. They include students, researchers, UML practitioners and maybe some tool vendors. They are a part of the MoDELS community, interested in modeling and/or contributors to MDE. We assume that we will have ~20 participants, considered as experts on UML.

====3.2 Data collection and materials====
A web application will be used in the present experiment. This avoids the complexity of modeling tools (i.e. not all participants are familiar with the same modeling tool). Moreover, installing the same modeling tool for all participants would be time consuming, especially in the workshop (same timeslot as a presentation). If accepted, the web application will be developed between the acceptance notification and the workshop date. It will be coded by the first author and tested before its use in the experiment.
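As one way to picture what such a web application has to serve, the independent variables of Section 2.3.1 can be crossed into a grid of conditions. The sketch below is a hedged illustration only: the constant and function names are hypothetical, and pairing every (S, TI, L) condition with both I and I’ is an assumption about the design, not something the proposal states in code.

```python
from itertools import product

# Levels taken from Section 2.3.1; the variable names are illustrative.
SIZES = ("small", "medium", "large")   # S: size of the sequence diagram
INFO_TYPES = ("TI1", "TIn")            # TI: one component vs. several
LAYOUTS = ("HD", "VD")                 # L: horizontal vs. vertical distance
IMPLEMENTATIONS = ("I", "I'")          # I: effective vs. other implementation


def conditions():
    """Return every (S, TI, L) combination as a dictionary."""
    return [
        {"S": s, "TI": ti, "L": layout}
        for s, ti, layout in product(SIZES, INFO_TYPES, LAYOUTS)
    ]


def trials():
    """Pair each condition with each implementation (assumed design)."""
    return [dict(cond, I=impl) for cond in conditions() for impl in IMPLEMENTATIONS]


if __name__ == "__main__":
    print(len(conditions()), "conditions,", len(trials()), "trials per graphic component")
```

Keeping the grid as plain data makes it easy to attach one annotated diagram and one question to each entry, and later to group recorded responses and response times by condition.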
The web application will first ask participants about their gender, level of experience and whether they have any visual deficiencies. Then, it will display a question on a white page. After reading and understanding it, the participant can click a button to switch to the next page. A sequence diagram, visually annotated with an implementation of the size variation, will appear along with its corresponding question at the bottom. In parallel, the application will start a time counter. When the participant has found the response to the question, he can click a button to navigate to another white page (without the sequence diagram), where he will be able to enter his response. At the same time, the application will stop the timer and save the time spent to answer. It will also save the corresponding response entered by the participant. Sequence diagrams that will be used in the experiment will be extracted from a models repository [12] [2]. Visual annotations using implementations of the size variations and questions will be manually proposed.

=====3.2.1 Method=====
One day before the experiment, the HuFaMo participants will receive an e-mail requesting them to bring their laptops. The first author will ensure the availability of an internet connection during the experiment day. The experiment will begin with an introduction phase and a training session related to the experimental task. The first author will present the web application that will be used in the experiment, for which details are given in the previous subsection. Then, the link to the web application will be sent to the HuFaMo attendees via the workshop mailing list. The second step consists of the experiment’s task. It will be performed individually by each participant. The main treatment will consist of the reading and the visual extraction of information from a visually annotated sequence diagram. The estimated time for the whole experiment is 30 minutes.

====3.3 Data analysis procedures====
In the analysis procedure, we will report on the number of HuFaMo attendees who did not participate in the study. We also plan to give a descriptive analysis of data for all independent and dependent variables of the study. At the end of the experiment, we want to analyse the relationship between the independent and dependent variables. This is to find patterns of effective implementations, depending on a combination of (S, TI, L). For each combination of the three independent variables, we will determine the effective implementation I, which has a minimum T and complete and true values of R. Therefore, we select correlation/regression statistical tests.

===4. ANTICIPATED ETHICAL ISSUES IN THE STUDY===
This section reports the internal and external threats to validity.

====4.1 Internal validity====
The first internal threat to validity is the possible gain of maturity by the participants during the study. That may happen because only one type of UML diagram (sequence diagrams) and only one visual variation are studied. Therefore, we will ensure that diagrams are proposed in random order so that questions concerning the same graphic component will not be successive.

In addition, as mentioned before, participants might have some visual deficiencies. This additional input will be collected before beginning the task, so that we can take into account its influence on the results. We also note that each participant will have a different screen with different characteristics. We will ensure that at least the same luminosity value is set and that the same web browser is used to open the web application. Finally, one of the outcomes of the study is the response time of participants. It is automatically saved when the participant finds the response and clicks a button. Clicking the button late will bias the results. We will stress the importance of this step to participants in the introduction phase. We will also try to add a voice recorder, so that participants can speak out loud when finding the response. Then, we will have to find mechanisms to manage the simultaneous voices of participants placed in the same setting.

====4.2 External validity====
The HuFaMo participants are not only experts on modeling but also interested in Human Factors in Modeling. So, they may know about the scope of this research, especially the perceptual properties of the means of visual annotation, which can bias the study. To limit this threat to validity, we will not inform them about the research questions of the study nor its hypotheses. Moreover, the participants are not in a natural setting, using their own modeling tool and moving naturally to their UML sequence diagrams. As a result, we will perform a further empirical study (e.g. a case study in a natural setting) in order to be sure that the obtained result can be generalized to the whole population.

===5. LITERATURE REVIEW===
The free use of additional means of visual annotation in software engineering has been recognized as theoretically advantageous, via the secondary notation of the Cognitive Dimensions framework [11]. A few empirical studies aiming at assessing its benefits in the UML visual notation have been conducted. However, while they considered the need for empirical validation, they focus only on two axes: layouts and colors. The other means of visual annotation (i.e. size, brightness, grain and orientation) have not yet been discussed, despite their great performance in highlighting information, known in cartography [5] and psychology [13] [18].

Concerning layouts, there exist several empirical studies aiming at finding effective layouts in UML diagrams. [19] [16] [15] use experiments to assess effective layouts for diagram comprehension, user preferences, program understanding, etc. [20] uses eye tracking in an experiment involving 12 participants to identify the impact of layout, color and stereotypes on the comprehension of UML diagrams. Most of the mentioned works [16] [15] [20] focus on the UML class diagram. [17] focuses further on the UML activity diagram and use case diagram. While the sequence diagram belongs to the three most used UML artefacts in practice, we note few works on it [7] [3].

===6. REFERENCES===
* [1] Object management group. http://www.omg.org/.
* [2] UML repository. www.models-db.com.
* [3] S. Abrahao, C. Gravino, E. Insfran, G. Scanniello, and G. Tortora. Assessing the effectiveness of sequence diagrams in the comprehension of functional requirements: Results from a family of five experiments. IEEE Transactions on Software Engineering, 39(3):327–342, 2013.
* [4] S. Baltes and S. Diehl. Sketches and diagrams in practice. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pages 530–541, New York, NY, USA, 2014. ACM.
* [5] J. Bertin. Semiology of graphics: diagrams, networks, maps. 1983.
* [6] J. W. Creswell. Research design: Qualitative, quantitative, and mixed methods approaches. Sage publications, 2013.
* [7] J. A. Cruz-Lemus, M. Genero, D. Caivano, S. Abrahão, E. Insfrán, and J. A. Carsí. Assessing the influence of stereotypes on the comprehension of UML sequence diagrams: A family of experiments. Information and Software Technology, 53(12):1391–1403, 2011.
* [8] B. Dobing and J. Parsons. How UML is used. Commun. ACM, 49(5):109–113, May 2006.
* [9] W. J. Dzidek, E. Arisholm, and L. C. Briand. A realistic empirical evaluation of the costs and benefits of UML in software maintenance. Software Engineering, IEEE Transactions on, 34(3):407–432, 2008.
* [10] Y. El Ahmar, S. Gérard, C. Dumoulin, and X. Le Pallec. Enhancing the communication value of UML models with graphical layers. In Model Driven Engineering Languages and Systems (MODELS), 2015 ACM/IEEE 18th International Conference on, pages 64–69. IEEE, 2015.
* [11] T. R. G. Green and M. Petre. Usability analysis of visual programming environments: a cognitive dimensions framework. Journal of Visual Languages & Computing, 7(2):131–174, 1996.
* [12] B. Karasneh and M. R. Chaudron. Online img2UML repository: An online repository for UML models. In EESSMOD@MoDELS, pages 61–66, 2013.
* [13] K. Koffka. Principles of Gestalt psychology, volume 44. Routledge, 2013.
* [14] M. Petre. UML in practice. In Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, pages 722–731, Piscataway, NJ, USA, 2013. IEEE Press.
* [15] H. C. Purchase, L. Colpoys, D. Carrington, and M. McGill. UML class diagrams: an empirical study of comprehension. In Software Visualization, pages 149–178. Springer, 2003.
* [16] B. Sharif and J. I. Maletic. An empirical study on the comprehension of stereotyped UML class diagram layouts. In Program Comprehension, 2009. ICPC’09. IEEE 17th International Conference on, pages 268–272. IEEE, 2009.
* [17] H. Störrle, N. Baltsen, H. Christoffersen, and A. Maier. On the impact of diagram layout: How are models actually read? In International Conference on Model Driven Engineering Languages and Systems (MoDELS) 2014, pages 31–35, 2014.
* [18] A. Treisman. Preattentive processing in vision. Computer vision, graphics, and image processing, 31(2):156–177, 1985.
* [19] K. Wong and D. Sun. On evaluating the layout of UML diagrams for program comprehension. Software Quality Journal, 14(3):233–259, 2006.
* [20] S. Yusuf, H. Kagdi, and J. I. Maletic. Assessing the comprehension of UML class diagrams via eye tracking. In 15th IEEE International Conference on Program Comprehension (ICPC’07), pages 113–122. IEEE, 2007.

===APPENDIX===
Considering all independent variables, 12 sequence diagrams are required for each implementation of the size variation. We argue that at least two diagrams are needed for each implementation. Therefore, for all 14 graphic components of the UML sequence diagram, at least 336 diagrams are required for this study.

====A. EFFECTIVE IMPLEMENTATIONS AND AN EXAMPLE OF A QUESTION WITH DIFFERENT IMPLEMENTATIONS====
Figure 2: Effective implementation of a "lifeline" I, (TI=TI1, T=S)

Figure 3: Effective implementations of "message" I, (TI=TI1, T=S)

Figure 4: Effective implementations of a "fragment" I, (TI=TI1, T=S)

Figure 5: Response to the question: What happens if the controller sends first hello? with the implementation I

Figure 6: Response to the question: What happens if the controller sends first hello? with an implementation I’

Figure 7: Response to the question: What happens if the controller sends first hello? with an implementation I’

Figure 8: Response to the question: What happens if the controller sends first hello? with an implementation I’
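The diagram count stated at the start of the appendix can be retraced as a short worked calculation. This is one reading that reproduces the stated figures, not a derivation given by the authors: it assumes the 12 diagrams per implementation come from crossing the levels of S (3), TI (2) and L (2), and that the "at least two" contributes a factor of 2 per graphic component.

```latex
\[
|S| \times |TI| \times |L| = 3 \times 2 \times 2 = 12,
\qquad
12 \times 2 \times 14 = 336 .
\]
```

Under that reading, 336 is a lower bound for the 14 graphic components mentioned in the appendix; any additional alternative implementation I’ per component would raise it.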