=Paper=
{{Paper
|id=Vol-1522/Guana2015HuFaMo
|storemode=property
|title=How Do Developers Solve Software-engineering Tasks on Model-based Code Generators? An Empirical Study Design
|pdfUrl=https://ceur-ws.org/Vol-1522/Guana2015HuFaMo.pdf
|volume=Vol-1522
|dblpUrl=https://dblp.org/rec/conf/models/GuanaS15
}}
==How Do Developers Solve Software-engineering Tasks on Model-based Code Generators? An Empirical Study Design==
Victor Guana and Eleni Stroulia, Department of Computing Science, University of Alberta, Edmonton, AB, Canada. {guana, stroulia}@ualberta.ca

===Abstract===
Model-based code generators are complex in nature; they are built using a variety of tools such as language workbenches, and model-to-model and model-to-text transformation languages. Due to the highly heterogeneous technology ecosystem in which code generators are built, understanding and maintaining their architecture poses numerous cognitive challenges to both novice and expert developers. Most of these challenges are associated with tasks that require tracing and pinpointing generation artifacts given a life-cycle requirement. We argue that such tasks can be classified in three general categories: (a) information discovery, (b) information summarization, and (c) information filtering and isolation. Furthermore, we hypothesize that visualizations that enable the interactive exploration of model-to-model and model-to-text transformation compositions can significantly improve developers' performance when reflecting on a code-generation architecture and its corresponding execution mechanics. In this paper we describe an empirical study conceived (a) to understand the performance of developers (in terms of time and precision) when asked to discover, filter, and summarize information about a model-based code generator, using classic integrated development environments and editors, and (b) to measure and compare the developers' effectiveness on the same tasks using state-of-the-art traceability visualizations for model-transformation compositions.

===I. Introduction===
Model-based code generation refers to a software-engineering methodology for building systems that systematically differ from each other [1][2]. In effect, code generators are frameworks for building applications from code semantics that have been engineered for reuse. However, code generators can be difficult to understand since they are typically composed of numerous elements, whose complex interdependencies pose cognitive challenges for developers performing design, implementation, and maintenance tasks [3][4].

Model-based code generators integrate rule-based model-to-model transformation languages (such as ATL [5] and ETL [6]) and template-based model-to-text transformation languages (such as Acceleo [7]) to translate high-level system specifications into executable code and scripts [8][9]. At the core of a model-based code generator, model-to-model and model-to-text transformations are composed in so-called model-transformation chains (MTCs) [10].

Given the complexity and heterogeneity of the technologies involved in a code generator, developers who are trying to inspect and understand the code-generation process have to deal with numerous different artifacts. As a concrete example, in a code-generator maintenance scenario, a developer might need to find all the chained model-to-model and model-to-text transformation bindings that originate a buggy line of code in order to fix it [11]. This task is error prone, if not virtually impossible, when done manually. We believe that flexible traceability tools are needed to collect and visualize information about the architecture and operational mechanics of code generators, to reduce the cognitive challenges that developers face during their life-cycle.

With the purpose of tackling this challenge, we have developed ChainTracker [12][13], a tool that enables developers to better understand how model-based code generators are built, using interactive traceability visualizations and code projections. ChainTracker gathers and visualizes model-to-model and model-to-text traceability information for ATL and Acceleo model-transformation compositions (Figure 1).

Fig. 1. ChainTracker's Main Screen. (1) Transformation-composition Branch View; (2) Transformation Script View; (3) Binding Information Tables; (4.a) Filtering and Visualization Options; (4.b) Context-dependent menu to isolate artifacts related to metamodel elements, or generated textual sources.

Whether at design or maintenance time, developers are constantly trying to solve software-engineering tasks on model-based code generators. We argue that such tasks can be classified in three categories: (a) information discovery, (b) information summarization, and (c) information filtering and isolation. In this paper we describe an empirical study conceived (a) to understand the performance of developers when asked to discover, filter, and summarize information about a model-based code generator, using classic integrated development environments and editors, and (b) to measure and compare the performance of developers executing the same set of tasks using state-of-the-art visualizations for code-generator traceability information.

This study will enable us to analyze the performance of developers when reflecting on a model-based code generator to achieve various software-engineering goals. Furthermore, it will increase our understanding of how advanced development environments and traceability visualization tools, such as ChainTracker, can help developers to design, study and maintain a code generator. This study includes a comparative analysis of developers' performance in answering questions with ChainTracker versus using existing code editors (i.e. Eclipse). In this study, developers' performance is understood in terms of the time taken to answer a question and the correctness of their answers.

In this paper, we first present our study research questions and hypotheses. Second, we describe the study subject systems. Third, we present a detailed description of the families of tasks developers will solve in the study, including question templates that can be reused by the community. Finally, we introduce the protocol of the study, its expected threats to validity, and expected contributions.

===II. Research Questions and Hypotheses===
In the life-cycle of a code generator, developers ask multiple questions to optimize and maintain its infrastructure. Particularly, once a code generator has been built, developers face multiple scenarios of evolution [14]. The two most important among them are metamodel evolution and platform evolution. In metamodel evolution, changes are needed to the domain-specific language that interfaces with the end user, or to the intermediate metamodels that modularize the code-generation process, in order to improve the language expressiveness. In platform evolution, the generated code needs to be refined with different purposes, such as fixing a bug or optimizing the performance of a generated codebase. In the latter scenario, the generator's model-to-text and, in some cases, model-to-model transformations need to be modified in order to reflect such refinements in a systematic way. Indeed, evolutionary scenarios in model-based code generators motivate questions about their architecture and execution mechanics, such as:

* Where does this generated feature come from?
* What chained generation artifacts would be affected if a model element were removed or modified?
* What is the coverage of the transformation rules in each stage of my generation process?
* What portions of code have evolved in the generated codebases? And, assuming that code changes should indeed be included in future generation instances, what elements of the underlying models and transformations should be revised?

To answer these questions, developers need to have a thorough understanding of the generation architecture. In this study we hypothesize that the interactive exploration of model-to-model and model-to-text transformation scripts can significantly improve developers' performance when reflecting on a code-generator architecture. Furthermore, we believe that the tasks that developers perform when reflecting on the design and execution mechanics of a generator can be classified in three general categories. Let us now briefly discuss each one of them.

1. Information Discovery Tasks: The developer's intent when performing this family of tasks is to explore the code generator to identify its major components, and to understand how the underlying transformation scripts are organized from a static point of view. This type of task involves locating individual elements of the code generator's architecture, i.e., individual metamodel elements, transformation rules, and collections of transformation bindings, inside the generator's source scripts. These tasks are commonly performed when a developer is dealing with legacy code generators that need to be reused or optimized.

2. Information Summarization Tasks: The purpose of these tasks is for developers to measure generic information of the code-generation architecture, such as to quantify the coverage of a model transformation, or to measure the size of its metamodels. Summarizing information about the code generator allows developers to assess, and potentially improve, its overall design and correctness [15].

3. Information Filtering and Isolation Tasks: These tasks are generally performed when developers are assessing the impact of platform-evolution scenarios. They involve tracing and isolating elements of the code-generation architecture from a dynamic perspective, in order to find dependency relationships between metamodel elements, metamodel attributes, transformation bindings, and generated pieces of code.

Considering the above types of tasks that we believe developers solve when answering questions about a model-based code generator, we intend to investigate two research questions:

* Q1: How do developers approach the process of answering questions that involve the discovery, filtering, and summarization of artifacts that constitute a code generator?
* Q2: Do developers answer questions more accurately when solving tasks that involve information discovery, filtering, and summarization using the interactive traceability visualizations provided by ChainTracker?

On the basis of the above questions we outlined two corresponding null hypotheses.

* H<sub>Q1</sub>: Developers spend an equal amount of time when solving tasks that involve information discovery, filtering, and summarization of a model-based code generator using ChainTracker as they do using Eclipse editors.
* H<sub>Q2</sub>: Developers provide equally accurate answers, in terms of task solution correctness, using ChainTracker as they do using Eclipse editors.

===III. Subject Systems===
The subject systems of our study are two model-based code generators developed in our research laboratory: PhyDSL (System Subject 1) and ScreenFlow (System Subject 2).

PhyDSL [16][17] is a model-based code generator for mobile physics-based 2D games (see Figure 2). It is built on a textual domain-specific language and a multi-branched model-transformation composition that includes four model-to-model transformations implemented using ATL, and four template-based model-to-text transformations written in Acceleo. PhyDSL is currently used to create cost-effective and fully-featured mobile games with rehabilitation purposes. It is now being used in the construction of mobile games used by the Faculty of Rehabilitation Medicine at the University of Alberta, the Glenrose Rehabilitation Hospital in Edmonton, Canada, and the Knowledge Media Design Institute at the University of Toronto.

Fig. 2. PhyDSL's multi-branched model-transformation composition architecture: four model-to-model (M2M) ATL transformations, and four model-to-text (M2T) Acceleo transformations chained in four transformation branches.

ScreenFlow is a code generator for Android application skeletons with interface-navigation logic, from graphic user interface storyboard specifications (see Figure 3; a complete description and demo video of ScreenFlow can be found at http://goo.gl/IGqLTv). ScreenFlow is composed of a textual domain-specific language, and a single-branched (i.e. linear) model-transformation composition that includes one model-to-model transformation, and one model-to-text transformation, written in ATL and Acceleo respectively. ScreenFlow is used by novice Android application developers in rapid software prototyping environments such as hackathons.

Fig. 3. ScreenFlow's linear model-transformation composition architecture: one model-to-model (M2M) ATL transformation, and one model-to-text (M2T) Acceleo transformation, chained in one transformation branch.

ATL model-to-model transformations and Acceleo model-to-text transformations are two widely adopted model-transformation technologies in both industry and academic environments. We believe that ATL and Acceleo exemplify the semantic complexity of state-of-the-art transformation languages built on top of model manipulation languages such as OCL [18], thus generalizing the complexity behind modern model-based code generators.

===IV. Dependent and Independent Variables===
Considering the hypotheses H<sub>Q1</sub> and H<sub>Q2</sub>, we have two dependent variables in our study:

* Var<sub>A</sub>: Time developers spend solving each task.
* Var<sub>B</sub>: Developers' accuracy in terms of task solution correctness.

{| class="wikitable"
|+ Table I. Study independent variables
! Subject System !! Tasks ChainTracker !! Tasks Eclipse
|-
| Subject System 1 || Var<sub>CT1</sub> || Var<sub>EE1</sub>
|-
| Subject System 2 || Var<sub>CT2</sub> || Var<sub>EE2</sub>
|}

The four independent variables of the study are Var<sub>CT1</sub>, Var<sub>EE1</sub>, Var<sub>CT2</sub>, and Var<sub>EE2</sub> (see Table I). The first two define the set of questions developers will solve using PhyDSL as the subject system (a model-based code generator with a branched transformation composition). The last two specify the set of questions to be solved using ScreenFlow as a subject system (a model-based code generator with a linear transformation composition). We believe that by using two subject systems with different levels of complexity (eight transformations in PhyDSL vs. two in ScreenFlow) the study will be able to investigate whether the compositional architecture of the generator affects developers when conducting software-engineering tasks.

===V. Detailed Hypotheses===
Taking into account our two high-level null hypotheses, our two subject systems, and the variables of our study, let us briefly discuss the set of detailed null hypotheses that this study will try to reject. They all share the following general form:

<math>H_x Var_{xy}: \widetilde{Var}_{x\,CT_y} = \widetilde{Var}_{x\,EE_y}</math>

* where x is A or B in place for hypothesis H<sub>A0</sub> and H<sub>B0</sub>, related to Var<sub>A</sub> (time) and Var<sub>B</sub> (accuracy) respectively;
* <math>\widetilde{Var}_{x\,CT_y}</math> is the median of our study dependent variables, where x indicates the developer's time and precision, when solving tasks using the interactive visualizations provided by ChainTracker;
* <math>\widetilde{Var}_{x\,EE_y}</math> is the median of our study dependent variables, where x indicates the developer's time and precision, when solving tasks using off-the-shelf Eclipse script editors for ATL and Acceleo;
* and y (1 or 2) refers to the result of a dependent variable obtained from developers solving tasks on System Subject 1 (PhyDSL) and System Subject 2 (ScreenFlow), respectively.

In summary, four detailed null hypotheses will be investigated in this study. H<sub>A</sub>Var<sub>A1</sub> and H<sub>A</sub>Var<sub>A2</sub> compare the median time spent by developers solving tasks that involve information discovery, filtering, and summarization on multi-branched and single-branched model-based code generators, respectively (i.e. developers spend an equal amount of time solving questions using ChainTracker as they do using Eclipse editors). H<sub>B</sub>Var<sub>B1</sub> and H<sub>B</sub>Var<sub>B2</sub> compare the median accuracy (in terms of task solution correctness) of developers conducting software-engineering tasks on multi-branched and single-branched model-based code generators, respectively (i.e. developers provide equally accurate answers using ChainTracker as they do using Eclipse editors).

===VI. Study Protocol===
The protocol of the study will be divided in two main stages that involve two independent working sessions with two different sets of participants, and two exit surveys that will assess the participants' experience during the study (see Figure 4).

Fig. 4. Protocol Stages. Stage 1: Working Session 1 (15-20 developers, 1 h 30 min max) followed by the Session 1 exit survey. Stage 2: Working Session 2 (15-20 developers, 1 h 30 min max) followed by the Session 2 exit survey. Both stages feed the data analysis.

Stage 1: The first stage involves a working session with 15 developers. Each participant will be asked 30 questions about a subject model-based code generator. In this stage, information about the time spent by developers answering each question will be collected using an in-house survey application. Developers will solve the first half of the tasks using off-the-shelf ATL and Acceleo code editors in Eclipse, and the second half using ChainTracker.

Stage 2: The second stage consists of a second working session with a new group of 15 developers. They will be asked to answer the same set of questions as developers in Stage 1. Developers' performance will also be collected using our in-house survey application. In this second stage developers will be instructed to solve the first half of the tasks using ChainTracker, and the second half using code editors in Eclipse.

At the end of each working session, developers will be asked to complete a survey on the usability of ChainTracker and their general experience during the session. Let us now briefly discuss how the working sessions will be structured.

====A. Working Sessions====
During Stages 1 and 2, each developer will be assigned an individual working station consisting of a desktop computer in which Eclipse and ChainTracker will be installed and deployed. This computer will also have an in-house system capable of monitoring the participant's activity, such as mouse clicks and keystroke events. The only difference between the working sessions of Stages 1 and 2 is the order of the tools that developers will use to solve the given tasks. Both stages will be divided in four parts.

Part 1. The participants will receive a 20-minute high-level presentation of Eclipse, ChainTracker, and the purpose of the study. A demonstration of Eclipse's features and user interface will be given through a typical scenario of information discovery, filtering, and summarization on both of the subject systems. A similar demonstration will be conducted using ChainTracker's interactive visualizations and code-projection features. Finally, participants will be asked to sign the informed consent form of the study.

Part 2. Participants will be pointed to our in-house survey application where they will answer questions about their experience with modeling tools, and their overall software-development expertise. More specifically, the questionnaire will cover i) the developers' number of years of software-development experience; ii) their experience using integrated development environments; iii) their experience using modeling tools to document software system implementations; and iv) whether they have been exposed to model-transformation technologies before. A predefined list of options including popular development environments, modeling tools, and model-transformation technologies will be presented to the participants along with open fields that will receive alternative answers.

Part 3. Participants will have a five-minute break.

Part 4. Participants will be pointed to the second part of the survey that will ask them to solve tasks with Eclipse and ChainTracker. The mechanics of this part are as follows:

# Each participant will be presented with a question about a subject system (see Section VI-B).
# The participant will use ChainTracker or Eclipse to find information relevant to the question, and answer the question.
# The participant will submit her/his answer.
# The participant will be directed to the next question. A total of 30 questions will be asked to every participant, covering each one of our proposed families of tasks; 15 questions will be related to Subject System 1, and 15 to Subject System 2.

====B. Question Templates====
An appendix containing a collection of template questions that involve information discovery, filtering, and summarization tasks on model-based code generators can be found at: http://goo.gl/BDXpUO. Some examples of our template questions are shown below.

* Find a template's upstream model-to-model transformation dependencies: What transformation rules are upstream related to the template line of code [line-id] in the [template-script-name] script?
* Identify the transformation rule that contains a given metamodel binding: What transformation rule contains the [metamodel-binding] binding in [model-to-model-transformation-name]?
* Evaluate how well a metamodel is used in a transformation composition: What percentage of the [metamodel-name] metamodel is covered by the transformation composition?

====C. Data Analysis====
Due to the nature of the variables and the limited number of data points, we will apply a Mann-Whitney U non-parametric statistical test to study the hypothesis propositions. We will adopt a significance level (alpha) of 0.05; that is, we will accept a 5% probability of a Type-I error, i.e. rejecting the null hypothesis when it is true.

===VII. Participants===
The study solicits participants of any age and gender with at least three years of programming experience. The experience requirement is tightly related to the technical tasks that developers will perform during the study. Due to the limited experience of potential participants with model-driven engineering technologies, the study will not enforce model-driven engineering experience as a fundamental requirement; however, candidates with experience on model-transformation technologies will be preferred. We believe that graduate and undergraduate students are potentially interested in acquiring different skills through the use of experimental tools that enhance their software-development abilities. Therefore the study solicits participants enrolled in advanced software engineering courses in academic institutions. This study will also be advertised in venues such as the International Conference on Model Driven Engineering Languages and Systems (MODELS) and the International Conference on Model Transformation (ICMT) to potentially conduct additional virtual working sessions with highly skilled professionals on model-driven engineering technologies.

===VIII. Potential Threats to Validity===
Construct validity (Do we measure what is intended?) In this study we will measure the performance of developers answering questions about a model-based code-generating system. We understand developers' performance in terms of the time they take answering each question and the correctness of their answers. We have developed an in-house survey application that presents participants with the questions and measures the time from when the question is shown to the participant to the moment when the participant has submitted an answer. Furthermore, we have carefully instantiated our question templates on our subject systems, and the correctness of each expected answer has been validated by three model-transformation experts. We do not foresee any significant threats to the construct validity of the study.

Internal validity (Are there unknown factors which might affect the outcome of the experiments?) We have identified two main threats to the internal validity of this study. First, the limited number of participants and their heterogeneous expertise on model-driven development technologies may limit the validity of the study. This study, however, is planned to be conducted with a minimum of 30 developers with at least three years of software-development experience. Considering that model-driven engineering technologies (such as model-transformation languages and modeling tools) are still in their infancy, and are yet to be adopted by the software engineering community at large, our pool of participants is representative of a community in which the majority of developers designing and maintaining code generators are novice, or at least not highly experienced, on model-driven engineering tools.
Furthermore, this study hopes to capture the interest of the model-driven engineering community and conduct additional virtual working sessions with highly-skilled model-driven professionals around the world. Indeed, having a diverse pool of participants will be highly valuable to the generalizability and statistical soundness of the study. Second, we are aware of the learning curve of ChainTracker and how its accessibility might affect developers when trying to answer questions on the subject systems. In order to minimize the impact of this threat to validity, we have included an introductory tutorial at the beginning of our working sessions' protocol (Section VI-A). The tutorial will showcase different question-solving scenarios using ChainTracker and Eclipse. Furthermore, during the last year we have iterated over ChainTracker's graphic user interface, running informal focus groups in order to make its features accessible and intuitive for developers.

External validity (To what extent is it possible to generalize the findings?) The subject systems of our study are two model-based code generators implemented using ATL, a rule-based model-to-model transformation language, and Acceleo, a template-based model-to-text transformation technology. Therefore any conclusions drawn from this study cannot be fully generalized to the performance of developers solving software engineering tasks on model-based code generators built using other model-transformation technologies. However, both ATL and Acceleo are widely used in academia and industry and, more importantly, they are aligned with the Query/View/Transformation (QVT) standard for model-to-model transformations [20] and the MOF Model to Text Transformation Language (MOFM2T) standard for model-to-text transformations [19], respectively, both proposed by the Object Management Group (OMG). Therefore the observations of this study can potentially be generalized to developers performing the same set of tasks in generators with similar size and architecture, built using languages that comply with the same set of standards.

===IX. Expected Contributions===
The contributions of this study are twofold. First, our study will be the first of its kind to investigate how developers approach the process of answering questions that reflect on the design and execution mechanics of model-based code generators. It is our strong belief that by gaining insight on the human aspects of model-driven software development, the community will be able to propose tools that make the construction of code generators less error prone and less cognitively challenging, thus potentially increasing the adoption of model-driven engineering techniques as a whole. Second, our study will increase the understanding of how developers can solve software-engineering tasks on model-based code generators more accurately and efficiently, using interactive traceability collection and visualization tools such as ChainTracker. This study will gather information necessary to enhance the current features of ChainTracker, and to create new ones that further support developers in their daily tasks. Furthermore, we believe this study is a novel contribution to the community and plays an important role in the collective endeavour to improve and boost the adoption of model-driven engineering among software engineering researchers and practitioners.

===Acknowledgements===
This work was supported by The Killam Trust, NSERC (the Discovery and the IRC program), the GRAND NCE and IBM Canada.

===References===
* [1] U. Aßmann, J. Knoop, and W. Zimmermann, "Model-based code-generators and compilers - track introduction," in Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change. Springer, 2014, pp. 386–390.
* [2] K. Czarnecki, "Generative programming: Methods, techniques, and applications tutorial abstract," Software Reuse: Methods, Techniques, and Tools, pp. 477–503, 2002.
* [3] R. France and B. Rumpe, "Model-driven development of complex software: A research roadmap," in 2007 Future of Software Engineering. IEEE Computer Society, 2007, pp. 37–54.
* [4] J. Hutchinson, J. Whittle, M. Rouncefield, and S. Kristoffersen, "Empirical assessment of MDE in industry," in Proceedings of the 33rd International Conference on Software Engineering. ACM, 2011, pp. 471–480.
* [5] F. Jouault and I. Kurtev, "Transforming models with ATL," in Satellite Events at the MoDELS 2005 Conference. Springer, 2006, pp. 128–138.
* [6] D. Kolovos, R. Paige, and F. Polack, "The Epsilon transformation language," Theory and Practice of Model Transformations, pp. 46–60, 2008.
* [7] J. Musset, É. Juliot, S. Lacrampe, W. Piers, C. Brun, L. Goubet, Y. Lussaud, and F. Allilaire, "Acceleo user guide," 2006.
* [8] A. Bragança and R. J. Machado, "Transformation patterns for multi-staged model driven software development," in Software Product Line Conference, 2008. SPLC'08. 12th International. IEEE, 2008, pp. 329–338.
* [9] K. Czarnecki and S. Helsen, "Classification of model transformation approaches," in Proceedings of the 2nd OOPSLA Workshop on Generative Techniques in the Context of the Model Driven Architecture, vol. 45, no. 3. Citeseer, 2003, pp. 1–17.
* [10] A. Kleppe, "First European workshop on composition of model transformations - CMT 2006," Technical Report TR-CTIT-06-34, 2006.
* [11] V. Guana and E. Stroulia, "Backward propagation of code refinements on transformational code generation environments," in Traceability in Emerging Forms of Software Engineering (TEFSE), 2013 International Workshop on, 2013, pp. 55–60.
* [12] V. Guana and E. Stroulia, "ChainTracker, a model-transformation trace analysis tool for code-generation environments," in Theory and Practice of Model Transformations. Springer, 2014, pp. 146–153.
* [13] V. Guana, K. Gaboriau, and E. Stroulia, "ChainTracker: Towards a comprehensive tool for building code-generation environments," in Proceedings of the 2014 International Conference on Software Maintenance and Evolution (ICSME). IEEE Press, 2014.
* [14] A. Van Deursen, E. Visser, and J. Warmer, "Model-driven software evolution: A research agenda," in Proceedings 1st International Workshop on Model-Driven Software Evolution, 2007, pp. 41–49.
* [15] J. Wang, S.-K. Kim, and D. Carrington, "Verifying metamodel coverage of model transformations," in Software Engineering Conference, 2006. Australian. IEEE, 2006, pp. 10–pp.
* [16] V. Guana and E. Stroulia, "PhyDSL: A code-generation environment for 2D physics-based games," in 2014 IEEE Games, Entertainment, and Media Conference (IEEE GEM), 2014.
* [17] V. Guana, E. Stroulia, and V. Nguyen, "Building a game engine: A tale of modern model-driven engineering."
* [18] J. Warmer and A. Kleppe, The object constraint language: getting your models ready for MDA. Addison-Wesley Professional, 2003.
* [19] OMG, "MOF model to text transformation language (MOFM2T), 1.0," 2008.
* [20] OMG, "Meta object facility (MOF) 2.0 query/view/transformation (QVT)," 2015.
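As an illustration of the data analysis proposed in Section VI-C, the following is a minimal sketch of how one of the detailed null hypotheses could be tested, assuming per-task completion times are available as two independent samples. The function name `mann_whitney_u` and the timing values are hypothetical placeholders, not study data. The sketch uses the normal approximation without tie or continuity correction, which is a simplifying assumption; an established statistics package would normally be preferred for the actual analysis.

```python
import math
from statistics import median


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U, p_value). Tied values receive average ranks; no tie or
    continuity correction is applied, so the p-value is approximate.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = sorted(
        [(value, "a") for value in sample_a] + [(value, "b") for value in sample_b],
        key=lambda pair: pair[0],
    )

    # Walk over runs of equal values and give each run its average 1-based rank.
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pooled[k][1] == "a":
                rank_sum_a += average_rank
        i = j

    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    u = min(u_a, u_b)

    # Normal approximation of the U distribution under the null hypothesis.
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p_value


# Hypothetical task times in seconds (placeholders, NOT measured results).
chaintracker_times = [41, 38, 52, 47, 35, 44]
eclipse_times = [63, 58, 71, 49, 66, 60]

ALPHA = 0.05  # significance level stated in Section VI-C
u_stat, p = mann_whitney_u(chaintracker_times, eclipse_times)
print("median ChainTracker:", median(chaintracker_times))
print("median Eclipse:", median(eclipse_times))
print("U =", u_stat, "reject null hypothesis:", p < ALPHA)
```

The same function could be applied once per detailed hypothesis, i.e. separately for time and accuracy on each subject system, giving four comparisons in total.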