=Paper=
{{Paper
|id=Vol-2245/flexmde_paper_1
|storemode=property
|title=Comparative Case Studies in Agile Model-driven Development
|pdfUrl=https://ceur-ws.org/Vol-2245/flexmde_paper_1.pdf
|volume=Vol-2245
|authors=Kevin Lano,Hessa Alfraihi,Shekoufeh Kolahdouz-Rahimi,Mohammadreza Sharbaf,Howard Haughton
|dblpUrl=https://dblp.org/rec/conf/models/LanoARSH18
}}
==Comparative Case Studies in Agile Model-driven Development==
Comparative Case Studies in Agile Model-Driven Development K. Lano1 , H. Alfraihi1,3 , S. Kolahdouz-Rahimi2 , M. Sharbaf2 , and H. Haughton1 {hessa.alfraihi,kevin.lano}@kcl.ac.uk {sh.rahimi,m.sharbaf}@eng.ui.ac.ir 1 Dept. of Informatics, King's College London, London, UK 2 Dept. of Software Engineering, University of Isfahan, Isfahan, Iran 3 Dept. of Information Systems, Princess Nourah bint Abdulrahman University, Saudi Arabia Abstract. This paper reports on experiences of integrating Agile and Model-Driven Development, for the development of code generators and nancial systems. We evaluate the benets of the Agile MDD approach by comparing Agile non-MDD and Agile MDD developments of code generators, and an agile MDD development of a nancial application with three other independent versions of the same application developed using dierent approaches. We also compare the functionality of the systems and a variety of technical debt metrics measuring the quality of the code and its design. Based on the case study results, we have found evidence that the use of Agile MDD leads to reductions in development eort, and to improvements in software quality and eciency. 1 Introduction Integration of Agile and Model-driven Development (Agile MDD) is of signi- cant interest to practitioners who want to utilise the benets of both approaches. However, only limited research has been carried out to investigate the impact of their integration [1,3]. In this paper we compare Agile MDD and other ap- proaches in two dierent application areas: (i) code generators and (ii) nancial systems. Agile MDD developments of UML to C and UML to Python code gen- erators are compared in Section 3 to Agile non-MDD developments of other code generators. In Section 4 a comparison between four independent developments of the same nance application is carried out, with the developments using dierent approaches: Agile MDD, MDD, Agile, and non-Agile hand-coded development. 2 Research methodology Our research goal is to evaluate the impact of integrating Agile development and MDD. For this goal we ask the following research questions: RQ1: What is the impact of integrating Agile development and MDD on the software product? RQ2: What is the impact of integrating Agile development and MDD on the software development process? In order to answer RQ1, the properties of quality, eciency and maintainabil- ity of the case study versions were compared. For quality and maintainability, we use measures of technical debt [9]: EAS (Excessive application complexity c [7] or LOC, with a threshold of 1000); ENR (Excessive number of rules, nrules > 10); ENO (Excessive number of operations/methods nops > 10); EHS (Ex- cessive operation/method size, threshold of 100 for c or LOC); MHS (Maximum operation size); ERS (Excessive rule size c > 100); CC (Cyclomatic complexity of rule logic or of procedural code > 10); MCC (maximum CC); CBR (Num- ber of rule/operation explicit or implicit calling relations > nrules + nops, or any cyclic dependencies exist in the rule/operation call graph); EPL (number of parameters for a rule or operation/method > 10); DC (duplicate expressions or statements with token count > 10). We use threshold values based on those used in the PMD technical debt anal- ysis tools (pmd.github.io): EAS 1000 LOC; EHS 100 LOC; EP L 10 parameters; CC 10; EN O 10 per class. 3 Code generation case study As part of the UML-RSDS toolset (www.nms.kcl.ac.uk/kevin.lano/uml2web) a number of code generators are provided to map UML and OCL to program code. Translators to Java (3 versions), C# and C++ were manually coded in Java using an agile approach over the period 2003 to 2015, whilst more re- cent code generators, UML2C for ANSI C and UML2Python for Python, have been developed using our Agile MDD process [2] in UML-RSDS [8]. Specically, the code generators were developed by writing UML-RSDS specications, from which code is automatically generated both for testing purposes and for delivery of executables. All of the generators were written by the same single developer (the rst author). The source metamodel of UML2C and UML2Python consists of 33 classes (www.nms.kcl.ac.uk/kevin.lano/libraries/design.km3), whilst the C target metamodel has 24 classes. UML2C and UML2Python are the largest applications that have been developed using UML-RSDS. A systematic requirements engineering process was followed for the UML2C and UML2Python generators, involving document mining, scenario analysis, goal decomposition, exploratory and evolutionary prototyping, inspection and refac- toring. For UML2C, the development was organised into 5 iterations, one iteration for each of the following main functional requirements: F1.1: Translation of types F1.2: Translation of class diagrams F1.3: Translation of OCL expressions F1.4: Translation of activities F1.5: Translation of use cases. In each iteration an informal mapping of language elements from UML to the target languages was constructed. For example, Table 1 shows the mapping of types from UML to C for iteration 1. Scenario UML element e C representation e' F1.1.1.1 String type char* F1.1.1.2 int, long, double types same-named C types F1.1.1.3 boolean type unsigned char F1.1.2 Enumeration type C enum F1.1.3 Entity type E struct E* type F1.1.4.1 Set(E) type struct E** (array of E', without duplicates) F1.1.4.2 Sequence(E) type struct E** (array of E', possibly with duplicates) Table 1: Informal mapping scenarios for UML types to ANSI C Table 2 shows the corresponding mapping for Python. There were 2 iterations with F1.1 and F1.2 considered in the rst, and F1.3, F1.4, F1.5 in the second. Scenario UML element e Python representation e' F1.1.1.1 String type str F1.1.1.2 int, long, double types int, int, oat F1.1.1.3 boolean type bool F1.1.2 Enumeration type Enum class instance F1.1.3 Entity type E class E F1.1.4.1 Set(E) type set(E) F1.1.4.2 Sequence(E) type list(E) Table 2: Informal mapping scenarios for UML types to Python In contrast to the large eort expended on writing code in the case of the manually-coded code generators, the principal eort in the UML2C and UML2Python developments was the formal specication denition and review/test- ing of this specication. In UML-RSDS, the construction, review, testing, op- timisation, correction and refactoring of the application specication replaces the corresponding code construction processes in a conventional non-MDD ap- proach. No manual coding was used for either UML2C or UML2Python, and a declarative/functional style of specication was used for both. The formal specication consists of a class diagram and use cases, dened by OCL constraints. An example constraint from UML2C is the following, which expresses formally the informal mapping F1.1.3: Entity:: CPointerType->exists( p | p.ctypeId = typeId & CStruct->exists( c | c.name = name & c.ctypeId = name & p.pointsTo = c ) ) The constraints map instances of the UML metamodel into instances of a C metamodel. Additional constraints dene how concrete C text is produced from the C model. In the case of UML2Python, text was directly produced from the UML model. The design structure of the UML2C synthesiser is an external transformation chain of two transformations: uml2Ca, which produces the app.h header le for the C output, and uml2Cb, which produces the app.c main code le. F1.1 and F1.2 are implemented in uml2Ca (an internal chain of 2 subtransformations), and F1.3, F1.4 and F1.5 are implemented in uml2Cb as a chain of 3 subtransfor- mations. For UML2Python a single uml2py transformation was delivered, with a hierarchical organisation: model2py (implementing F1.1, F1.2, F1.5) is a client to stat2py (F1.4), which is a client of exp2py (F1.3). 3.1 Evaluation We compare UML2C and UML2Python to the manually-coded C++ code gen- erator. The developer had low experience in C++, C and Python, and hence the three generators are comparable in each requiring signicant background research and experimentation with dierent generation strategies. UML2C was the rst generator to target a procedural language, and presented greater con- ceptual diculties than previous generators (eg., how to represent inheritance). UML2Python was the rst generator for a dynamically-typed language. Table 3 compares the overall development eort in person days between the C++, C and Python code generators. Stage C++ generator UML2C UML2Python Requirements Elicitation 10 17 2 Evaluation/Negotiation 10 5 1 Specication 60 56 2 Review/Validation 50 57 1 Implementation/ 180 49 3 Testing Total 310 184 9 Table 3: Development eort for C++, C and Python code generators A major factor in this dierence is the much simpler and more concise trans- formation specication of the C and Python code generators (expressed in UML- RSDS) compared to the Java code of the C++ code generator. Not only is the UML2C specication 7 times shorter than the UML to C++ Java code, but the latter is scattered over multiple source les (eg., Attribute.java, Association.java, Entity.java, etc) and mixed with the code of other parts of UML-RSDS, mak- ing debugging and maintenance more complex compared to the C translator, which is dened in 2 self-contained specication les. In total, the core code of the UML-RSDS tools is 90,500 lines of Java code, of which approximately 20% (18,100 lines) is the C++ code generator. In contrast the UML2C specication is 874 (uml2Ca) and 1576 (uml2Cb) lines, in total 2450 lines. The UML2Python generator is a single le of 605 lines. The OCL specication of UML2C is highly declarative and corresponds directly to the informal requirements, hence it is eas- ier to understand and modify compared to a programming language implemen- tation. UML2Python reused the structure of the UML2C informal specication, and substantial parts of the UML2C formal specication, with simplications due to the higher level of the Python language compared to C. Whilst UML2C is explicitly divided into 5 sequential stages, each subdivided into model to model and model to text modules, the C++ generator has a mono- lithic structure. UML2Python has a hierarchical structure of 3 strongly separated subsystems. Only two design patterns (Iterator and Visitor) are used in the C++ generator, whilst 13 are used to organise UML2C and 5 in UML2Python. Table 4 summarises the dierences in software quality aspects between the C++ generator and UML2C and UML2Python. Measure C++ generator UML2C UML2Python Size (LOC) 18,100 2,450 605 Abstraction level Low (code) High (specication) High (specication) Software architecture Partial Detailed Detailed Modularity Low (1 module) High (10 modules) Medium (3 modules) Cohesion Low High High Coupling Low Low Low Design patterns 2 13 5 Table 4: Software quality aspects of C++ and C, Python code generators We can also compare the level of design aws or technical debt in the C++ translator and in UML2C and UML2Python. For the C++ translator the data has been calculated using the PMD tool (https://pmd.github.io). For UML2C and UML2Python we used the technical debt metrics built in to UML-RSDS. Measures of fan-out, duplicate code and coupling between operations are not measured by PMD, so are omitted in the comparison. Table 5 compares the TD gures for UML2C, UML2Python and the C++ generator. For the latter, the gures for EHS/ERS, CC and EPL are produced by dividing the total numbers of awed operations in the entire UML-RSDS toolset by 5. For EAS and ENO+ENR we count the number of classes (C++ generator) or sub- transformations (UML2C and UML2Python) with these aws. It can be seen that the gures for UML2C and UML2Python are generally lower than for the C++ translator. Transformation EAS EHS + ERS CC EN O + EN R EP L C++ generator 5 16.6 110.6 28 0.4 UML2C 2 13 32 4 0 UML2Python 1 8 6 3 0 Table 5: C++ translator versus UML2C and UML2Python technical debt We also compared the eciency of the code generators applied to example models of varying size up to 100 classes with 100 features in each class. 4 Financial case study In this case study we compared the Agile MDD approach with three other ap- proaches: MDD, Agile, and hand-coded approaches. The four applications all implemented the same problem and were completed independently using dier- ent development approaches by dierent teams. The case study problem is to calculate and evaluate the risk of nancial instruments known as Collateralized Debt Obligation s (CDOs) [4,5]. Risk analysis of a CDO involves calculating the probability P (S = s) of a total credit loss s from the CDO. The specication of the calculation is given in [5]. To evaluate eciency, we measured the total execution time of each applica- tion version on a sample dataset input of 16 sectors and output of P(S = s) for all values of s ≤ 20. In order to answer RQ2, we used qualitative measures through an opinion survey, in which the developers were asked to report the main benets they per- ceived, along with the issues they faced in each approach. Agile MDD approach The CDO application was implemented using the UML- RSDS Agile MDD approach [8,2]. The customer of this application was a - nancial analyst working in a nancial company. The developer had 10 years experience of UML-RSDS and had no prior experience in nancial applications. MDD approach The CDO case study was redeveloped using the same specica- tion of [5]. The developer used EMF/ECORE [10] to specify the metamodels, and the transformation was implemented using the Epsilon Transformation Lan- guage (ETL) [6]. This solution was developed by one developer who had 4 years experience of ETL and had no experience in nancial applications. Agile approach The CDO application was redeveloped using the Scrum process. The application was implemented in Java and the development process was or- ganised in three iterations (each one-week long). Some Scrum techniques were used such as product backlog, sprint backlog, user stories, requirements prioriti- sation, sprint planning, sprint review, and frequent customer involvement. The developer had 5 years experience of Java programming and had no experience in nancial applications. Original CDO application The CDO application was previously developed in C++ and used in a nancial company. It was developed using a traditional code-centric non-agile approach. The developer had over 20 years experience in C++ programming and was a nancial analyst in that company. Table 6 presents the execution time and the size in LOC of the four applica- tion versions Agile MDD MDD Agile Original Execution Time of P(S=s) 23ms 123ms 39ms 231ms for all s ≤ 20 Execution Time of P(S=s) 93ms 531ms > 15min > 15min for all s ≤ 50 Size(LOC) 94 143 196 163 Table 6: Execution time and LOC for each CDO version For RQ2 we use a 3-point scale Low, Medium, High to quantify the dierent aspects reported by developers (Table 7). Aspect Agile MDD MDD Agile Original Development Medium, due to High, due Medium High eort small to need specication to rework size specication Errors in Low due to High: Missing Low, due to Low deliverables direct customer functionality customer feedback feedback Ability High, due to High, due to Medium Medium to make small/declarative small changes specication specication Development High: based Medium High: based Medium organisation on on iterations iterations Table 7: Impact on development process (CDO case) Direct communication with the customer representative during development was very important in reducing errors and correcting misunderstandings in the Agile and Agile MDD cases. The major challenge for the developers of the Agile MDD, MDD and Agile versions was in understanding the domain and formalising the required functionality correctly. 5 Discussion 5.1 Code generator case study For the code generator versions, there were no signicant dierences in eciency between the Agile MDD C and Python code generators and the Agile non-MDD C++ code generator. In terms of product quality, there were clear gains in re- duced application size and reduced technical debt, and improved system struc- turing. In turn, these benets make the ongoing maintenance of the UML2C and UML2Python code generators more eective than for the manually-coded generators. For example, we recently implemented a source metamodel enhance- ment within 30 minutes for each of these generators, the corresponding change for the C++ generator was around 2 hours. Lessons learnt from the UML2C development were applied to UML2Python, resulting in a much reduced devel- opment eort (5% of the UML2C eort) in the case of UML2Python. Another factor here is that the language gap between UML and Python is much smaller than between UML and C (however the gap between UML and C is higher than between UML and C++). Substantial reuse of the UML2C specication in UML2Python was facilitated by the concise and semantically-transparent nature of this specication. Table 8 summarises the dierence in TD levels between the three code generator versions. Transformation aws aws/LOC C++ generator 155.6 0.009 UML2C 51 0.021 UML2Python 18 0.029 Table 8: Summary of code generator technical debt Although the aw density is lower for the non-MDD approach, the number of aws is much higher. The overall functionality of the C++ and C code generators is comparable, yet the C++ generator has 3 times the number of aws of the C generator. 5.2 Financial case study For the nancial case study, in terms of eciency, the application developed using Agile MDD was the fastest. With regard to the product quality, the Agile MDD application had consistently better metrics than the other versions. Table 9 and Table 10 present the average of the values of the metrics for the two Agile versus the two non-Agile approaches and for the two MDD approaches versus the two non-MDD approaches. Agile approaches therefore have better values than non-Agile approaches in 8 of the 9 measures while MDD approaches have better values than non-MDD approaches in 7 of the 9 measures. Ec. LOC ENO MHS MCC CBR DC aws aws/LOC Agile 31ms 145 10.5 15 4 9(1.5) 1 3 0.0207 Non-agile 177ms 136 11 26 4.5 12(1.5) 1.5 4 0.0294 Table 9: Agile versus non-Agile approaches for CDO Ec. LOC ENO MHS MCC CBR DC aws aws/LOC MDD 73ms 118.5 10 13.5 2.5 12(2) 0.5 3 0.0253 Non-MDD 135ms 162.5 11.5 27.5 6 9(1) 2 4 0.0246 Table 10: MDD versus non-MDD approaches for CDO 5.3 Outcome of research questions For RQ1, we found improvements in the quality aspects and TD measures for the Agile MDD solutions in both case studies, and in eciency of the nancial case. For RQ2 we found a 40% reduction in development eort for UML2C com- pared to the C++ code generator, despite the greater challenges of generat- ing C compared to C++. Specication-level reuse was highly benecial in the UML2Python case, whilst only code-level reuse was possible for the C++ gen- erator. Errors in deliverables were reduced in the Agile MDD and Agile versions of the CDO case due to direct customer collaboration in development. In both the code generator and CDO case studies, the use of incremental development and iterations helped in the organisation and decomposition of the Agile MDD developments. 5.4 Threats to validity A signicant issue concerns the development team size. Agile methods emphasise communication and team collaboration. In this research, the code generator application versions and MDD and original CDO versions were implemented by single developers, whilst the Agile MDD and Agile CDO versions involved a main developer and an advising customer representative. Therefore it is dicult to generalise our conclusions for larger projects. Secondly, only one form of Agile MDD is used (UML-RSDS) and only one form of Agile (Scrum), so our analysis does not necessarily hold for other varieties of these approaches. Conclusion Our results in these case studies show some evidence that Agile MDD has im- proved eciency and quality of delivered software. In future work, we intend to extend this study using larger case studies with larger development teams. References 1. Hessa Alfraihi and Kevin Lano. The integration of agile development and model driven development: A systematic literature review. In Proceedings of the 5th International Conference on Model-Driven Engineering and Software Development, MODELSWARD, 2017. 2. Hessa Alfraihi and Kevin Lano. A process for integrating agile software de- velopment and model-driven development. In In 3rd Flexible MDE Workshop (FlexMDE) co-located with ACM/IEEE 20th International Conference on Model Driven Engineering Languages & Systems (MoDELS 2017), CEUR Workshop Pro- ceedings, pages 412417, Austin, TX, USA, 2017. CEUR-WS.org. 3. Håkan Burden, Sebastian Hansson, and Yu Zhao. How MAD are we? Empirical Evidence for Model-driven Agile Development. In Proceedings of XM 2014, 3rd Extreme Modeling Workshop, volume 1239, pages 211, Valencia, Spain, September 2014. CEUR. 4. M Davis, V Lo, et al. Infectious defaults. Quantitative Finance, 1(4):382387, 2001. 5. Ola Hammarlid et al. Aggregating sectors in the infectious defaults model. Quan- titative Finance, 4(1):6469, 2004. 6. Dimitrios S Kolovos, Richard F Paige, and Fiona AC Polack. The epsilon trans- formation language. In International Conference on Theory and Practice of Model Transformations, pages 4660. Springer, 2008. 7. K. Lano, S. Kolahdouz-Rahimi, M. Sharbaf, and H. Alfraih. Technical debt in model transformation specications. In ICMT, 2018. 8. Kevin Lano. Agile model-based development using UML-RSDS. Boca Raton: CRC Press, 2017. 9. Radu Marinescu. Assessing technical debt by identifying design aws in software systems. IBM Journal of Research and Development, 56(5):91, 2012. 10. Dave Steinberg, Frank Budinsky, Ed Merks, and Marcelo Paternostro. EMF: eclipse modeling framework. Pearson Education, 2 edition, 2008.