=Paper=
{{Paper
|id=Vol-2145/p09
|storemode=property
|title=Reverse Engineering of UML Use Case Model from Website Usage Records
|pdfUrl=https://ceur-ws.org/Vol-2145/p09.pdf
|volume=Vol-2145
|authors=Vaidotas Drungilas,Lina Čeponienė,Mantas Jurgelaitis
}}
==Reverse Engineering of UML Use Case Model from Website Usage Records==
Vaidotas Drungilas, Lina Čeponienė, Mantas Jurgelaitis

Department of Information Systems, Informatics Faculty, Kaunas University of Technology, Kaunas, Lithuania

vaidotas.drungilas@ktu.lt, lina.ceponiene@ktu.lt, mantas.jurgelaitis@ktu.lt

Copyright held by the author(s).

Abstract: Though UML is rather commonly used for modelling various software systems, if not properly maintained, UML models could lose their practical value. Fixing the mismatch between documentation and the current state of software requires significant effort from the development team. This also applies to systems that have no documentation at all and to legacy systems whose documentation is not available. Reverse engineering can be used for generating UML diagrams for existing systems. In this paper we present a method for reverse engineering a UML Use Case model from websites. This method enables generating UML Use Case and Activity diagrams from the recorded user activity in the website.

Keywords: UML; reverse engineering; website; Use Case diagram; Activity diagram.

===I. INTRODUCTION===

Unified Modelling Language (UML) is rather commonly used for modelling various software systems [1]. UML is applied not only during development of complex software systems but also during maintenance of the systems in use. For deployed systems that require support and updates, models help to analyze and understand the inner structure and functionality of a system [2]. Most of the models and documentation are usually created during the initial software development stages. If not properly maintained, these models lose their practical value. An example of improper maintenance is a situation when the final product receives updates and new features without the documentation being updated, which leaves it obsolete. With this kind of obsolete documentation, maintaining the software and introducing new features becomes more difficult. Fixing the mismatch between documentation and the current state of software requires significant effort from the development team. This also applies to systems that have no documentation at all and to legacy systems whose documentation is not available.

Websites, more than any other type of software, demand constant updates and fixes to meet changing user demand and to beat harsh competition [3]. This demand and competition put pressure on web developers to implement changes as fast as possible, without wasting valuable resources and time. As demand grows, website developers tend to concentrate on maintaining existing features and implementing new ones rather than spending time on updating the documentation. Matching this tendency of directing most of the effort into the implementation stage, Agile project management methodologies tend to be rather popular among web developers. Software products that are built using an Agile methodology usually do not concentrate on having detailed documentation. Consequently, in this paper we tackle the problem of increasing the efficiency of the modelling process for websites and suggest reverse engineering as a possible solution.

Reverse engineering is the process of analyzing a system to identify its structure and behavior in order to create its visual representation [4]. Reverse engineering can be used to understand how software works and to transform some kind of static information, like program code, into models and documentation [5].

Reverse engineering can also be used for generating UML diagrams [6] [7] [8]. In this paper we present a method for reverse engineering a UML Use Case model from websites. This method enables generating UML Use Case and Activity diagrams from the recorded user activity in the website and the website's HTML files.

The rest of the paper is organized as follows. The second section presents related work in the field of reverse engineering UML diagrams. The third section describes our proposed method for generating UML diagrams from registered user actions. Section four describes the prototype developed for our proposed method. The fifth section presents the results of evaluating the prototype by applying it to a particular website. The last section summarizes the findings of our research and discusses future work.

===II. RELATED WORK===

Reverse engineering of UML models is rather common in the field of software engineering. Reverse engineering can greatly reduce the effort required to construct UML diagrams. UML diagrams can be divided into two categories: one describes the structure of the software system, and the other defines its behavior [9].

Structural UML diagrams can be reverse engineered from code or other static structure. Class diagrams can be easily derived from static code [8], [10]. Many tools support this option through plugins or default functionality, e.g. Eclipse [11] or Visual Paradigm [12].

On the other hand, reverse engineering of behavioral diagrams, like Use Case and Activity diagrams, is not so commonly implemented and used. Nevertheless, there has been significant effort to create a working method for reverse engineering of UML behavioral diagrams [6] [7] [13] [14].

El-Attar and Miller proposed a method to reverse engineer Use Case models from structured Use Case descriptions [7]. This method requires structured text as an input, which is analyzed and transformed into the Use Case diagram. If used correctly, this method could greatly improve the consistency of software documentation and would allow creating precise and unambiguous Use Case models. The method is likely to give the best results in early development stages. On the other hand, with Agile methodologies this method would not be the best choice because it requires additional effort for creating and formatting Use Case specifications.

Another method for reverse engineering UML diagrams has been proposed by Muhairat [6]. It uses an event table as an input for generating a Use Case diagram. The event table, as defined in [6], has four main elements: event, source of event, action and object. These main elements are later transformed using the proposed process, which consists of actor identification, identification of relations between actors, use case identification, identification of relations between use cases, and integration of all found elements. Just like the method in [7], this method requires a lot of effort for creating an event table that is sufficiently detailed to generate an informative Use Case model. This method could be most useful during the requirement analysis phase.

Much can be learned not only from reverse engineering Use Case diagrams, but also from reverse engineering other behavioral diagrams. An excellent example of reverse engineering behavioral diagrams is presented in the research by Ziadi, da Silva, Hillah, and Ziane [13]. They proposed an approach for fully dynamic reverse engineering of UML Sequence diagrams. This dynamic method is intended for systems where static code analysis is not directly applicable. The approach also defines how to extract the traces of a working system. This idea is used as a basis in one of the steps of our proposed method for extracting website usage information.

Di Lucca, Fasolino and Tramontana proposed a specialized tool specifically intended for website reverse engineering, called WARE [14]. This tool is capable of reverse engineering UML Use Case and Sequence diagrams as well as Class diagrams. However, instead of working with dynamic content, this tool uses static code as an input for reverse engineering. The tool is only applicable in situations where full access to the source code is granted. In contrast, our research focuses on reverse engineering UML behavioral diagrams independently of the availability of the website's source code.

Most of the research conducted in the area of reverse engineering Use Case models is not intended particularly for websites. Our approach focuses on reverse engineering UML behavioral diagrams from websites, specifically from the information recorded during website usage. Our proposal is based on the idea that websites use a common architecture which can be used to extract information about user activities.

===III. PROPOSED METHOD FOR REVERSE ENGINEERING UML USE CASE MODEL===

As the UML Use Case model provides a detailed overview of a system's functionality, it is one of the main components of high quality system documentation [15]. Our approach to reverse engineering a UML Use Case model should provide the ability to flexibly analyze web applications and to transform the analysis results into a UML Use Case model. The created Use Case model should include a Use Case diagram, with each Use Case specified by an Activity diagram defining the functionality of that Use Case.

The proposed method consists of two main steps for reverse engineering a UML Use Case model from the selected website (Fig. 1):

1) registering usage of the analyzed system;

2) transforming registration results into an XMI file.

Fig. 1 Structure of the proposed method

During the first phase of reverse engineering the UML Use Case model, the user opens the system that will be analyzed. He inputs his role in the system and the activity that he will be performing. The user then continues to use the system while his activity is being recorded by the reverse engineering system in the background. The user can input as many roles and activities as required to completely represent his usage of the system. Registration of user actions should be performed by as many users as are required to cover all functionality of the system. After all users register their activities, they should be able to export the result files and send them to the system analyst. The system analyst should then be able to merge all the result recordings into one full structure that represents the actions performed in the analyzed system.

====A. System usage registration process====

The component used for registering system usage should not interfere with system functionality in any way. The recording component should be able to read the HTML files that the user is interacting with. It should work as a background process that captures user input events, such as clicks and form submits. To describe these events correctly, the recording component should also store information about the HTML elements that the user interacts with. These elements should be uniquely described with an identifying element. The registration component should allow the user to define what kind of role he is performing in the system and what kind of activity he will be performing. The process of registration consists of initialization and recording steps, as can be seen in Fig. 2.

Fig. 2 Activity diagram representing the process of system usage registration

====B. Registration result====

Registration results are then provided as an input for the component that transforms user actions into the UML Use Case model. This input should be stored in a structure that has the elements described in Fig. 3. This structure should store all Website URLs that the user interacted with during the time of recording. In addition, the user should provide the name of the Role, which exists in the given Website. Each Role will be performing some kind of Activity. Each Activity should be defined by the Webpage that it was performed on and the events that were performed in that same Webpage. Each event is described with detailed information about the type of Action and information about the HTML element that was in use during that event.

Fig. 3 The structure of the user activity registration result

====C. Transformation process====

The transformation process is defined in Fig. 4. In order to create a detailed UML model, a transformation step should be performed. During this step, the registration input is analyzed to detect relations between actors and use cases, and also between use cases themselves.

Fig. 4 Activity diagram representing the process of transformation to XMI file

The Activity diagram defining the transformation (presented in Fig. 4) specifies what actions are required to transform registered results to XMI. The first action removes duplicate actors to keep the model concise. The second action detects and creates generalization relations between actors. The third action performs grouping operations with the registered data. These grouping operations consist of detecting extend and include relations between use cases. The final step takes the results from all relation detection steps and creates an XMI file based on the structure of the metamodels of UML Use Case and Activity diagrams.

====D. Generalization relation detection====

Generalization relation detection starts by scanning all registered use cases and searching for two or more actors which have matching use cases. The algorithm also checks whether it needs to create a new actor in order to display a generalization correctly. If actor creation is required, the system analyst should be prompted to input a name for the new actor. After these steps, the system creates generalization relations between the actors. As a last step, the system maps all required use cases to the required actors. The generalization relation detection step is the only step that changes the configuration of actors in the model, so after this step we will have the final number of actors in the model. The activity diagram describing the generalization detection process is presented in Fig. 5.

Fig. 5 The process of detecting generalization relations

====E. Detecting relations between Use Cases====

Detection of use case relations like extend and include depends on the data set provided by the user of the proposed method. Our method does not detect generalization between Use Cases, only include and extend relationships. To detect extend relations, the user of a system should record the same activity on a website twice or more. If these data sets of the same activity provide exactly the same information, detection simply discards the duplicate. Otherwise, if some differences are found in these data sets, the proposed algorithm can create a more detailed use case model. The success of relation detection depends on the amount of data that the user provides during the recording. A larger amount of recorded data should transform into a model that is more detailed and thus more informative.

=====1) Detection of extend type relations=====

To enhance the Use Case model, our proposed approach detects extend relations. All the actions required to detect extend relations are represented in Fig. 6. These extend relations are important in describing alternative scenarios in the model. Extend relation detection starts with checking all recorded action sequences and finding partially repeating actions. From these sequences, the matching parts are extracted by comparing them to each other. At the beginning and end of the extracted sequence, decision and merge points are created respectively. Afterwards the extracted, remaining and newly created elements are merged together to create the complete activity diagram. For each path that is now separated from the main path of the activity flow by a decision point, a new use case can be created. The user provides the names for these use cases and the algorithm creates them in the model. The next step in extend relation detection is creating extend relations between the newly created use cases and the use case they originated from. Finally, the extension points are created for the use case which has incoming extend relations.

Fig. 6 The process of detecting extend relations

=====2) Detection of include type relations=====

In order to decrease redundancy in the use case model, include relations can be used between use cases. As in our method the amount of recorded data should be quite large, include relations help to reduce the number of repeated actions in the model. Include relation detection starts with finding all repeating action sequences where the repeated sequence length is higher than a user-defined number N. For each sequence found, the algorithm starts use case creation by prompting the user to input a use case name and creating a use case element with that name. These found sequences are then removed from the use cases where they originated. To finalize, include type relations are created by joining the newly created use cases and the use cases that the sequences originated from. The process of include relation detection is presented in Fig. 7.

Fig. 7 The process of detecting include relations

====F. Transformation to XMI format====

During the transformation to XMI step, all the information gathered in previous steps is transformed into Use Case and Activity diagrams. Transformation starts by creating a Use Case diagram. For each role that the user defined, an actor is created. For each activity that the user defined, a use case is created. Moreover, the use cases that were found during detection of extend and include relationships are added to the model. All use cases and actors are joined using the detected relationships. For each use case, an Activity diagram is created. In this diagram, the algorithm creates two swimlanes, the first one for the actor that is interacting with the given use case, and the second for the system under analysis. For each recorded action our algorithm creates an action in the activity diagram. Actions are named according to the action naming rules, selecting the verb corresponding to the action performed on an HTML element as well as a noun extracted from the HTML element attributes. These rules define how each action on an HTML element should transform into a semantically correct action name. In a swimlane, actions are created describing the opening of new webpages. Initial and final activity nodes are also created and all nodes and activities are joined together in a continuous flow.

====G. Transformation results====

Results after generation are stored in an XMI file. This file consists of a Use Case diagram with elements that are described by the UML Use Case metamodel [9]. For each use case, an activity diagram is created, representing all actions that the user performs in the system under analysis. The Activity diagram is also based on the UML metamodel for Activities and Actions [9]. The main elements that this method detects are roles, use cases, and actions. Relations join each of these elements: the association relationship joins use cases and actors, generalization is used between actors, include and extend relations can join two use cases, and control flow relations connect the actions in activity diagrams.

The user can download the generated XMI file and store it as needed. As most UML modeling tools support XMI as their import format, users should just import this file to have a working version of the Use Case model.

===IV. THE IMPLEMENTED PROTOTYPE OF THE PROPOSED METHOD===

To test whether the proposed method could be utilized in practice, a prototype has been implemented. The prototype was realized as a Chrome plugin using JavaScript. It enables users to submit information about their role in the website and the activity they will be performing. As the user continues to use the system that is being analyzed, his actions are recorded. Recorded actions are then stored in a JSON file. After the user indicates that he has ended the registration process, he can start the transformation process. The system transforms the registered JSON structure to Use Case and Activity diagrams. An example of this JSON structure is presented in Fig. 8.

Fig. 8 An example of JSON structure displaying use case "Download assignments data"

Afterwards this JSON file is transformed into an XMI file, which can be imported into the MagicDraw CASE tool as a model. This model can later be viewed, analyzed and modified by the systems analyst.

===V. EXAMPLE OF UML DIAGRAMS GENERATED USING THE IMPLEMENTED PROTOTYPE===

An experiment was conducted to verify whether the created prototype is capable of reverse engineering a UML Use Case model. The website selected for this experiment was the virtual learning environment Moodle, customized for Kaunas University of Technology. The roles of student and teacher were analyzed. In order to create the Use Case model, a student and a teacher were asked to perform actions in the analyzed website. The student performed actions for uploading a file into the system. The teacher recorded a process in which he downloaded the student's submitted assignment.

Fig. 9 displays the generated Use Case diagram. For this diagram, users provided two roles but the system identified one additional actor. In total three use cases were detected. The additional actor was discovered by the generalization relation detection step during registration result transformation. As both of the actors had to log into the system, a new actor named "Guest" was created. This actor, as mentioned before, makes it possible to reduce the number of redundant use cases. For each of these use cases, the algorithm created an activity diagram as expected.

Fig. 9 The generated Use Case diagram

The Activity diagram describing the Use Case "Login" is presented in Fig. 10. In this diagram we can see that each non-authenticated user had to click the login button styled by the CSS class named "btn-login". After the button is clicked, the system transfers all non-authenticated users to the unified authentication system, where the user fills out the login fields and submits the form. To finish the login use case, the user clicks the Login button and is transferred to the virtual learning environment. The generated Activity diagram demonstrates the prototype's ability to specify common actions, like login and registration.

Fig. 10 Generated activity diagram describing Login use case

An example of a generated activity diagram for the use case "Submit assignments" is displayed in Fig. 11. This diagram displays the interactions the user performed and the webpages he opened. The diagram demonstrates the algorithm's ability to record actions and to transform them correctly. The action naming conventions do not convey the performed action clearly; this depends heavily on the system's configuration. Most of the student's actions in this use case were navigation through the website. One downside of using just URL changes to describe the opening of new windows in the system is that it cannot record the opening of a modal window. The algorithm's ability to detect modal windows should be improved in the future.

Fig. 11 Generated activity diagram describing Submit assignment use case

The example use case "Download assignment" is described in the activity diagram presented in Fig. 12. This activity diagram provides visual feedback, demonstrating that some of the URL naming rules should be improved. On the other hand, this activity diagram still provides enough detail to cover all the most important actions of the use case.

Fig. 12 The generated activity describing Download assignment use case.

The results of the experiment indicate that the created prototype is capable of creating a UML Use Case model. Both Use Case and Activity diagrams were created, describing system usage in great detail. As the current prototype is only capable of detecting generalization relations, further iterations of this prototype will only increase the expressivity of the generated models. Generation of activity diagrams describing each use case provides even more depth to the generated UML Use Case model. Although the semantic value of these Activity diagrams can still be improved in future releases, the generated Activity diagrams already provide enough information about system usage.

===VI. CONCLUSIONS AND FUTURE WORK===

Reverse engineering of UML diagrams is utilized in various areas of software engineering. There are many applications of reverse engineering, but most of them are for structural UML diagrams. Behavioral UML diagrams can also be reverse engineered, and in our work we have proposed a methodology for reverse engineering a UML Use Case model from the data recorded during website usage. Our algorithm analyses recorded user activity and transforms it into UML Use Case and Activity diagrams. The prototype of the proposed algorithm was implemented as a Chrome extension. As this prototype is just the first of its kind, it generates use case and activity diagrams, but is not yet capable of detecting extend and include relations.

The results of the experiment indicate that the implemented prototype is capable of generating a UML Use Case model. Both Use Case and Activity diagrams were successfully generated using the prototype. The experiment results indicated that our method could be successfully utilized in practice. The experiment also provided valuable feedback about required improvements to the action naming rules.

In the future, we are planning to extend the prototype to support other UML modeling tools. The set of supported tools should include at least one open source tool that is free to use, so that the proposed method could be more accessible to a wider audience.

===REFERENCES===

[1] M. R. Chaudron, W. Heijstek and A. Nugroho, "How effective is UML modeling?," in Software and Systems Modeling (SoSyM), 2012.

[2] E. Arisholm, L. C. Briand, S. E. Hove and Y. Labiche, "The impact of UML documentation on software maintenance: an experimental evaluation," IEEE Transactions on Software Engineering, vol. 32, no. 6, pp. 365-381, 2006.

[3] G. Rossi, Ó. Pastor, D. Schwabe and L. Olsina, Web Engineering: Modelling and Implementing Web Applications, Springer-Verlag London, 2008.

[4] E. J. Cross and J. H. Chikofsky, "Reverse engineering and design recovery: a taxonomy," in IEEE Software, vol. 7, no. 1, 1990, pp. 13-17.

[5] J. Hibschman and H. Zhang, "Unravel: Rapid Web Application Reverse Engineering via Interaction Recording, Source Tracing, and Library Detection," in UIST '15 Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology, Daegu, Kyungpook, Republic of Korea, 2015.

[6] M. Mohammad I and E. A.-Q. Rafa, "An approach to derive the use case diagrams from an event table," in 8th WSEAS International Conference, Cambridge, 2009.

[7] M. El-Attar and J. Miller, "Producing robust use case diagrams via reverse engineering," Softw Syst Model, pp. 7-67, 2008.

[8] M. I. Muhairat and A. Abdel, "A New Reverse Engineering Approach to Convert," International Journal of Software Engineering & Applications (IJSEA), 2014.

[9] "UML 2.5 Specification," 1 03 2015. [Online]. Available: http://www.omg.org/spec/UML/2.5/PDF.

[10] E. Korshunova, M. P. M. v. d. Brand and M. Mousavi, "CPP2XMI: Reverse Engineering of UML Class, Sequence,," in Proceedings of the 13th Working Conference on Reverse Engineering (WCRE'06), 2006.

[11] "eclipse," The Eclipse Foundation, 2018. [Online]. Available: http://www.eclipse.org.

[12] "visual-paradigm," Visual Paradigm, 2018. [Online]. Available: https://www.visual-paradigm.com/.

[13] T. Ziadi, M. A. A. d. Silva, L. M. Hillah and M. Ziane, "A Fully Dynamic Approach to the Reverse Engineering of UML Sequence Diagrams," in 16th IEEE International Conference on Engineering of Complex Computer Systems, ICECCS, Las Vegas, United States, 2011.

[14] G. A. D. Lucca, A. R. Fasolino and P. Tramontana, "WARE: a tool for the Reverse Engineering of Web Applications," Journal of Software Maintenance and Evolution: Research and Practice - Special issue: Web site evolution, pp. 71-101, 2004.

[15] Richard Soley and the OMG Staff Strategy Group, Model Driven Architecture, 2000.

[16] B.A. Nowak, R.K. Nowicki, M. Woźniak, and C. Napoli, "Multi-class nearest neighbour classifier for incomplete data handling," in International Conference on Artificial Intelligence and Soft Computing, pp. 469-480, 2015.