Traceability Links Recovery in BPMN Models Raúl Lapeña SVIT Research Group, Universidad San Jorge, Spain rlapena@usj.es Abstract. Traceability Links Recovery has been a topic of interest for many years. However, Traceability Links Recovery in models in general, and BPMN models in particular, has not received enough attention yet. Through my work, I aim to fill this research gap by studying Trace- ability Links Recovery between requirements and BPMN models. So far, under the tutelage of directors Carlos Cetina and Óscar Pastor, I adapted Traceability Links Recovery code techniques to work over BPMN mod- els. The produced approach was applied to two different case studies, an academic one and an industrial one. The outcomes of the research outperformed the state of the art baseline. Under the light of these novel findings, opportunities for new research unfold. Keywords: Traceability Links Recovery, BPMN Models, Model Driven Engineering 1 Introduction Traceability Links Recovery (TLR) is defined as the software engineering task that deals with the identification and comprehension of dependencies and re- lationships between software artifacts. It has been a subject of investigation for many years within the software engineering community [1, 2]. Research has shown that affordable traceability can be critical to the success of a project [3], and leads to increased maintainability and reliability of software systems [4], also decreasing the expected defect rate in developed software [5]. In recent years, TLR has been attracting more attention [6]. However, most of the works focus on performing TLR tasks in code artifacts [7], while TLR in process models is a topic that has not received enough attention yet. Through my work, I aim to fill this research gap by studying TLR between requirements and process models. So far, under the tutelage of directors Carlos Cetina and Óscar Pastor, I adapted TLR code techniques to work over process models (specifically, BPMN models). More precisely, through the work presented in [8], we studied TLR between re- quirements and process models through three different approaches, two adapted code techniques and a models-specific baseline. Given a query requirement and a process model, the three approaches used different means to extract a fragment from the model, relevant to the implementation of the query requirement. The three approaches were evaluated through the Camunda BPMN for Re- search case study (github.com/camunda/bpmn-for-research) and through a real-world industrial case study, provided by our industrial partner, CAF (Con- strucciones y Auxiliar de Ferrocarriles, www.caf.es/en), a worldwide provider of railway solutions. One of the adapted code techniques achieved the best results for all the measured performance indicators in both case studies, outperform- ing the other two techniques. The overall findings of our paper suggested that adapting code techniques that provided good results in code was beneficial for TLR between requirements and BPMN models, since the outcomes outperformed those of a models-specific baseline. Under the light of these findings, a research question arises, unfolding opportunities for novel research: How can we further improve TLR in BPMN models? The rest of the paper is structured as follows: Section 2 describes the Ap- proach that obtained the best results and how to apply it to TLR between requirements and BPMN models. Section 3 details the baseline technique and the designed Evaluation. Section 4 presents the obtained Results. Section 5 for- mulates the Research Question that arises from our ongoing work. Section 6 discusses potential Future Work. Section 7 mentions the research Methodology in use. Finally, Section 8 reviews the works related to this one. 2 Approach This section describes the Mutation Search technique, the technique designed in [8] that obtained the best results for TLR between requirements and BPMN models, providing insight on its steps, application, and outcomes. 2.1 Mutation Search The Mutation Search technique receives a query requirement and a BPMN model as input, generates a population of fragments, and ranks said fragments through Latent Semantic Indexing. From the ranking, the first fragment is taken as the proposed solution. In order to generate the fragments population, algorithm 1 is followed. In the algorithm, an empty population and a seed fragment (chosen randomly from the input model) are created. Then, until the algorithm meets a stop condition (for instance, a certain number of iterations), the fragment is mutated and each new mutation is added to the population, avoiding the addition of repeated fragments. In the algorithm, a mutation in a fragment can be caused by: (1) adding one new event, gateway, or task that is connected to an already present event, gateway, or task, (2) removing an element with only one connection, or (3) adding or removing a lane from the fragment. The performed mutation is chosen randomly on each iteration. The top part of Fig. 1 shows this process, having the example input BPMN model on the left, and some example fragments on the right, generated through the usage of the algorithm. The generated fragments are represented through the text contained in all their elements. The text of both the input require- ment and the generated fragments is then processed through general phrase 53 Algorithm 1 Mutation Search Algorithm 1: P ← [] . Initialize the population 2: F ← randomF ragment(inputM odel) . Create an initial seed fragment 3: while !(StopCondition) do . While the stop condition is not met 4: F ← mutateF ragment(F ) . Mutate the fragment 5: if !(F ∈ P ) then . If the new fragment is not in the population 6: P ←P +F . Add the new mutation to the population 7: end if 8: end while 9: return P . Return the population styling techniques (lowercasing and tokenization), Parts-Of-Speech Tagging [9], and Lemmatizing [10]. Finally, the requirement and the fragments are fed into Latent Semantic Indexing, which ranks the fragments according to their similitude to the re- quirement. Latent Semantic Indexing (LSI) [11] is an automatic mathemati- cal/statistical technique that analyzes relationships between queries and docu- ments (bodies of text). LSI has been successfully used to retrieve Traceability Links between different kinds of software artifacts in different contexts [7]. To that extent, LSI produces a term-by-document co-ocurrence matrix. The bottom left part of Fig. 1 shows an example term-by-document co-occurrence matrix, with values associated to an example. Each row in the matrix (term) stands for each of the words that appear in the processed text of the requirement and the model elements. Each column in the matrix (document) stands for each of the fragments (MF1 to MFn) generated through the algorithm. The final column (query), stands for the processed input requirement. Each cell in the matrix contains the frequency of each term in each document. Vector representations of the documents and the query are obtained by nor- malizing and decomposing the term-by-document co-occurrence matrix using a matrix factorization technique called Singular Value Decomposition (SVD) [11]. In Fig. 1, a three-dimensional graph of the SVD is provided, on which it is pos- sible to notice the vectorial representations of some of the columns. To measure the similarity degree between vectors, the cosine between the query vector and the documents vectors is calculated. Cosine values closer to one denote a high degree of similarity, and cosine values closer to minus one denote a low degree of similarity. Through this measurement, the fragments are ordered according to their similarity degree to the requirement, producing the relevancy ranking shown on the bottom right part of Fig. 1. From the ranking, the first fragment is considered as the candidate solution for the requirement, and consequently taken as the final output of the Mutation Search technique. 54 REQUIREMENT The system will open the doors EXAMPLE MODEL FRAGMENTS MF1 Are the doors open? MF2 MODEL Yes Inhibition X Push doors button MF6 No Are the doors open? Human Open the doors Push doors Yes X Inhibition button MF9 No MFn Module Module Are the doors open? Door Door Open the doors Open the Push doors button X doors Documents Query Singular Value Decomposition Scores Model Element Ranking MF1 MF2 … MF9 … MFn Query MFN MFn Inhibition 1 0 … 0 … 0 0 MF9 MF9 = 0.97 Keywords MF2 Door 0 1 … 1 … 2 1 QQ MFn = 0.52 Button 0 1 … 0 … 1 0 MF6 MF1 … Open 0 0 … 1 … 1 1 … … … … … … … … MF6 = - 0.93 Fig. 1. Mutation Search Technique Example 3 Evaluation The following paragraphs introduce the baseline, the experimental setup, the case studies, and the oracles used to evaluate the baseline and Mutation Search. This section also details the design of the evaluation. 3.1 Linguistic Rule-Based Baseline Spanoudakis et al. [12] present a linguistic rule-based approach to support the automatic generation of traceability links between natural language requirements and conceptual models. Specifically, the traceability links between the require- ments and the conceptual models are generated through a set of requirement- to-object-model (RTOM) rules that specify sequences of terms and grammatical patterns. The technique searches for matching patterns in the requirements and the conceptual models, producing a link per each found match. We worked with a set of rules adapted so that the technique works over BPMN models. 3.2 Experimental Setup Through [8], TLR between requirements and BPMN models is performed. The results obtained by Mutation Search are compared against those of a models- specific baseline. An overview evaluation can be seen in Fig. 2. The top part shows the inputs, extracted from the documentation provided in the case stud- ies: requirements, BPMN models, and the approved traceability between both. The approved traceability is a document that depicts the correct fragments that 55 correspond to the requirements. It is provided by software engineers from our in- dustrial partner, and conforms the oracle of the evaluation. For each case study, the linguistic baseline takes the mentioned inputs, and generates a single frag- ment for each requirement. The generated fragment is compared with the oracle fragment. The Mutation Search technique generates a ranking of fragments per requirement instead. Since the rankings are ordered from best to worst trace- ability, the first fragment in each ranking is picked for comparison against its corresponding oracle. Once the comparisons are performed, a confusion matrix is calculated both for the baseline and for Mutation Search. Approved Requirements BPM Model Oracle Traceability Approaches Input Linguistic Mutation Search Model Fragment Model Fragment (Ranking @ 1) Precision, Recall, F-Measure, MCC Measurements & Report Fig. 2. Experimental Setup A confusion matrix is a table that is often used to describe the performance of a classification model (in this case, the linguistic baseline and Mutation Search) on a set of test data (the solutions) for which the true values are known (from the oracle). The confusion matrix distinguishes between the predicted values and the real values, classifying them into four categories: (1) true positive; (2) false positive; (3) true negative; and (4) false negative. Then, some performance measurements are derived from the values in the confusion matrix. In particular, a report including four performance measurements (recall, precision, f-measure, and MCC) is created for each of the two case studies, both for the baseline and for Mutation Search. 3.3 Case Study and Oracles In order to perform the evaluation of the approaches, we relied on two different case studies: (1) the Camunda BPMN for Research academic repository, and (2) a set of BPMN models provided by CAF, our industrial partner. In order to obtain the performance results of the approaches, we relied on the available correct solutions, provided in both case studies. 56 Camunda BPMN for Research: The Camunda BPMN for Research case study consists of four BPMN modeling exercises. Each exercise contains an as- sociated textual description and the solution model for the provided description. In order to apply the approaches to the Camunda case study, a software engineer derived a set of requirements from the problem descriptions. Each exercise has an associated solution model for the provided description. The same software engineer who derived the requirements from the problem descriptions also gen- erated a set of fragments from the solution model, mapping each fragment to a single requirement. Thus, we were provided with a set of requirements, the fragments that implement them, and the TLR mapping between both artifacts. CAF: For our evaluation, CAF provided us with the requirements and BPMN models of five railway solutions. They also provided us with their ex- isting documentation on requirements to BPMN models traceability, where each requirement is also mapped to a single fragment. 4 Results Table 1 outlines the results. Each row shows the obtained precision, recall, f- measure, and MCC values. The Mutation Search technique achieved the best results for all the performance indicators in both case studies, providing a mean precision value of 63%, a mean recall value of 77%, a combined F-measure of 68%, and an MCC value of 0.60 for the Camunda BPMN for Research case study, and a mean precision value of 79%, a mean recall value of 72%, a combined F- measure of 74%, and an MCC value of 0.69 for the CAF case study. Table 1. Mean Values and Standard Deviations for Precision, Recall and F-Measure Precision Recall F-Measure MCC Linguistic - Camunda 40%±25% 35%±22% 33%±13% 0.25±0.19 Linguistic - CAF 35%±28% 35%±10% 30%±7% 0.18±0.13 Mutation Search - Camunda 63%±21% 77%±22% 68%±19% 0.60±0.24 Mutation Search - CAF 79%±19% 72%±19% 74%±16% 0.69±0.20 5 Research Question From the results of our work, a Research Question arises: How can we further improve TLR in BPMN models? The following section will address this ques- tion, briefly mentioning some of the possible future works derived from a close inspection of the results. 57 6 Future Work This section presents some ideas and opportunities for future work that arose from the presented Research Question: 1. (Accepted - CAiSE 2019) Tacit knowledge in the requirements may have a negative impact on semantic-based techniques. How can we minimize this impact? 2. (Currently under review - IS CAiSE 2018 special issue) BPMN models have some particularities that other models lack. Could we take in account these particularities in our techniques in order to lead them to en- hanced results? 3. (Ongoing work) BMP models have less text that other models. Could we enrich the text of BPMN models to improve TLR techniques based on text search? 7 Methodology To perform this research, as well as our ongoing work, we have followed the design science methodology guidelines presented in [13]. 8 Related Work Related works focus on the impact and application of linguistic techniques to TLR problem resolution at several levels of abstraction. Works like [14, 15] use linguistic approaches to tackle specific TLR problems. In [16], the authors use linguistic techniques to identify equivalence between requirements. The work presented in [17] uses linguistic techniques to study how changes in requirements impact other requirements in the same specification. Our work is not based or focused on linguistic techniques as a means of TLR analysis, but we rather study novel techniques to perform TLR between requirements and BPMN models. Other works target the application of LSI to TLR tasks. De Lucia et al. [18] present a tool based on LSI in the context of an artifact management system. [19] takes in consideration the possible configurations of LSI when using the technique for TLR between requirement artifacts. Through our work, we do not study the management of artifacts nor different LSI configurations or how LSI configurations impact the results of TLR, but we rather study TLR between requirements and BPMN models. References 1. Gotel, O.C., Finkelstein, C.: An Analysis of the Requirements Traceability Prob- lem. In: Proceedings of the First International Conference on Requirements Engi- neering, IEEE (1994) 94–101 58 2. Spanoudakis, G., Zisman, A.: Software Traceability: a Roadmap. Handbook of Software Engineering and Knowledge Engineering 3 (2005) 395–428 3. Watkins, R., Neal, M.: Why and How of Requirements Tracing. IEEE Software 11(4) (1994) 104–106 4. Ghazarian, A.: A Research Agenda for Software Reliability. IEEE Reliability Society 2009 Annual Technology Report (2010) 5. Rempel, P., Mäder, P.: Preventing Defects: the Impact of Requirements Traceabil- ity Completeness on Software Quality. IEEE Transactions on Software Engineering 43(8) (2017) 777–797 6. Parizi, R.M., Lee, S.P., Dabbagh, M.: Achievements and Challenges in State-of- the-Art Software Traceability between Test and Code Artifacts. IEEE Transactions on Reliability 63(4) (2014) 913–926 7. Rubin, J., Chechik, M.: A Survey of Feature Location Techniques. In: Domain Engineering. Springer (2013) 29–58 8. Lapeña, R., Font, J., Cetina, C., Pastor, O.: Exploring new directions in traceability link recovery in models: The process models case. In: Proceedings of the 30th International Conference on Advanced Information Systems Engineering (CAiSE). (2018) 9. Hulth, A.: Improved Automatic Keyword Extraction given more Linguistic Knowl- edge. In: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics (2003) 216–223 10. Plisson, J., Lavrac, N., Mladenic, D., et al.: A Rule Based Approach to Word Lemmatization. In: Proceedings of the 7th International Multi-Conference Infor- mation Society. Volume 1., Citeseer (2004) 83–86 11. Landauer, T.K., Foltz, P.W., Laham, D.: An Introduction to Latent Semantic Analysis. Discourse Processes 25(2-3) (1998) 259–284 12. Spanoudakis, G., Zisman, A., Pérez-Minana, E., Krause, P.: Rule-Based Gener- ation of Requirements Traceability Relations. Journal of Systems and Software 72(2) (2004) 105–127 13. Wieringa, R.J.: Design science methodology for information systems and software engineering. Springer (2014) 14. Sultanov, H., Hayes, J.H.: Application of Swarm Techniques to Requirements Engineering: Requirements Tracing. In: 18th IEEE International Requirements Engineering Conference. (2010) 15. Sundaram, S.K., Hayes, J.H., Dekhtyar, A., Holbrook, E.A.: Assessing Traceability of Software Engineering Artifacts. Requirements Engineering 15(3) (2010) 16. Falessi, D., Cantone, G., Canfora, G.: Empirical Principles and an Industrial Case Study in Retrieving Equivalent Requirements via Natural Language Processing Techniques. Transactions on Software Engineering 39(1) (2013) 17. Arora, C., Sabetzadeh, M., Goknil, A., Briand, L.C., Zimmer, F.: Change Impact Analysis for Natural Language Requirements: An NLP Approach. In: IEEE 23rd International Requirements Engineering Conference. (2015) 18. De Lucia, A., Fasano, F., Oliveto, R., Tortora, G.: Enhancing an Artefact Man- agement System with Traceability Recovery Features. In: Proceedings of the 20th IEEE International Conference on Software Maintenance, IEEE (2004) 306–315 19. Eder, S., Femmer, H., Hauptmann, B., Junker, M.: Configuring Latent Seman- tic Indexing for Requirements Tracing. In: Proceedings of the 2nd International Workshop on Requirements Engineering and Testing. (2015) 59