Towards Hierarchical Code-to-Architecture Mapping Using Information Retrieval

Zipani Tom Sinkala 1, Sebastian Herold 1

1 Department of Mathematics and Computer Science, Karlstad University, Karlstad, Sweden

Abstract

Automating the mapping of a system’s code to its architecture helps improve the adoption of successful Software Architecture Consistency Checking (SACC) methods like Reflexion Modelling. InMap is an interactive code-to-architecture mapping recommendation technique that has been shown to do this task with good recall and precision using natural language descriptions of the architectural modules. However, InMap, like most other automated recommendation techniques, maps low-level source code units such as source code files or classes to architectural modules. For large complex systems this can still be a barrier to adoption due to the effort required of a software architect when accepting or rejecting the recommendations. In this study we propose an extension to InMap that provides recommendations for higher-level source code units, that is, packages. It utilizes InMap’s information retrieval capabilities, using minimal architecture documentation, applied to a software’s codebase, to recommend mappings between the software’s high-level source code entities and its architectural modules. We show that using our proposed hierarchical mapping technique we are able to reduce the effort required by the architect, as high as 6-fold in some cases, and still achieve good precision and fairly good recall.

Keywords

Automated Mapping, Software Architecture Consistency Checking, Information Retrieval.

ECSA2021 Companion Volume
EMAIL: tom.sinkala@kau.se (Z.T. Sinkala); sebastian.herold@kau.se (S. Herold)
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Mapping code to architecture is a task that is common in Software Architecture Consistency Checking (SACC) [1, 11, 14, 16]. Popular SACC methods like Reflexion Modelling [9, 12] require a mapping step in order to identify conformance or divergence of a system’s code to its intended software architectural modules [8, 9, 12, 13]. The mapping step is for the most part a manual and labour-intensive task that becomes a barrier to industry adoption of effective SACC techniques like Reflexion Modelling, especially for large complex software systems [1, 7].

A number of techniques have been created that attempt to decrease the burden of mapping on software architects by automating the mapping step [4–6, 11, 15, 16]. Most of these, however, are class- or file-based [11, 15, 16]. This implies that, for systems developed in an object-oriented programming language, where classes are considered the underlying unit of source code, they automate mapping at a class level, attempting to predict which architectural module a class (or class-file) maps to. This has been done quite well with techniques like InMap [15, 16] and NBC [11]. In our paper “InMap: Automated Interactive Code-to-Architecture Mapping Recommendations” we show that InMap achieved a recall of 0.87-1.00 and precision of 0.70-0.96 for the systems tested. However, in a large system of, say, 1000+ classes, in spite of achieving recall and precision of 1, it is still burdensome for an architect to inspect over a thousand recommendations before accepting them as correct. In an attempt to reduce the effort needed, we investigate making mapping recommendations for higher-level source code units. That is, we make mapping recommendations for larger units of code at a time (packages rather than classes), thereby reducing the amount of work required by an architect.

In this paper, we present an automated hierarchical package mapping technique. It builds on the successful information retrieval-based InMap approach [15, 16] that computes the similarity of an unmapped class to an architectural module. We exploit class-to-module similarity scores produced by InMap to generate package-to-module similarity scores. These are filtered using a defined set of heuristics from which recommendations, determined by a system’s package hierarchy, are made. We show that using our proposed hierarchical mapping technique we are able to reduce the effort required by the architect, as high as 6-fold in some cases, and still achieve good precision.

Section 2 briefly discusses automated mapping techniques along with their hierarchical mapping capabilities. In Section 3, we detail the approach, describing how package scores are computed and how package-to-module mapping recommendations are constructed. Section 4 describes the experiment setup to evaluate the technique and presents the results obtained. In Section 5, we interpret and discuss the results and in Section 6 we draw our conclusions on our findings and present opportunities for further research.

2. Related Work

Christl et al. conceived HuGME, a dependency analysis (DA) based automated mapping recommendation technique. It clusters a software system’s source code using an architect’s knowledge about its intended architecture [4, 5]. HuGME applies an attraction function, which minimizes coupling and maximizes cohesion, to produce a matrix of attraction scores for unmapped entities to modules [17]. The calculation of the score uses the dependency values between unmapped entities and mapped entities. The higher the score, the higher the likelihood that an unmapped entity belongs to a given module. All unmapped entities that result in only one candidate having a similarity score higher than the arithmetic mean of all scores produce a single recommendation. All unmapped entities for which two or more candidates exist are presented to the user in ranked order, from highest to lowest, as recommendations. HuGME presents recommendations to the user to allow cluster decisions to be made exclusively by the architect. This process is incremental, in that HuGME does not attempt to map all source code entities in one complete step; rather it maps a subset at a time until no more mapping is possible. The approach is non-hierarchical as it views the mapping task from a clustering perspective in which source code entities that are mapped to the same hypothesized entity form a cluster [4].

In their study, the results for HuGME had on average about 90% recall and 80-90% accuracy [5]. To get these results the technique needed about 20% of the system’s source entities to be pre-mapped before running the algorithm. Of interest is that, because this mapping technique is dependency-based, for it to give meaningful results the 20% pre-mapped source entities need to be spread across various modules. In addition, they must have dependencies to unmapped entities. This presents a problem in that, in order to benefit from this technique, one needs not only to dedicate some time to pre-mapping but must also ensure that the mapping is evenly spread across the modules. Additionally, one must ensure that the selected pre-mapped source code entities have dependencies to the unmapped entities, otherwise entity relationship discovery is poor. This all becomes a highly labour-intensive exercise. Furthermore, because it uses clustering algorithms based on high cohesion and low coupling, if developers do not follow this principle in the software’s implementation then the mapping produced by the algorithm will be affected [2].

Bittencourt et al. propose an information retrieval (IR) based technique that uses the same automated mapping recommendation approach as HuGME except that it replaces dependency-based attraction functions with IR-based similarity functions [3]. It calculates the similarity of an unmapped source entity to a module by searching for specific terms (a module’s name and mapped classes, methods and fields) within the source code of the unmapped class. Similar to HuGME, Bittencourt et al.’s technique needs some manual pre-mapping before it can automate mapping.

Olsson et al. combine IR and DA methods in their automated mapping technique called Naive Bayes Classification (NBC) [11]. NBC uses Bayes’ theorem to build a probabilistic model of classifications using words taken from the source code entities. The model gives the probability of words belonging to a source file entity. This is augmented with syntactical information of the dependencies, a method called Concrete Dependency Abstraction [11]. Just like HuGME, Olsson et al.’s proposed technique requires a pre-mapped set in order to perform well. Both Bittencourt et al.’s and Olsson et al.’s results showed that when there was a smaller pre-mapped set there was a decreasing trend in the f1-score of their techniques [3, 11]. Additionally, neither addresses package-level mapping.

Naim et al. present a technique called Coordinated Clustering of Heterogeneous Datasets (CCHD) that combines both DA and IR methods to compute a similarity score for source code files [10]. CCHD uses an architect’s feedback on the recovered architecture to iteratively adjust the results until there are no suggestions for change. These adjusted results train a classifier that automatically places new code added to a codebase in the “right” architectural module. However, the technique is not necessarily meant for automated mapping in SACC but rather for software architecture recovery tasks. Moreover, it too does not directly address package-level mapping.

Common among industry tools is the use of naming patterns (or regular expressions). For example, the expressions **/gui/** or *.gui.* or net.java.gui.* can be used to map source code units (whether classes or packages) to an architecture module named GUI. This is the technique used by both Sonargraph Architect and Structure101 Studio in addition to their drag & drop capabilities. However, the drawback of using naming patterns and/or drag & drop functionality is that they are both manual tasks, which makes mapping a tedious exercise, especially for large software systems that have complex mapping configurations.

In summary, despite advances made, available techniques that are designed to automate mapping have shortcomings. Some require an initial set of the source code to be pre-mapped manually [3–5, 11], while the industry tools that do not require pre-mapping offer manual methods. Additionally, the automated mapping techniques that require pre-mapping in order to “jump-start” mapping, as it were, require about 15-20% of the source code to be pre-mapped in order to give worthwhile results [4, 5, 6, 15].

InMap [15, 16] addresses the limitations of these techniques in that it is able to automate mapping without requiring pre-mapping. Using simple and concise natural language descriptions of the architecture modules it is able to automate mapping of a completely unmapped system with rather good results. Its limitation, though, is that the mapping recommendations provided are for low-level source code units, namely classes. This results in considerable work for an architect in the case of large software systems. We therefore explore the following research question:

How can we exploit InMap’s good class-to-module mappings to produce package-to-module mappings, thereby reducing the effort needed by an architect in accepting and/or rejecting mapping recommendations produced by InMap?

In the following section, we describe our approach to answering this question.

3. Approach

We begin by describing the InMap technique briefly. We then describe a technique for hierarchical package-to-module mapping that builds on top of InMap.

3.1. InMap

InMap is an interactive code-to-architecture automated mapping technique for SACC methods that uses information retrieval concepts to produce class-to-module mapping recommendations. It does not require manual pre-mapping in order to produce recommendations; rather, it uses natural language descriptions of the architectural modules as input to predict mappings. It presents its best mapping recommendations a page/set at a time (the optimum being 30 per page), which the architect can accept or reject. As recommendations from each page/set are accepted or rejected, InMap learns from this and adapts its next page/set of recommendations from the obtained knowledge. This method works quite well, giving an average recall of 97% and a precision of 82% for the systems evaluated [16].

3.1.1. Class-to-Module Similarity

InMap’s algorithm is made up of seven steps [16]. However, for our hierarchical package-to-module mapping technique the following steps in InMap are used to generate what are called class-to-module mapping scores.

Firstly, the source code files are filtered to exclude any external or third-party package libraries or classes of the system that the architect does not want to include in the mapping exercise.
Secondly, the filtered source files are stripped of any special characters and programming language keywords. Third, the pre-processed source code files are indexed as an inverted index. In the fourth and fifth steps, InMap formulates a query using four items, namely, (1) the names of the modules and (2) the modules’ architectural descriptions (stripped of any special characters and stop words), to search the indexed source code files for similarity to each module. In the first iteration, InMap uses only this information to build a query. However, once the first set of classes is mapped, InMap adds to the query (3) the names of classes mapped to a module and (4) the names of methods contained within classes mapped to a module. This ‘enriches’ the query used to search for the similarity of an unmapped class to a module. Therefore, after each set of newly mapped classes the query for the next set of recommendations looks different. The search returns a set of scores for every class-module pair based on the information retrieval similarity function tf-idf. The tf-idf scores are called class-to-module similarity scores ($SS_{cm}$), where c and m are a class-module pair in the system. Specifics of how tf-idf is calculated can be found in [16].

3.1.2. Class-to-Module Mapping Recommendations

In the sixth and seventh steps, InMap gives as a class-to-module mapping recommendation the highest scoring class-to-module pair. The architect can either accept or reject it. However, InMap presents as recommendations either only those above the arithmetic mean of all highest scoring class-module pairs, or the best 30 recommendations (if the number of those above the mean is greater than 30). After the architect gives feedback, it returns to step 4 and repeats steps 4 to 7 until no more recommendations can be given. Our proposed hierarchical package mapping technique picks up right after the fifth step, that is, once InMap produces the matrix of class-to-module similarity scores ($SS_{cm}$).

3.2. Hierarchical Package Mapping

Although InMap is able to achieve good results with the approach described in Section 3.1, because it is based on class-to-module mappings the effort required by architects could still be significant for large and complex systems. However, if we could map entire packages then we could reduce the effort needed. For example, a package that has 50 classes that all map to the same module could be (or should be) given as a single package-to-module mapping recommendation. Additionally, because packages are hierarchical in nature, they present even more opportunity to reduce the number of “necessary” mapping recommendations to present to an architect. For example, say we have two packages A and B that are both sub-packages of C. If A and B have 50 classes each and all the classes in A and B map to the same module, then mapping C to the module would suffice and saves the architect from reviewing 99 other mapping recommendations. Figure 1 illustrates a package hierarchy that our technique (and certainly others) can benefit from to reduce the number of recommendations needed.

Figure 1: Package hierarchy for Jittac, one of the systems we evaluate our technique on.

3.2.1. Package-to-Module Similarity

Our package-to-module mapping technique picks up from step 5 of the InMap algorithm after it produces similarity scores for all class-to-module pairs. We group the class-to-module similarity scores $SS_{cm}$ according to the packages they belong to. This means that for each package we have a set of classes with scores to each identified module. From this set of class-to-module similarity scores that have a given package as their parent we then calculate the interquartile mean ($IQM_{pm}$), where p and m are a package-module pair in the system. That is, the values between the first quartile and third quartile (the interquartile range, IQR) are used to calculate the arithmetic mean. Module IQRs for a package taken from Jittac are demonstrated in Figure 2. The lowest 25% and the highest 25% of the scores are ignored. Important to note is that the IQR, and hence the IQM, of a non-terminal package is calculated from not only the classes that belong to the package but also the classes of its child packages. For example, se.kau.cs.jittac.eclipse.builders.jdt, shown in the package tree in Figure 1, has its IQR calculated using the 8 classes that belong to it but also the 3 classes in se.kau.cs.jittac.eclipse.builders.jdt.commands and the single class in se.kau.cs.jittac.eclipse.builders.jdt.util. Formally, we define $IQM_{pm}$ as,

$$IQM_{pm} = \frac{2}{n} \sum_{i=\frac{n}{4}+1}^{\frac{3n}{4}} SS_{cm_i} \qquad (1)$$

where p and m are a package-module pair in the system, c has p as its parent package, n is the number of classes that make up the package p and i is the position of $SS_{cm}$ in the ordered set of class-to-module similarity scores for the package p.

Figure 2: Box plots for a package taken from Jittac showing the IQRs for Jittac’s modules as well as the class distribution inside and outside the IQRs. The x-axis shows the class $SS_{cm}$ scores and the y-axis shows the architectural modules of the system. The number in brackets beside a module indicates the total number of classes for the given package that have an $SS_{cm}$ score to the module.

Using the scores within the IQR as opposed to the full set of scores makes a package-to-module similarity more resilient to the presence of outlier classes in the class-to-module similarity scores that it is derived from. Figure 2 shows outlier classes, which we define as classes with $SS_{cm}$ scores that are higher than the box plot max, classes with $SS_{cm}$ scores that are lower than the box plot min, but also classes with $SS_{cm}$ scores that are within the box plot min-max but outside the IQR. The result of this step is a matrix of IQMs for each package-module combination.

We then apply feature scaling to normalize the IQM module scores for each package. We use standardization (also known as z-score normalization), which makes the scores for each package-module pair have a zero mean. In our hierarchical package mapping technique we call the resulting z-scores package-to-module similarity scores ($SS_{pm}$). Formally we define $SS_{pm}$ as follows,

$$SS_{pm} = \frac{IQM_{pm} - \mathrm{average}(IQM_{pm})}{\sigma} \qquad (2)$$

where p and m are a package-module pair in the system, $IQM_{pm}$ is the original package-to-module similarity score, $\mathrm{average}(IQM_{pm})$ is the mean of the $IQM_{pm}$ scores for a specific package to the range of given modules, and $\sigma$ is the standard deviation of $IQM_{pm}$. Using this method on all package-module pairs we obtain a matrix of package-to-module similarity scores for the entire system. Table 1 shows an extract of these scores.

Table 1
Extract of $SS_{pm}$ scores taken from Jittac. A value >= 0.6 (highlighted blue) implies it is a good package-to-module similarity score; a score >= 1.5 (highlighted red) implies it is an outstanding package-to-module similarity score.

| Packages | architecture-model | eclipse-ui | impl-model |
|---|---|---|---|
| se.kau.cs.jittac.model | 2.3 | -0.6 | 1.0 |
| se.kau.cs.jittac.model.am | 2.6 | -0.4 | 0.4 |
| se.kau.cs.jittac.model.am.events | 2.4 | -0.4 | 0.3 |
| se.kau.cs.jittac.model.am.io | 2.3 | -0.5 | 0.6 |
| se.kau.cs.jittac.model.im | 1.0 | -0.5 | 2.3 |
| se.kau.cs.jittac.model.im.events | 0.9 | -0.6 | 1.6 |
| se.kau.cs.jittac.model.im.io | 0.5 | - | 1.6 |

3.2.2. Package Mapping Filtering

Using the matrix of package-to-module similarity scores we then traverse the package tree bottom-up, starting with the terminal packages and working our way up to the root package. At each tree-depth level we sort the package-to-module similarity scores into two sets for each package, namely a set of outstanding package-to-module similarity scores and a set of good package-to-module similarity scores. Outstanding mappings are those in which a package has a score above the outstanding threshold and its child packages have a score above the good threshold. Good mappings are those in which a package and its children have a score above the good threshold. We formally define this notion with the following two rules.

Given:
Package p
Module m
Package-to-module score $SS_{pm}$
Good score threshold GSt
Outstanding score threshold OSt

Rule 1: A mapping (p, m) is called good iff $SS_{pm}$ >= GSt and, for all sub-packages $p_i$ of p, ($p_i$, m) is a good mapping.

Rule 2: A mapping (p, m) is called outstanding iff $SS_{pm}$ >= OSt and, for all sub-packages $p_i$ of p, ($p_i$, m) is a good mapping.

Figure 3 illustrates the rules with an example using the $SS_{pm}$ scores shown in Table 1. You will notice that despite the package se.kau.cs.jittac.model having good and outstanding scores for the modules impl-model and architecture-model respectively in Table 1, Figure 3 indicates that the package has no good or outstanding mappings. This is because it fails to satisfy the second part of the rules (stated identically in Rules 1 and 2), that is, that all its sub-packages must have good mappings to the same module. For each of these two modules, one of se.kau.cs.jittac.model’s sub-packages has a good mapping to the module but the other does not, hence there are no good or outstanding mappings for the se.kau.cs.jittac.model package.

These rules are applied from the bottom of the package tree, starting with the deepest terminal packages, then their parent packages, then their grandparent packages and so on until we reach the root package at the top of the tree.

Figure 3: Package tree traversal in order to produce package-to-module mapping recommendations.
This bottom-up order is necessary because packages higher up in the package tree depend on the results of packages lower in the package tree.

3.2.3. Package-to-Module Mapping Recommendation Selection

Once both sets of good and outstanding mappings for each package are obtained, we then traverse the package tree top-down. At each tree level we check if a package has outstanding mappings, pick the highest that fulfils the above defined criteria for outstanding and recommend it as the most likely mapping. If a package is recommended then we stop following that tree path downwards and do not recommend any of its sub-packages; we instead proceed to check its siblings. If a package returns an empty set, then we go one step lower in the package tree. Figure 3 illustrates this; it shows two package-to-module mapping recommendations (in bold). Observe that architecture-model is recommended as the module to which se.kau.cs.jittac.model.am should map and impl-model as the module to which se.kau.cs.jittac.model.im should map. Their sub-packages are skipped since they are already considered as a result of Rule 2, and se.kau.cs.jittac.model has no mapping recommendation since it retained no mappings after the package mapping score filtering step.

4. Evaluation

Test Cases: We evaluated our hierarchical package mapping approach on six Java-based systems that were used in the evaluation of InMap’s class-to-module mapping technique. These are Ant, a command line and API-based tool for process automation; ArgoUML, a desktop-based application for UML modelling; JabRef, a desktop-based bibliographic reference manager; Jittac, an Eclipse plugin for reflexion modelling tasks; ProM, a desktop-based process mining tool; and TeamMates, a web-based application for handling peer reviews and feedback. Table 2 shows the attributes of these systems. The natural language architectural module descriptions used as input to InMap to generate class-module similarity scores were obtained from the previous study of InMap. The prior study of InMap obtained the oracle mappings, that is, the correct list of code-to-module mappings, from experts involved in developing each respective open-source project. The oracle package-to-module mappings used in this study were extracted from these. We retained in the oracle only packages that had direct 1-1 mappings with a module, and excluded packages that had child entities that map to more than one module.

From the oracle mappings we only extracted package-to-module mappings, leaving out the class-to-module mappings, to allow us to evaluate the performance of the proposed technique strictly at a package level. Table 2 also shows the number of packages in the oracle mapping of a system. This is the number of actual packages our proposed technique should predict mappings for, in other words, the packages that are of concern. For example, if se.kau.cs.jittac.eclipse is part of the oracle mapping and our technique puts up se.kau.cs.jittac.eclipse.builders as a possible mapping, we count this as a false positive even though the latter is a child package of the former. The reason is that the technique must reduce the effort needed by an architect and therefore must be penalized for recommending child packages of a package that is already mapped (or should be).

Table 2
System Case Studies

| System Attributes | Ant | ArgoUML | JabRef | Jittac | ProM | TeamMates |
|---|---|---|---|---|---|---|
| Version | r584500 | r13713 | 3.7 | 0.1 (…) | 6.9 | 5.11 |
| # of source files | 778 | 1,429 | 843 | 124 | 700 | 467 |
| # of source files after filtering (# of classes) | 724 | 763 | 840 | 110 | 699 | 293 |
| # of packages | 64 | 60 | 118 | 27 | 162 | 18 |
| # of packages in oracle mapping | 14 | 21 | 11 | 9 | 30 | 11 |
| # of source files in oracle package mapping | 558 | 692 | 812 | 98 | 675 | 293 |
| # of modules | 15 | 17 | 6 | 9 | 11 | 11 |

Experimentation & Data Collection: To experiment on the test cases with various good and outstanding threshold combinations we extended the evaluator tool we developed in our previous studies of InMap to accommodate the evaluation of package-based mappings. Using the oracle architecture package-to-module mappings of each system the tool automatically simulates a “human architect” accepting and rejecting the recommendations produced. For all possible single decimal combinations within the range -5.0 to 5.0 for the good and outstanding thresholds we collected the recall of the package mappings as well as the technique’s precision. The min-max of the test range was based on the highest and lowest $SS_{pm}$ scores obtained by all 6 systems. We also collected the number of recommendations it took to achieve the given recall and precision. Finally, we also collected the class coverage (or code reach), that is, the number of classes that were mapped as a result of their parent packages being mapped by our hierarchical mapping technique.

Results: Table 3 shows the results obtained for the optimal thresholds for each system, i.e. those that gave the best results for the range of values tested. For three systems, Ant, JabRef and TeamMates, we got perfect precision, with TeamMates getting the same for its recall and class coverage. We found 6 out of JabRef’s 11 package-to-module mappings (as package recall) and 9 of Ant’s 14, which resulted in class coverage of 98% and 50% respectively. For Jittac, 90% of its classes were mapped by finding 6 of its 9 package-to-module mappings with a precision of 0.86. ArgoUML had fairly good precision but low recall, resulting in low class coverage as well. ProM appeared to be an outlier, obtaining poor precision and the lowest recall of the six systems tested. All results presented are for a single iteration (or pass) of the technique.

Table 3
Results showing the optimal thresholds for each system tested.

| Test System | Good Thresh. | Outstand. Thresh. | # of Recomm. | Package Recall | Package Precision | Class Coverage |
|---|---|---|---|---|---|---|
| Ant | 1.9 | 2.4 | 9 | 0.64 | 1.00 | 276/558 (50%) |
| ArgoU | 0.1-1.6 | 3.2-3.3 | 13 | 0.43 | 0.69 | 93/692 (13%) |
| JabRef | 1.2-1.4 | 1.6-2.0 | 6 | 0.55 | 1.00 | 794/812 (98%) |
| Jittac | 1.0-1.4 | 1.7 | 7 | 0.67 | 0.86 | 88/98 (90%) |
| ProM | 0.4-0.6 | 1.6 | 37 | 0.40 | 0.32 | 58/675 (9%) |
| TeamM | 1.3-1.4 | 0.1-1.2 | 10 | 1.00 | 1.00 | 293/293 (100%) |

In Table 4 we compare the effort required of an architect by our hierarchical mapping technique vs InMap in its original form. We do this by looking at the class coverage of each technique and the number of recommendations an architect has to sift through to achieve the given class coverage. Table 4 shows this for the systems that achieved at least 50% class coverage after a single iteration. In simple terms we define the effort saved ($ES$) and the effort reduced ($ER$) as follows,

$$ES = |R_p - R_c| \qquad (3)$$

$$ER = -100 \times \frac{ES}{R_c} \qquad (4)$$

where $R_c$ is the number of class-to-module recommendations needed by the InMap class-based technique and $R_p$ is the number of package-to-module recommendations needed by our hierarchical package mapping technique. As an example, Table 4 shows that in the case of Ant it would take 390 recommendations to map 50% of Ant’s classes using the InMap class-to-module technique, whereas it would take 9 recommendations to map 50% of Ant’s classes using our hierarchical package-to-module mapping technique. You will also notice that the effort saved is more than 800 recommendations for JabRef and the effort reduced is more than 90% for all 4 systems.

Table 4
Effort comparison of class vs package mappings for systems with class coverage >= 50%.

| Test System | Class Mapping: class coverage after … | Class Mapping: # of Recomm. | Package Mapping: class coverage after 1 pass | Package Mapping: # of Recomm. | Effort saved (Effort reduced) |
|---|---|---|---|---|---|
| Ant | 13 passes, 50% | 390 | 50% | 9 | 381 (–97.7%) |
| JabRef | 32 passes, 98% | 853 | 98% | 6 | 847 (–99.3%) |
| Jittac | 7 passes, 90% | 123 | 90% | 7 | 116 (–94.3%) |
| TeamM | 14 passes, 97% | 275 | 100% | 10 | 265 (–96.4%) |

5. Discussion

Table 3 shows that the technique has almost perfect precision, 0.91 on average excluding ProM. This is likely due to the fact that our hierarchical package mapping technique is an extension of InMap’s class-to-module similarity function. Using simple natural language descriptions of architecture modules, the InMap algorithm, which has the class-to-module similarity score $SS_{cm}$ function at its core, was shown to obtain rather good precision. Our hierarchical package mapping technique borrows from InMap’s success by using the information retrieval based $SS_{cm}$ to generate its own package-to-module similarity score $SS_{pm}$.

The package recall of our technique is fairly good considering that these results are obtained after only 1 iteration (or pass). As outlined in Section 3.1, InMap is an interactive-iterative technique that presents a set of recommendations at a time and progresses by learning from the feedback of the architect to formulate the next set of recommendations. However, the number of iterations (or passes) is proportional to the size of the system under review. Comparing Tables 2, 3 and 4, observe that systems with a high number of source files require a high number of passes (or iterations) compared to the “smaller” systems. Table 3 shows that with our hierarchical mapping technique we are able to obtain a package recall of more than 50% in the first pass for 4 out of the 6 systems. Of these 4, from the first iteration we get 50% class coverage for Ant with the other 3 getting 90% or more class coverage. Despite this, two systems get low package recall and class coverage. We do not see this as a problem because it is resolved simply by having more package-mapping recommendation iterations, which would still be far fewer than those needed by class-based mapping recommendation algorithms.

Table 3 shows the threshold values that give the optimal results for each system. However, we observed some similarities across the systems in our threshold value experiments. The optimal outstanding score threshold is very close to or the same as the arithmetic mean of the max package similarity scores for each module of the system, and the optimal good score threshold was usually 0.5 less than the optimal outstanding score threshold. This establishes a basis for developing an automated approach for deriving threshold values that will give good results across different systems.

Threats to Validity: Since our package-based technique is derived from InMap, the external validity of its results is affected by similar things, that is, factors such as number of modules and classes, code commenting style and quality, and architecture description quality. Therefore, more case studies with varying attributes would add to the validity of the results. However, the results of the six test systems used, with the varying attributes shown in Table 2, provide a compelling case for an automated hierarchical package mapping technique.

With regard to construct validity, the effort required by an architect using our technique needs to be evaluated against other package-based mapping methods provided by industry tools, like drag & drop, naming patterns or regular expressions. For example, how does our hierarchical package-based technique compare with manually mapping packages? Evaluations such as these would require enhanced user studies with software architects in appropriately planned and controlled experiments.

6. Conclusion & Future Work

We have presented a proposed solution to hierarchical package-based mapping. It builds on InMap, an information retrieval class-based mapping technique that uses concise natural language architectural descriptions of modules. Our hierarchical package-based mapping technique provides almost perfect precision, fairly good recall and great code coverage. But most importantly our technique helps reduce the effort or workload required of an architect in accepting and rejecting mapping recommendations in interactive techniques like InMap. The technique is an improvement over the manual package mapping methods used in today’s state-of-the-art reflexion modelling tools.

Despite reducing the effort required, the drawback of using a purely package-based approach is that, due to their 1-1 package-to-module mapping style, these methods do not work well for systems that have more complex mapping configurations. It is not always the case that packages and their members directly map to modules in a 1-1 manner. It is more likely that a software system’s code-to-architecture mapping has a combination of both package and class mappings. Cases where package members are spread across multiple modules require a class-based technique. Therefore, we plan as future work to derive an approach to combine InMap’s good class-based approach with the good package hierarchy-based approach presented in this paper. The aim is to combine class and package mapping recommendations in a way that benefits from the advantages, and negates the disadvantages, of both mapping styles. Nevertheless, the hierarchical package-based mapping technique presented in this paper remains useful and is

[7] Knodel, J. 2010. Sustainable Structures in Software Implementations by Live Compliance Checking.
[8] Knodel, J. and Popescu, D. 2007. A Comparison of Static Architecture Compliance Checking Approaches. Proceedings of the Sixth Working IEEE/IFIP Conference on Software Architecture (USA, 2007), 12.
[9] Murphy, G.C. et al. 2001. Software Reflexion Models: Bridging the Gap between Source and High-Level Models. IEEE Transactions on Software Engineering. 27, 4 (Apr. 2001), 364–380. DOI:https://doi.org/10.1109/32.917525.
[10] Naim, S.M. et al. 2017. Reconstructing and Evolving Software Architectures Using a Coordinated Clustering Framework. Automated Software Engineering. 24, 3 (2017), 543–572. DOI:https://doi.org/10.1007/s10515-017-0211-8.
useful in cases where it is appropriate to map [11] Olsson, T. et al. 2019. Semi-Automatic entire packages. Mapping of Source Code using Naive Bayes. ACM International Conference Proceeding Series. 2, (2019), 209–216. 7. References DOI:https://doi.org/10.1145/3344948.334498 4. [1] Ali, N. et al. 2018. Architecture Consistency: [12] Passos, L. et al. 2010. Static Architecture- State of the Practice, Challenges and Conformance Checking: An Illustrative Requirements. Empirical Software Overview. IEEE Software. 27, 5 (2010), 82– Engineering. 23, 1 (2018), 224–258. 89. DOI:https://doi.org/10.1007/s10664-017- DOI:https://doi.org/10.1109/MS.2009.117. 9515-3. [13] Rosik, J. et al. 2011. Assessing Architectural [2] Bauer, M. and Trifu, M. 2004. Architecture- Drift in Commercial Software Development: A aware adaptive clustering of OO systems. Case Study. Softw., Pract. Exper. 41, (2011), Eighth European Conference on Software 63–86. Maintenance and Reengineering, 2004. CSMR [14] de Silva, L. and Balasubramaniam, D. 2012. 2004. Proceedings. (2004), 3–14. Controlling software architecture erosion: A [3] Bittencourt, R.A. et al. 2010. Improving survey. Journal of Systems and Software. 85, 1 automated mapping in reflexion models using (2012), 132–151. information retrieval techniques. Proceedings DOI:https://doi.org/https://doi.org/10.1016/j.js - Working Conference on Reverse s.2011.07.036. Engineering, WCRE. (2010), 163–172. [15] Sinkala, Z.T. and Herold, S. 2021. InMap: DOI:https://doi.org/10.1109/WCRE.2010.26. Automated interactive code-to-architecture [4] Christl, A. et al. 2007. Automated Clustering mapping. Proceedings of the ACM Symposium to Support the Reflexion Method. Information on Applied Computing (Mar. 2021), 1439– and Software Technology. 49, 3 (2007), 255– 1442. 274. [16] Sinkala, Z.T. and Herold, S. 2021. InMap: DOI:https://doi.org/https://doi.org/10.1016/j.i Automated Interactive Code-to-Architecture nfsof.2006.10.015. Mapping Recommendations. 
Proceedings - [5] Christl, A. et al. 2005. Equipping the reflexion IEEE 18th International Conference on method with automated clustering. 12th Software Architecture, ICSA 2021 (Mar. Working Conference on Reverse Engineering 2021), 173–183. (WCRE’05) (2005), 10 pp. – 98. [17] Wiggerts, T.A. 1997. Using Clustering [6] Fontana, F.A. et al. 2016. Tool Support for Algorithms in Legacy Systems Evaluating Architectural Debt of an Existing Remodularization. Proceedings of the Fourth System: An Experience Report. Proceedings Working Conference on Reverse Engineering of the 31st Annual ACM Symposium on (1997), 33–43. Applied Computing (New York, NY, USA, 2016), 1347–1349.
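Supplementary illustration: the effort measures of Equations (3) and (4) can be checked with a few lines of Python. This is only a minimal sketch, not part of the authors' evaluator tool. The names `EffortRow`, `effort_saved`, `effort_reduced` and `summarize` are hypothetical, and the recommendation counts are the ones reported for Table 4 (the Ant counts come from the worked example in the text; its derived values are computed here rather than quoted).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffortRow:
    """One system from Table 4 (hypothetical container, for illustration only)."""
    system: str
    class_recs: int    # R_c: class-to-module recommendations needed
    package_recs: int  # R_p: package-to-module recommendations needed

    @property
    def effort_saved(self) -> int:
        # Equation (3): |R_p - R_c|
        return abs(self.package_recs - self.class_recs)

    @property
    def effort_reduced(self) -> float:
        # Equation (4): -100 * effort saved / R_c, a (negative) percentage
        return -100.0 * self.effort_saved / self.class_recs


ROWS = [
    EffortRow("Ant", 390, 9),
    EffortRow("JabRef", 853, 6),
    EffortRow("Jittac", 123, 7),
    EffortRow("TeamM", 275, 10),
]


def summarize(rows):
    """Map each system to (effort saved, effort reduced rounded to 1 decimal)."""
    return {r.system: (r.effort_saved, round(r.effort_reduced, 1)) for r in rows}


if __name__ == "__main__":
    for system, (saved, reduced) in summarize(ROWS).items():
        print(f"{system}: {saved} fewer recommendations ({reduced}%)")
```

Under these assumptions the sketch gives 847, 116 and 265 recommendations saved for JabRef, Jittac and TeamM, with reductions of -99.3%, -94.3% and -96.4%, matching the reported values.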