Improving the Structural Size Measurement Method Through the Assessment of Nested (Multi-Level) Control Structures in UML Sequence Diagram Hela Hakim1, Asma Sellami1, Hanêne Ben Abdallah2, and Alain Abran3 1 Mir@cl Laboratory, University of Sfax, ISIMS, BP 242. 3021.Sfax-Tunisia. 2 Higher Colleges of Technology, Dubai, UAE 3 École de Technologie Supérieure – ETS, University of Quebec (Montréal, Canada) hakim.hela@yahoo.fr, asma.sellami@isims.usf.tn, hbenabdallah@hct.ac.ae, alain.abran@etsmtl.ca Abstract. The COSMIC ISO 19761 method is used in software industry as a contributor to estimation improvements and for comparability across projects. The COSMIC method is based on the identification of data movements but it does not consider the data manipulations. To take into account data manipulations associated with data movements at a detailed level of granularity, the measurement of control structures, also referred to as structural size, has been suggested previously. While the previous work focused on the use of constructs (alt, opt, and loop) for one structural level, the multi-level had not been considered. This work proposes refinements to our previous structural size measurement method through the assessment of the nested (multi-level) control structures in the sequence diagram. This refined method proposes detailed measures in different situations where the three constructs can be nested. These measures can be very useful for the software project planning that requires better effort estimation. A web site case study “Digital-Training Center” is used to illustrate and apply the proposed measurement algorithms. Keywords: Functional size measurement, Structural Size Measurement SSM, COSMIC, ISO 19761, combined fragments, nested combined fragments, mutli- level. 1 Introduction The software development industry as a whole has a disappointing track record when it comes to completing a project on time and within budget. The Standish Group well- known Chaos Report confirms that only 32% of software development projects are completed successfully within the estimated schedule and budget [21]. Software developers are constantly under pressure to deliver on time and on budget. As a result, many projects focus on delivering functionality at the expense of meeting the details of functionality, defined as the different scenarios describing one functionality [20]. When software details become visible and clients’ demands on software quality increase, functionality details can no longer be considered of secondary importance. Many systems fail or fall into disuse precisely because of inadequacies in these details. In an object oriented technology such details are modeled in an UML sequence diagram and some others diagrams related to modeling functionality. In practice, UML sequence diagrams notation is considered as the most popular diagram [7]. "Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)" Early in the software project lifecycle, details of the specific requirements of the software to be built as well as the staffing needs and other project variables are unclear. The variability in these factors contributes to the uncertainty of project effort estimates. As the sources of variability are further investigated and pinned down, the variability in the project decreases, and thus the variability in the project effort estimates can also decrease. A number of measurement methods (e.g., Mk II [3], NESMA [4], IFPUG [2], FISMA [5], and COSMIC [1]) are proposed for sizing software functionality and for estimation purposes. Although the design of COSMIC Functional Size Measurement method has been proposed to overcome the shortcomings of the first generation methods, it does not separately assess the data manipulations associated with data movements. For a finer level of granularity of measurement, the structural size measurement with a focus on the nested multi-level control structure has been proposed: it provides development team and managers with a more detailed software size measurements and could lead to a more accurate level of estimates. This work proposes an improvement (refinement) of our previous structural size measurement method for sizing detailed requirements expressed in the form of UML sequence diagrams. In this paper, we focus on how to assess the nested combined fragments (such as opt, alt and loop) within a sequence diagram. The remainder of the paper is organized as follows: Section 2 presents an overview of the COSMIC method, the Structural Size Measurement method and the UML sequence diagram and its Combined Fragments. It also discusses some related works. Section 3 describes the refined Structural Size (SS) Measurement method. Section 4 illustrates its application combined with the COSMIC through the “Digital-Training Center” case study. Section 5 discusses and presents some limitations of our work. Finally, section 6 concludes the presented work and outlines some of its possible extensions. 2 Background 2.1 COSMIC- FSM: ISO 19761 The COSMIC Functional Size Measurement (FSM) method is becoming popular because of its ability to size different types of software (e.g., business application, real- time software, embedded software, mobile apps, neural networks, etc.) [1]. COSMIC measures the software size from the Functional User Requirements (FUR) in terms of COSMIC Function Point (CFP) units. Some measurement concepts need to be understood before starting the use of COSMIC method: the FURs describe a set of data movements that move data groups consisting of one or more data attributes to and from functional processes. Functional processes are the behavior of the software as viewed by the functional users. A functional user is either a human, an external device or another software that sends or receives data described in FUR. A boundary acts as an interface between the functional users and the software to be measured. In fact, four data movement types occur between functional users, functional processes and persistent storage: Entry (E): The functional user sends data to the functional process through the boundary. eXit (X): The data is sent from the functional process to the user through the boundary. Read (R): The data is moved from persistent storage through the functional process. Write (W): The data is moved to persistent storage through the functional process. A Functional process always has at least one entry data movement with either a Write (W) to persistent storage or an exit (X) data movement. This will account for at least two data movements and will be sized as two CFP. Thereafter, these data movements are used to get better insights into sizing the software product. 2.2 Structural Size Measurement Method In our previous work [20], we proposed the Structural Size Measurement (SSM) method. It was designed by following the measurement process recommended in [8]. The main reason for creating such method was the need of detailed measures to quantify data manipulations. As in the COSMIC method where the software functional size is derived by quantifying the FUR [9], the structural size is derived by quantifying the FUR at a detailed level (e.g., at the structural level). The proposed SSM is applied on the combined fragments of a sequence diagram to measure its structural size (SS). This SS, also named control structural size, refers to the structural size of both conditional control structures CCS and iterative control structures ICS, described respectively through the alt, opt, and loop constructs. The SS of a sequence diagram is defined at a fine level of granularity (i.e., the size of the flow graph of their control structures). The use of SS requires the identification of two types of data manipulation depending on the structure type CCS (alt and opt combined fragments in the flow graph) and-or ICS (loop combined fragment in the flow graph). Each data manipulation is equivalent to 1 CSM (Control Structure Manipulation) unit. The sequence structural size is computed by adding all data manipulations identified for every flow graph. 2.3 UML Sequence Diagram and its Combined Fragments The UML Sequence diagrams are a popular dynamic modeling solution in UML because they specifically focus on lifelines, or the processes and objects that live simultaneously, and the messages exchanged between them to perform a function before the lifeline ends [7]. Sequence diagrams can be useful references for software developers who must perform detailed design for each software component, and at different levels of details. Basically, a sequence diagram is drawn to:  represent the details of a UML use case;  model the logic of a sophisticated procedure, function, or operation;  see how objects and components interact with each other to complete a process;  plan and understand the detailed functionality of an existing or future scenario. In sequence diagrams, combined fragments are logical groupings, represented by a rectangle, which contain the conditional structures that affect the flow of messages. A combined fragment contains interaction operands and is defined by the interaction operator. The type of combined fragment is determined by the interaction operator in which the type of logic or conditional statement are identified. The interaction operator defines the behavior of the combined fragment that can be used to describe several control and logic structures in a compact and concise manner. A combined fragment can also contain nested combined fragments or interaction containing additional conditional structures that represent more complex structures that affect the flow of messages. The combined fragments opt, alt, and loop are summarized in Table 1. Table 1. Description of Alt, opt, and Loop combined fragments Fragment Description types OPT The combined fragment “opt”: encloses a sequence that might happen or not. The condition under which it occurs can be specified in the guard. ALT The combined fragment “alt”: Contains a list of fragments that contain alternative sequences of messages. Only one sequence occurs on any occasion. A guard in each fragment indicates the condition in which it can run. A guard of ELSE indicates a fragment that should run if no other guard is true. If all guards are false and there is no ELSE, then none of the fragments executes. LOOP The “loop” combined fragment repeats the condition as it is indicated in the guard a number of times. The “loop” combined fragments have the properties Min and Max that indicate a minimum and maximum number of times to be repeated by the fragment. The default is not a restriction. 2.4 Related Work Over the past two decades, researchers have proposed object-oriented software metrics to measure the quality of software design and improve the productivity [10]: for example, the CK metrics, MOOD Metrics, etc. [19, 18, 13, 12, 11, 14, 15]. The CK metrics suite can be used for measuring the software complexity [17] serving both as an analyzer and a predictor. To develop better quality software, it is necessary to identify the complexity at module, method and class level (e.g., coupling, cohesion and inheritance that have an effect on complexity). Most of these metrics studies have focused on object-oriented design and are limited only to the static aspect of design (class diagram) size. In this paper, we focus on how to measure the dynamics aspect of software design (e.g., sequence diagrams) at a fine level of granularity. In other words, measuring the sequence diagram structural size not only at one level (as it is described in our previous work [20]), but also at the multi-level where combined fragments (opt, alt, loop) are nested. 3 Improving the Structural Size Measurement Method through the Assessment of Nested Control Structures This section presents how the structural size measurement method can be improved by taking into account the in-depth nested (multi-level) control structures in the Sequence Diagram (SD). For this purpose, we propose different algorithms for assessing the nested (multi-level) combined fragments (alt, opt, and loop). In our earlier work we presented a structural size method for measuring data manipulation expressed by the combined fragments opt, alt and loop based on Sequence diagram descriptions. In this section, we extend our earlier work as follows:  First, we classify the nested Combined Fragments into three levels and each level is further divided into several cases.  Second, we propose different algorithms to support the nested multi-level combined fragment in the sequence diagrams and its corresponding flow graph. 3.1 Classifying the Combined Fragments at Different Levels (Muli-Level) Before sizing the sequence diagram containing the multi-level nested control structures in terms of CSM units, it is important for measurers to distinguish the combination of nested control structures’ categories in each level that require more attention. This is to avoid misinterpretation of measurement results. There are many more cases to be identified for classifying the combined fragments into categories at a multi-level hierarchy (including the following three cases). For simplicity sake, we will focus only on the two first categories. 1. Case when one type of the SD control structures is used at all levels. Note that the number of levels can be one or more. 2. Case when two types of the SD control structures are used at all levels. 3. Case when all the above SD control structures are used at all levels. 1st Category: A Single Control Structure Type used at all Levels This category includes the alternative ALT Combined Fragment with several alternatives, the choice OPT Combined Fragment, and the iterative LOOP Combined Fragment.  If the alternative ALT Combined Fragment contains or not one or more nested blocks having the same control structure type (ALT Combined Fragment nested in an ALT Combined Fragment), the following situations should be distinguished: − All levels have n sequence flows and each alternative has nested ALT Combined Fragment (where n is a constant number) (See Algorithm 1.1) − All levels have n sequence flows and some alternatives do not have nested ALT Combined Fragment (See Algorithm 1.1) − 1stlevel has n sequence flows and all other levels have p sequence flows (with p! = n) (See Algorithm 1.2) − 1stlevel has n sequence flows and only some other levels have p sequence flows (with p! = n) (See Algorithm 1.2) − Each level has different number of sequence flows  If the choice OPT Combined Fragment (two sequence flows where one branch is connected to an end event) and all the nested blocks have the same control structure type (OPT Combined Fragment). − Each level contains OPT Combined Fragment restricted to two sequence flows where one sequence flow is linked to end event (because it contains always a single path) (See Algorithm 1.3).  For an iterative LOOP control structure, we denote the case when each level includes a number of iterative control structures (the number of iterations is arbitrary). The following situations can be observed: − Loop with n iterations at all levels (See Algorithm 1.4) − Loop with n iterations in the first level and p iterations in the next levels (See Algorithm 1.5) − Loop with different iterations in each level (See Algorithm 1.6) 2nd Category: Two different Control Structures used at all Levels This category may include the following cases: 1stcase:  If the Alt combined fragment in the first level is followed by a nested OPT Combined Fragment, the following situations are considered: − All levels having OPT Combined Fragment (See Algorithm 2.1) − Some levels having OPT Combined Fragment in which the following cases are established: − Each alternative in the first level contains the Opt combined fragment and the other levels may have or not the OPT Combined Fragment (See Algorithm 2.2) − Each alternative may or not have the OPT Combined Fragment in all levels (See Algorithm 2.3) 2ndcase:  The OPT Combined Fragment in the first level is followed by a nested ALT Combined Fragment.  The OPT Combined Fragment in the first level is followed by a nested LOOP Combined Fragment (See Algorithm 2.4). 3rd case:  The LOOP Combined Fragment in the first level is followed by a nested OPT Combined Fragment.  The LOOP Combined Fragment in the first level is followed by a nested ALT Combined Fragment. 3rd Category: All different Control Structures Type are used at all Levels In this category, all possible combinations of the different control structures can be used. 3.2 Sizing the Nested (Multi-Level) Control Structures in the Sequence Diagram With the information from the sequence diagram and its corresponding flow graph, measurers can quickly identify the combined fragments at different levels (including the nested level). Thereafter, by applying the proposed algorithms (See Tables 2 and 3), the detailed structural size of a sequence diagram can be generated. Let:  SDm be a Sequence Diagram containing a nested (multi-level) combined fragments,  GSS be its corresponding flow graph,  GSS(SDm) be the graph-based structural size function expressing the Structural Size of the whole Sequence diagram, and  GSS(F) be the graph-based structural size function expressing the Structural Size of the combined fragment within the Sequence diagram. The sequence diagram structural size is derived as the sum of the Structural size of all the combined fragments (multi-level) as shown by equation (1): 𝑆𝑆(𝑆𝐷𝑚) = ∑𝑛𝑖=1 𝑆𝑆(𝐹𝑖 ) (1) where:  n is the number of combined fragments Fi in the SDm,  SS(Fi): the Structural size of a fragment Fi Since each fragment represented in a SD can be itself decomposed into a new fragment that refines it, each Fi can contain (or not) one or more nested control structures. The structural sizes of these fragments are derived by using algorithms as illustrated in section 3.2.2. Note that: − GSSic(F) is the (sub)graph representing the structural size of the iterative control structure SS(ICS), − GSScc(F) is the (sub) graph representing the structural size of the conditional control structure SS(CCS) that may include alt combined fragment and-or opt combined fragment. To measure the structural size of a fragment, we propose algorithms based on the defined strategy. Proposed algorithms for the 1st category. When a single control structure type is used at all levels, we propose to use Algorithm 1.1 and Algorithm 1.2. Algorithm 1.1 represents how sizing an ALT Combined Fragment nested within another ALT Combined Fragment in the same SDm, where all levels have n alternatives (with n a fixed number). Algorithm 1.2 illustrates the ALT Combined Fragment nested within another ALT Combined Fragment where the 1st level has n alternatives and all other levels have p alternatives (with p! = n). Proposed algorithms for the 2nd category When two types of the SD control structures are used, we propose to use Algorithm 2.1, Algorithm 2.2 and Algorithm 2.3. 1st case: Algorithm 2.1 expresses the case when the Alt combined fragment is followed by Opt combined fragment in each level. While Algorithm 2.2 and 2.3 represent the case when the alternative Alt combined fragment is followed by some Opt combined fragments, this situation represents two cases:  Algorithm 2.2 represents that each sequence flow in the Alt combined fragment (i=0) contains Opt combined fragment, and other alternatives in the followed levels may contain or not an Opt combined fragment.  Algorithm 2.3 represents that each alternative in the Alt combined fragment (i=0) may or not have choice Opt combined fragment in all levels (Algorithm 2.3). Table 2. Algorithms and measurement formulas for the 1st category Algorithm 1.1. ALT combined fragmentnested within anotherALT combined fragment in the SDm (All levels have n alternatives (with n a fixed number) and each node has a nested Combined Fragment F Begin inti=0; // i = [0,1] i is the root or the1stlevel alt=n; // n is the number of nodes in each level i i=1; int j=1// j = [1..h] where h is the number of nodes in the 1stlevel i=1; //Xij For each level i For each node xij out-degree (xij) = n 𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 ) = GSScc(𝑥ij ) End for Flow graphs modeling Algorithm 1.1 i++; int m = [1..d]; where d is the number of nodes in the second level(i=2) For each level i For each node yim out-degree (yim) = n 𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) = GSScc(𝑦im ) End for End for 𝑑 𝑆𝑆(𝐹) = ∑ (𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 )/𝑛)𝑖 ) 𝑚=1 = n*d =n*ni =ni+1 End for End Note that d= ni if there is a nested Alt combined fragment in each alternative SS(F)= ni+1 Else 𝑑 𝑆𝑆(𝐹) = ∑ (𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 )/𝑛)𝑖 ) 𝑚=1 Algorithm 1.2 Alt combined fragment nested in an Alt combined fragment: 1st level has n alternatives and all other levels have p alternatives (with p! = n) Begin int i=0 ; // i=[0,1] i is the root or the1stlevel alt=n; // n is the number of nodes in each level i i=1; int j=0// j = [0..h] where h is the number of nodes in the first level i=1;//xij For each level i For each node xij out-degree (xij) = n 𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 ) = GSS(𝑥ij ) End for i++; Flow graphs modeling Algorithm1.2 int m = [1..d]; where d is the number of nodes in the second level (i=2) For each level i For each node yim out-degree (yim) = p 𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) = GSS(𝑦im ) End for End for 𝑑 𝑆𝑆(𝐹) = ∑ (𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 )/𝑛)𝑖 ) 𝑚=1 = (p*d) CSM End for End 𝑑 𝑆𝑆(𝐹) = ∑ (𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 )/𝑛)𝑖 ) 𝑚=1 Algorithm 1.3 Opt combined fragmentand all the nested blocks have the same control structure type (Opt combined fragment) Begin int i=0; // i = [0, l] i is the root or the1stlevel alt=n=1; // n is the number of nodes in each level i i=1; For each level i For each node xi out-degree (xi) = n=1 𝑆𝑆𝑐𝑐 (𝑥𝑖 ) = GSS(𝑥𝑖 ) End for Flow graphs modeling Algorithm 1.3 End for i++; For each level i For each node yi out-degree (yi) = 1 𝑆𝑆(𝑦𝑖 ) = GSS(𝑦𝑖 ) End for End for 𝑆𝑆(𝐹) = (𝑆𝑆𝑐𝑐 (𝑦𝑖 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖 )/𝑛)𝑖 ) = 1 CSM End Algorithm 1.4 Each level contains iterative control structures (LOOP Combined Fragment) where the number of iterations is fixed (r) inti=0; // i = [0..l] where i is the root or the1stlevel int itr=r; // r is the number of iterations i=1; For each level i having r number of iterations For each node xi out-degree (xi) = r 𝑆𝑆𝑖𝑐 (𝑥𝑖 ) = GSS(𝑥𝑖 ) End for Flow graphs modeling Algorithm 1.4 i++; For each level i having r number of iterations (r is a fixed number equal to the fixed number in the previous level) For each node yi out-degree (yi) = r 𝑆𝑆𝑖𝑐 (𝑦𝑖 ) = GSS(𝑦𝑖 ) End for End For 𝑆𝑆(𝐹) = 𝐺𝑆𝑆(𝑥𝑖 ) ∗ 𝐺𝑆𝑆(𝑦𝑖 ) = 𝑟 𝑖 End for Algorithm 1.5 Each level contains iterative control structures (LOOP Combined Fragment) where the number of iterations is fixed in the first level and s iterations in the next levels inti=0; // i = [0,1] i is the root or the1stlevel int it r= r; // r is the number of iterations in the first level int itr= s; // s is the number of iterations in other levels i=1; For each level i having r number of iterations For each node xi out-degree (xi) = r 𝑆𝑆𝑖𝑐 (𝑥𝑖 ) = GSS(𝑥𝑖 ) End for Flow graphs modeling Algorithm i++ 1.5 For each level i having s number of iterations which is different to the previous level For each node yi out-degree (yi) = s 𝑆𝑆𝑖𝑐 (𝑦𝑖 ) = GSS(𝑦𝑖 ) End for End For 𝑆𝑆(𝐹) = 𝐺𝑆𝑆(𝑥𝑖 ) ∗ 𝐺𝑆𝑆(𝑦𝑖 ) = 𝑟 ∗ 𝑠 𝑖 End for Algorithm 1.6 Each level contains iterative control structures (LOOP Combined Fragment) where the number of iterations is different from each level int itr=r; // r is the number of iterations in the first level int itr=s, t; // s and tare respectively the number of iterations in the next levels i=1; For each level I having r number of iterations For each node xi Flow graphs modeling Algorithm 1.6 out-degree (xi) = r 𝑆𝑆𝑖𝑐 (𝑥𝑖 ) = GSS(𝑥𝑖 ) End for i=2; For each level i having s number of iterations (s is different from the previous level) For each node yi out-degree (yi) = s 𝑆𝑆𝑖𝑐 (𝑦𝑖 ) = GSS(𝑦𝑖 ) End for End For i++; For each node yi out-degree (yi) = t 𝑆𝑆𝑖𝑐 (𝑦𝑖 ) = GSS(𝑦𝑖 ) End for 𝑆𝑆(𝐹) = 𝐺𝑆𝑆(𝑥𝑖 ) ∗ 𝐺𝑆𝑆(𝑦𝑖 ) =𝑟∗𝑠∗𝑡 End for Table 3. Algorithms and measurement formulas for the 2nd category Algorithm 2.1: Alt combined fragment in the first level is followed by nested opt Combined Fragment F in all levels Begin inti=0 ; // i=0…l is the number of levels Alt=n ; // n is the number of nodes in the level i=1; int j=1// j=1..h where h is the number of nodes in 1st level For each level i having choice Opt combined fragment For each alternative in level i For each node xij out-degree (xij) = 1 𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 ) = GSS(𝑥ij ) Flow graphs modeling End for Algorithm 2.1 i++ int m=1….d; where d is the number of nodes in 2nd level For each level i For each node yim out-degree (yim) = 1 𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) = GSS(𝑦im ) End for End For 𝑑 𝑆𝑆𝐹 = ∑ (𝑆𝑆𝑐𝑐 (𝑦𝑖𝑚 ) ∗ (𝑆𝑆𝑐𝑐 (𝑥𝑖𝑗 ))𝑖 ) 𝑚=1 𝑃 𝑆𝑆𝐹 = 𝑆𝑆𝑐𝑐 = ∑(GSS(𝑥𝑖 )+GSS(𝑦))𝑖 𝑖=1 End for 𝑛 𝑃 𝑆𝑆𝑠𝑒𝑞 = ∑ (∑(GSS(𝑥)+GSS(𝑦))𝑖 ) 𝑗=1 𝑖=1 𝑗 End for End Algorithm 2.2: Each sequence flow in the Alt combined fragment (i=0) contains Opt combined fragment, and other alternatives in the followed levels may have or not an Opt combined fragment Begin Int i=0 ; // i is the number of levels outdegree(x)=n; j=1..n ; //j number of nodes from 1 to n i=1..p; i//number of levels from 1 to p Alt=n ; // n is the number of nodes For (j=1, j