COSMIC Light vs COSMIC Classic Manual: Case Studies in Functional Size Measurement

Tuna Hacaloglu1,2, Huseyin Unlu3, Onur Demirors3, Alain Abran4

1 Middle East Technical University, Ankara, 06531, Turkey
2 Atilim University, Ankara, 06830, Turkey
tuna.hacaloglu@atilim.edu.tr
3 Izmir Institute of Technology, Izmir, 35430, Turkey
[huseyinunlu,onurdemirors]@iyte.edu.tr
4 École de Technologie Supérieure, Montréal, Canada
Alain.Abran@etsmtl.ca

Abstract. Functional size has been used in software engineering for more than 40 years. When measured early in the software development life cycle, it can serve as direct input for effort estimation. The COSMIC Functional Size Measurement (FSM) method developed by the Common Software Measurement International Consortium (COSMIC) is the latest ISO-compliant functional sizing method. A streamlined manual titled “Software Development Velocity with COSMIC Function Points” summarizes the measurement process and shortens the learning time. The aim of this study is to compare the classic COSMIC FSM manual and this new “light” manual in terms of the accuracy of the resulting functional size measurements when applied to case studies. The findings show that use of the light manual results in accurate measurement. In addition, there were no significant time differences between the two. With respect to the variations in COSMIC Function Points (CFP) values in the two case studies, three causes were identified: the Object of Interest (OOI) concept and corresponding data groups, details regarding Functional Process Independence, and Error/Confirmation messages, all related to the scope of the information included in the manuals.

Keywords: COSMIC, Function Points, Software Measurement, Size, ISO 19761

1 Introduction

Software size measurement is a significant activity in software project management, as it is the main input for effort and schedule estimation, which provides a key business advantage [1, 2].
Successful effort estimation may reduce potential risks such as schedule and budget overruns. Reliable software size measurement is therefore of critical importance.

Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Functional size measurement (FSM) has been one of the most widely used size measures for more than 40 years [3]. Because it is based on standardized methods, FSM provides an objective size measure for establishing estimation models, internal process improvement, and benchmarking [4–6]. Functional size can also be utilized for keeping track of changes in a project’s scope, establishing an agreement between the supplier and acquirer, and improving the organization’s processes by normalizing performance and quality measures [7].

COSMIC [8] is one of the most commonly used [9] FSM methods accepted by ISO [10]. The measurement process of COSMIC is based on counting data movements, namely Entry (E), Exit (X), Read (R) and Write (W), in functional processes that are triggered by functional users. There are various measurement guidelines with varying levels of detail for measuring the size of software in different domains such as business application software [11], real-time software [12], and service-oriented software [13]; some of these are also supported with case studies. All of these guidelines are based on the “COSMIC Measurement Manual” [8], which provides the rules and definitions in “The COSMIC Implementation Guide for ISO/IEC 19761:2017” [10] as well as further guidance and examples to help measurers fully understand how to apply the method. However, because it describes the measurement process in detail, this manual, at 115 pages, can be overwhelming for software developers, especially if they are just at the learning stage or have limited time available.
Presenting the method in a concise manner could speed up the learning process and encourage its use early on in the estimation process. Accordingly, Abran [14] has recently published “Software Development Velocity with COSMIC Function Points”, which presents a “light” version of COSMIC Function Points (FPs) and can be used to size the software functionality that can be delivered within a given time period. This new manual summarizes the COSMIC Measurement Manual in only 13 pages. Its aim is to facilitate the learning process for software developers being introduced to COSMIC as well as for those already using it.

However, regardless of which manual is chosen for learning COSMIC, the resulting size measurements must be consistent. In the literature, only a handful of studies [15, 16] compare the effect of measurers’ understanding and assumptions on measurement accuracy. There is no study comparing the measurement accuracy resulting from the use of different manuals of the same FSM method.

In light of this situation, the aim of the present study is to explore how using either the official “COSMIC Measurement Manual” (Manual 1) or “Software Development Velocity with COSMIC Function Points” (Manual 2) affects measurement results. It reports on the results of two business application case studies where two projects were measured by two measurers using Manual 1 and Manual 2. We compare the calculated COSMIC Function Points (CFP) values and the effort actually spent on measurement, and we discuss the causes of the differences in measurement results. Our findings reveal the strengths and difficulties of each of the two manuals and also suggest possible improvements to both.

The remainder of this paper is structured as follows: Section 2 presents the research methodology and the cases. In Section 3, the results obtained are presented in terms of the calculated CFP values and effort spent, and the causes of the differences in measurement results are discussed.
Section 4 presents the threats to validity, and finally Section 5 presents the conclusion and possible future work.

2 Research Methodology

The aim of this study is to investigate how using either Manual 1 (“COSMIC Measurement Manual”) or Manual 2 (“Software Development Velocity with COSMIC Function Points”) affects measurement results. In other words, we wanted to evaluate the effect of two different representations of the measurement procedure on measurement results, considering the accuracy of CFP values, the causes of the differences, and the effort spent on the measurement process. To achieve this aim, the case study research method was adopted. The following subsections present the case study details.

2.1 Case Study Design

To explore the effect of using either Manual 1 or Manual 2 on the measurement results, the following research questions were formulated:

RQ1. Is there a difference between CFP values when measuring using Manual 1 versus Manual 2?
RQ2. What are the causes of the difference in measurement results?
RQ3. How does the effort spent on measurement differ when using Manual 1 versus Manual 2?

Case Selection Criteria. In selecting the cases, the first criterion was to select projects that the measurers were not familiar with, in order to ensure objectivity in the measurement process. We therefore selected cases from two different business application projects. Another criterion was the measurers’ ability to understand the project domain; for this reason, we chose a regular business application type of software. The last criterion was the availability of functional requirements documentation at an understandable level of detail for both measurers.

Data Collection Procedure.
The measurers collected and recorded the CFP values obtained as a result of the measurement, the time (in minutes) spent measuring the functional size of the projects, and the areas where they made assumptions or needed more information to make a judgment within the measurement process.

Measurement Planning. The size of each project was to be measured by two measurers: Measurer 1 (M1) has three years of experience with the COSMIC FSM method, and Measurer 2 (M2) has one year. The measurers have not taken the certification exam; however, both have taken a formal COSMIC course and together they have measured more than 20 projects with over 5,000 function points. The measurement process was planned in two stages:

• In the first stage, M1 was to measure the functional size of Case 1 using Manual 2, and M2 was to measure it using Manual 1.
• In the second stage, the measurers were to perform cross-measuring such that M1 measures Case 1 based on Manual 1, and M2 measures it based on Manual 2.

Next, we planned the same setup for Case 2, but reversed: M1 measures Case 2 based on Manual 2, and M2 measures it based on Manual 1. In this way, each project was measured by each measurer and each of the manuals was used in the measurement process. Although both measurers had experience with COSMIC measurement and had already read both manuals, while measuring they would keep to the guidelines of the assigned manual (1 or 2). Finally, for consistency, both measurers were scheduled to start measuring each case at the same time. The start and end times of each measurement would also be recorded to calculate the amount of time spent on the measurement. In addition, the results were discussed by the measurers immediately after the measurements, so that all the differences and misunderstandings could be addressed and a consensus reached. The measurement plan is visualized in Figure 1.

Fig. 1. Measurement plan with the two case studies

2.2 Description of Cases

The two case studies selected were from a domain familiar to both measurers. Unfortunately, we cannot share the case details due to privacy concerns.

• Case 1 is a business application. Its documentation is written in a detailed way, similar to a complete Software Requirements Specification (SRS), and includes 13 use cases. The SRS document contains use case descriptions, user interfaces, and an entity-relationship diagram (ERD).
• Case 2 is also a business application. We selected the software analysis document from a single sprint. In contrast to Case 1, there were no use case descriptions or data models such as an entity-relationship diagram in this sprint documentation. Instead, the functional user requirements (FURs) were written as simple textual descriptions together with screen mock-ups.

2.3 Measuring the Case Studies

The measurement was performed using the functional user requirements and user interfaces included in the case documents as inputs. Following the measurement plan described in the previous section, the measurement for each case was performed according to the level of detail provided in the selected manual to identify the data movements (Entries, Reads, Writes, and Exits) and add them up to obtain the functional size of the software project. The effort spent was recorded by each measurer during each measurement process.

Upon completion of each measurement, the results were discussed in terms of effort, measured size, and guidance provided by the manual; this discussion also included an overall evaluation of the functionality of the two manuals. In addition, during these discussions, any functional user requirement that both measurers agreed was unclear was eliminated from the measurement.
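The step of identifying data movements and adding them up can be illustrated with a minimal sketch in Python. It relies on the COSMIC convention that each identified data movement contributes 1 CFP. The functional processes, data groups, and helper names below are hypothetical and are not taken from the case documents, which cannot be shared.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Movement(Enum):
    """The four COSMIC data movement types."""
    ENTRY = "E"
    EXIT = "X"
    READ = "R"
    WRITE = "W"


@dataclass(frozen=True)
class DataMovement:
    kind: Movement
    data_group: str  # hypothetical data group name, for illustration only


def functional_size(processes: dict[str, list[DataMovement]]) -> int:
    """Total size in CFP: one CFP per identified data movement."""
    return sum(len(movements) for movements in processes.values())


def size_by_type(processes: dict[str, list[DataMovement]]) -> Counter:
    """Number of data movements of each type across all functional processes."""
    return Counter(m.kind for ms in processes.values() for m in ms)


# Hypothetical functional processes (assumed, not from Case 1 or Case 2).
example = {
    "Create order": [
        DataMovement(Movement.ENTRY, "Order"),
        DataMovement(Movement.READ, "Customer"),
        DataMovement(Movement.WRITE, "Order"),
        DataMovement(Movement.EXIT, "Confirmation message"),
    ],
    "List orders": [
        DataMovement(Movement.ENTRY, "Order filter"),
        DataMovement(Movement.READ, "Order"),
        DataMovement(Movement.EXIT, "Order"),
    ],
}

print(functional_size(example))  # 7
```

Keeping the movements grouped by functional process, rather than storing only a total, makes it possible to compare two measurements requirement by requirement, as the measurers did in their discussions.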
3 Results and Discussion

3.1 General Findings

The measurement results of each measurer are presented in Table 1 for the two case studies and the measurement manual used. The size obtained with the detailed COSMIC Measurement Manual (Manual 1) is taken as the correct size measure. The correctness of the measurement has been verified by an independent measurer.

As shown in Table 1, there is no significant difference in terms of the amount of time spent between the two measurers when using different manuals for each of the two cases. Here it is worth repeating that Manual 1 has 5 chapters and 115 pages whereas Manual 2 has 4 chapters and 13 pages. However, even though different manuals, with very different levels of detail, were used, the time spent by the measurers is similar.

Table 1. Measurement effort and measured size for each case

Case     Measurement effort (minutes)            Measured size (CFP)
         Manual 1 (Heavy)   Manual 2 (Light)     Manual 1 (Heavy)   Manual 2 (Light)
Case 1   88                 93                   71                 85
Case 2   35                 48                   77                 116

Interestingly, even though Manual 1 is much more detailed than Manual 2, the measurers who used Manual 1 as the reference completed the measurement in less time in both cases. However, when we look at the functional size values in terms of CFP, in both cases the measurers who used Manual 2 found higher CFP values. These findings imply that measurers using Manual 2 (i.e., the light version of Manual 1) can:

• perform the measurement; and
• identify more functional size-related items, although it took more time in this specific study.

One of the reasons for the difference in CFP values observed in the measurement results of Case 2 is related to the EXIT data movement. For EXITs, both manuals specify the following rule: “An Exit accounts for all data manipulation to create the data group attributes to be output and/or to enable the data group to be output (e.g.
formatting and presentation manipulations) and to be routed to the intended functional user.” However, in Case 2, it can be seen that Measurer 1, using Manual 1, identified more EXIT data movements than Measurer 2, who used Manual 2. This is because in Case 2 there were requirements specifying that the reports displayed on the screen could be exported in Microsoft Excel format. The measurer who used Manual 1 considered this extra type of display as another functional process and counted it as an EXIT data movement. Measurer 2, on the other hand, did not define these as new EXITs. Consequently, the authors suggest that this issue needs to be clarified in both manuals.

Apart from these comparisons of CFP values and measurement effort, upon completing the measurements the measurers met to evaluate the measurement process and identify the causes of any variations in CFP values for each requirement in the two case studies. From these discussions, the causes of these variations were classified into three groups with respect to the detailed (Manual 1) versus summarized (Manual 2) manual:

1) Object of Interest (OOI) concept and data groups,
2) Functional Process independence, and
3) Error/Confirmation messages.

3.2 Variation in CFP Values Related to OOI and Data Groups

Object of Interest (OOI) is an important concept in the COSMIC FSM method, where a data movement can only be taken into account when it conveys a data group belonging to the same OOI. This concept is described in detail in Manual 1, which also refers to the Guideline for Sizing Business Application Software [11]. That guideline includes a special section encouraging measurers to use different data analysis methods such as Entity-Relationship Analysis, Relational Data Analysis, and UML Class Diagrams to identify OOIs. However, OOI is not mentioned in Manual 2. Moreover, the identification of persistent and transient data groups used to define OOIs is described in Manual 1 but is not mentioned in Manual 2.
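To illustrate why the OOI concept matters for counting, the following Python sketch contrasts two ways of counting the data movements of a single hypothetical functional process: per attribute, and per data group (one movement for all attributes of the same OOI moved in the same direction). This is a deliberately simplified reading of the rule stated above, not the full set of rules and exceptions in Manual 1; the OOI name, attribute names, and helper functions are assumptions made for illustration.

```python
from typing import NamedTuple


class AttributeMove(NamedTuple):
    kind: str       # "E", "X", "R" or "W"
    ooi: str        # object of interest the attribute describes (hypothetical)
    attribute: str  # attribute name (hypothetical)


def count_per_attribute(moves: list[AttributeMove]) -> int:
    """Naive count: every moved attribute is treated as its own data movement."""
    return len(moves)


def count_per_data_group(moves: list[AttributeMove]) -> int:
    """Simplified OOI-aware count: attributes of the same OOI moved in the
    same direction form one data group and count as one data movement."""
    return len({(m.kind, m.ooi) for m in moves})


# Hypothetical search process: two attributes of one OOI are entered,
# read from persistent storage, and displayed.
moves = [
    AttributeMove("E", "Customer", "name"),
    AttributeMove("E", "Customer", "city"),
    AttributeMove("R", "Customer", "name"),
    AttributeMove("R", "Customer", "city"),
    AttributeMove("X", "Customer", "name"),
    AttributeMove("X", "Customer", "city"),
]

print(count_per_attribute(moves))   # 6
print(count_per_data_group(moves))  # 3
```

Under this simplification, a measurer who does not group attributes by OOI reports a larger size for the same requirement, which is one way two measurements of the same functional process can diverge.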
Because OOI is covered only in Manual 1, when measuring Case 2, Measurer 1 paid special attention to the OOI concept and found more data movements while Measurer 2, who performed the measurement using Manual 2, did not consider this concept. In addition:

• In Case 2, one of the functional user requirements refers to a search with a filtering feature: Measurer 2, using Manual 2, added an extra CFP for each filtering feature, although each is an attribute of the same OOI. This led to greater CFP values for the Case 2 measurement based on Manual 2.
• In Case 1, the data analysis provided as an ERD in the requirements document helped Measurer 1 to identify the data groups without considering OOI.

3.3 Variation in CFP Values Related to Functional Process Independence

Another important component of COSMIC FSM is the Functional Process. Manual 1 gives a detailed definition of the Functional Process while Manual 2 offers only a short description. More specifically, in Manual 1, under the functional process identification section, “Independence of functional processes” is explained in detail whereas this detail is not mentioned in Manual 2. Manual 1 states that “… in the COSMIC method (as in all other FSM Methods) each functional process is defined, modeled and measured independently of, i.e. without reference to, any other functional process in the same software being measured…”. In addition, Manual 1 further specifies that “when a statement of a FUR is implemented in software, any "functional commonality" may or may not be developed as reusable software. All implementation decisions including the extent of actual or potential software reuse must be ignored when measuring functional size.”

The following observations apply to Case 2, where functional similarities had been observed:

• Complying with the guidance given in Manual 1, Measurer 1 included the functional similarities in his CFP calculation.
• Since this detail is not mentioned in Manual 2, Measurer 2 eliminated the functional similarities in the functional size measurement.

3.4 Variations in CFP Values Related to Error/Confirmation Messages

Manual 1 includes detailed guidance on error/confirmation messages, whereas in Manual 2 this is only briefly presented under the description of the EXIT data movement type. More specifically, Manual 1 [8] states that “If the FUR of the functional process does not require any type of error/confirmation message to be issued, do not identify any corresponding Exit”. Therefore, in Case 2, which did not have a FUR specifying the system behavior regarding error or confirmation messages:

• Measurer 1, by referring to Manual 1, did not identify or count any related data movement.
• Measurer 2 made an assumption and included the error/confirmation messages in the measurement process.

This lack of detail regarding error/confirmation messages in Manual 2 causes inconsistencies in the measurement. For better measurement consistency, this information could also be added to Manual 2. On the other hand, in Case 1, the system behavior regarding error or confirmation messages was specified in the use case descriptions. Thus, there were no major inconsistencies in Case 1 regarding error or confirmation messages.

3.5 Other Comments Regarding Measurement

During the case studies, some problems already stated in the literature were also observed, such as the importance of the quality of requirements documents [17–19], difficulty identifying OOIs [14] due to the lack of data modeling [20] or errors in the data model (in Case 1), and challenges due to individual assumptions and understandings [15].
For example:

• Even though both measurers were consistent within their own measurements, their understanding of the manuals may have differed: for instance, one of the measurers defined a separate EXIT for each confirmation and each error throughout the measurement whereas the other one counted them as pairs.
• The lack of data models and the paucity of information regarding requirements posed a challenge for Case 2: there was a combo box, but whether it retrieved its content from persistent storage or not was not specified in the requirements document. For this reason, the number of OOIs and the type of data movement involved were not clear. In addition, even though the Guideline for Sizing Business Application Software [11] specifies details related to combo boxes, neither Manual 1 nor Manual 2 includes such detail.

4 Threats to Validity

One threat to validity relates to measurer experience. Measurer 1 had three years of COSMIC experience while Measurer 2 had only one. They have not taken the certification exam; however, both have taken a formal COSMIC course and together they have measured more than 20 projects with over 5,000 function points. To mitigate the difference in measurer experience, which could have affected the calculated CFP values and the measurement effort, cross-measurement was used: each measurer measured one case based on Manual 1 and the other based on Manual 2. At the end of each case, the measurement results were discussed, and if there was a functional user requirement that was unclear to the measurers, it was eliminated. In addition, all measurements were verified by an established COSMIC measurer.

Another threat to validity is that both measurers had previous COSMIC knowledge and had previously measured different projects using Manual 1; this can affect the Manual 2 measurement results.
To limit the effect of Manual 1, the measurers applied only the rules in Manual 2 for the measurements based on that manual: for instance, “object of interest” is not explained in Manual 2, and the measurers did not consider this term in those measurements.

Our research methodology is described in Section 2, so that another researcher replicating this study can apply the same procedures.

5 Conclusion and Future Work

This study presented an experiment to explore the impact on measurement of using two distinct versions of the COSMIC FSM method [8, 14]. A major finding is that, when using Manual 2 as the reference, the measurers identified more data movements, i.e., higher CFP values. Therefore, it can be concluded that even though Manual 2 is the light version of Manual 1, it is still capable of yielding good measurement results.

Analysis of the detailed measurement results obtained while using these two different manuals showed that the measurement differences came from three concepts: the Object of Interest (OOI) concept and corresponding data groups, Functional Process independence, and Error/Confirmation messages. These differences arose in particular when the quality and completeness of the project documentation varied. The authors suggest the incorporation of details related to these concepts into Manual 2.

When the effort spent on measurement while using Manual 1 and Manual 2 was compared, no significant difference was noticeable. However, in both case studies, it took slightly more time to complete the measurement using Manual 2.

Manual 2, the light version of the COSMIC FSM method, is intended to facilitate and speed up the learning process for newcomers to the COSMIC method. Future work is needed to assess the usability and learnability of both manuals; for example, another experiment could be conducted with newcomers to the COSMIC FSM method trained using either Manual 1 or Manual 2.
In this way, both the learning time and the measurement results could be compared. In addition, effort estimation models could be developed and the effectiveness of each manual could be compared on the basis of the results they provide in terms of effort estimation. Future studies on questions such as “Would it be useful/essential to go through the 115 pages of Manual 1 once the recommended minor additions are made to Manual 2?” would also be valuable.

References

1. Salmanoğlu, M., Öztürk, K., Bağrıyanık, S., Ungan, E., Demirörs, O.: Benefits and challenges of measuring software size: early results in a large organization. Presented at the 25th International Workshop on Software Measurement and 10th International Conference on Software Process and Product Measurement, IWSM-Mensura (2015).
2. Abran, A.: Software project estimation: the fundamentals for providing high quality information to decision makers. John Wiley & Sons (2015).
3. Albrecht, A.J.: Measuring application development productivity. Presented at the Proc. Joint Share, Guide, and IBM Application Development Symposium (1979).
4. Abran, A., Desharnais, J.-M., Zarour, M., Demirörs, O.: Productivity-based software estimation models and process improvement: an empirical study. Int J Adv Softw. 8, 103–114 (2015).
5. Commeyne, C., Abran, A., Djouab, R.: Effort estimation with story points and COSMIC function points - an industry case study. Softw. Meas. News. 21, 25–36 (2016).
6. Trudel, S., Buglione, L.: Guideline for sizing Agile projects with COSMIC. Presented at the Proceedings of International Workshop on Software Measurement (2010).
7. Ozkan, B., Turetken, O., Demirors, O.: Software functional size: For cost estimation and more. In: European Conference on Software Process Improvement. pp. 59–69. Springer (2008).
8. COSMIC Measurement Manual Version 4.0.2. The Common Software Measurement International Consortium (2017).
9. Abran, A.: Automating Functional Size Measurement – a Survey. Presented at the UKSMA/COSMIC Conference 2011 - 22nd Annual Conference on Metrics and Estimating: hosted in collaboration with COSMIC (2011).
10. ISO/IEC 19761:2017 - Software Engineering – COSMIC: A Functional Size Measurement Method. International Organization for Standardization (2017).
11. Guideline for Sizing Business Application Software. The Common Software Measurement International Consortium (2017).
12. Guideline for Sizing Real-time Software. The Common Software Measurement International Consortium (2015).
13. Guideline for Sizing Service-Oriented Architecture Software. The Common Software Measurement International Consortium (2015).
14. Abran, A.: Software Development Velocity with COSMIC Function Points. The Common Software Measurement International Consortium (2019).
15. Top, O.O., Demirors, O., Ozkan, B.: Reliability of COSMIC functional size measurement results: A multiple case study on industry cases. Presented at the 2009 35th Euromicro Conference on Software Engineering and Advanced Applications (2009).
16. Turetken, O., Top, O.O., Ozkan, B., Demirors, O.: The impact of individual assumptions on functional size measurement. In: Software Process and Product Measurement. pp. 155–169. Springer (2008).
17. Trudel, S., Abran, A.: Functional Size Measurement Quality Challenges for Inexperienced Measurers. In: Abran, A., Braungarten, R., Dumke, R.R., Cuadrado-Gallego, J.J., and Brunekreef, J. (eds.) Software Process and Product Measurement. pp. 157–169. Springer, Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-642-05415-0_12.
18. Desharnais, J., Buglione, L., Kocaturk, B.: Improving Agile Software Projects Planning Using the COSMIC Method. Presented at the workshop on Managing Client Value Creation Process in Agile Projects (Torre Cane, Italy) (2011).
19. Desharnais, J.-M., Kocaturk, B., Abran, A.: Using the COSMIC Method to Evaluate the Quality of the Documentation of Agile User Stories. In: 2011 Joint Conference of the 21st International Workshop on Software Measurement and the 6th International Conference on Software Process and Product Measurement. pp. 269–272 (2011). https://doi.org/10.1109/IWSM-MENSURA.2011.45.
20. Hacaloglu, T., Demirors, O.: Measureability of Functional Size in Agile Software Projects: Multiple Case Studies with COSMIC FSM. In: 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). pp. 204–211 (2019). https://doi.org/10.1109/SEAA.2019.00041.