Selecting Suitable Software Effort Estimation Method

Duygu Deniz Erhan 1, Ayça Kolukısa Tarhan 2 [0000-0003-1466-9605], Rana Özakıncı 3 [0000-0002-7803-453X]
1,2,3 Hacettepe University, Department of Computer Engineering, Ankara, Turkey
1 duygudeniz06@gmail.com, {2 atarhan, 3 ranaozakinci}@hacettepe.edu.tr

Abstract. Effort estimation is one of the important factors affecting the success of software projects, and many effort estimation methods have been developed to support it. The reliability of the effort estimation of a project depends on choosing the method most appropriate for the project characteristics and the estimation context: even a well-performing method may yield inaccurate results if it is not appropriate to the project context. In this study, we propose an approach for selecting the most suitable estimation method for a software project by considering the project characteristics and the stakeholder needs. An expert-opinion survey was prepared based on the key features of the estimation methods most frequently referred to in the literature. The survey was answered by experts who have carried out scientific studies in the field of software effort estimation, and a decision matrix was created in the light of their opinions. Then, a questionnaire was built for eliciting information about project characteristics from an estimator who wants to carry out effort estimation for his/her project. With the decision matrix, the estimator can select the most suitable method for his/her estimation by answering the questionnaire. A sample study was conducted in which the questionnaire was answered using the ISBSG data set, and the appropriateness of the proposed approach was discussed.

Keywords: Effort Estimation, Software Effort, Estimation Method, Method Selection, Decision Matrix, Expert Opinion.

1 Introduction

Software effort estimation (SEE) is the process of predicting the amount of effort required to build a software system. Effective project planning requires effort and schedule estimates. To provide this benefit, the estimation process must be accurate and reliable, but this is a difficult task. To address this problem, many estimation methods have been proposed by researchers, and many of them have been shown to give successful results. Nevertheless, there is no estimation method that makes the most accurate estimates in all projects [1,2]. Estimation methods make successful estimations for projects that exhibit certain characteristics (organizational structure, type of project, development environment, etc.). It has been stated that the most successful effort estimation method can change for a given data set because different criteria are used [3]. Since the estimation method that gives accurate results will change even for different projects within the same organization, selecting methods on an organizational basis may not give correct results.
Although methods focusing on specific project features are continually proposed in the literature for more accurate estimates [4,5], it is necessary each time to analyze the attributes of the method, the project, and the environment, and even to use expert knowledge, to determine which method is suitable. An accurate effort estimation can be achieved only by selecting an estimation method that best matches the estimation context.

Several studies on selecting a suitable estimation method have been proposed, examining the project properties and the data sets to be used [4,5]. Although these studies have shown that the success of estimation methods can change according to the project characteristics and data set, they do not propose a general method for different types of projects and environments. The existing selection methods are not feasible for a new project: since the method is chosen according to the MMRE success of the estimations on past projects, the characteristics of the new project are not considered, so the chosen method may not suit it.

In this study, the project characteristics that affect the success of estimation methods were examined. For this purpose, an expert-opinion survey was prepared, and the relationship between project characteristics and estimation methods was studied. The survey drew on the knowledge of experts who have published scientific studies on software effort estimation. The aim was to determine the most suitable estimation method for given project characteristics and stakeholder needs by combining the data obtained from the expert-opinion survey with the answers to estimator questions. A Multiple Criteria Decision Analysis (MCDA) method was used to process the data from the expert-opinion survey into a decision matrix. Then, in an example estimation scenario based on the ISBSG dataset, a user was asked to answer the set of estimator questions prepared, and the most suitable estimation method was selected using the information about the new project to be estimated.

The rest of this paper is organized as follows. Section 2 provides background and related work on software effort estimation, method selection, and MCDA. Section 3 explains the proposed evaluation approach, the alternative estimation methods used in this study, and the method selection criteria; it also exemplifies how the decision matrix values are formed from the answers to the expert-opinion survey questions. Section 4 presents an example evaluation using the ISBSG data set and the related estimation assumptions, and discusses the feasibility of the proposal. Section 5 discusses the weaknesses of the proposed approach. Finally, Section 6 concludes the paper with a summary of this study and plans for future work.

2 Background and Related Work

2.1 Software Effort Estimation and Method Selection

Since effort estimation method selection is a major factor in estimation, there are many studies in the literature that examine and classify methods. At the same time, due to the importance of the criteria that affect the decision in method selection, there are many studies that investigate the various parameters that affect the methods. We chose the methods and the criteria used to prepare the expert-opinion survey based on the studies described below.

Wen et al. [6] conducted a systematic literature review of machine learning (ML) based software development effort estimation models.
They analyzed ML-based models from four aspects: type of ML technique, estimation accuracy, method comparison, and estimation context. After the review, the authors found that eight types of ML techniques are most used: Case-Based Reasoning, Artificial Neural Networks, Decision Trees, Bayesian Networks, Support Vector Regression, Genetic Algorithms, Genetic Programming, and Association Rules. They suggested that ML methods are usually more accurate than non-ML methods. They also listed the strengths and weaknesses of the ML techniques used in software effort estimation.

Jorgensen and Shepperd [7] provided a basis for studies on improving software development cost estimation. They explored the question "What are the most investigated estimation methods and how has this changed over time?" and, as a result, showed the distribution of articles on different estimation approaches per period and in total. They also made recommendations for future estimation research.

Marco et al. [8] conducted a systematic review of software effort estimation methods and reported that the number of studies on the subject is increasing. They also prepared a list of the best-performing methods and the most used methods. The most active and influential researchers were also identified in their paper; we invited these researchers to answer our expert-opinion survey.

Idri et al. [9] analyzed analogy-based SEE techniques according to several criteria and perspectives (estimation context, accuracy comparison, estimation accuracy, etc.) and concluded that more estimation techniques should be developed. They also stated that accuracy in effort estimation depends on several categories of parameters: the characteristics of the dataset used (size, missing values, etc.), the analogy process configuration (adaptation formula, feature selection, etc.), and the evaluation method used (n-fold cross-validation, disagreement, etc.).

Bilgaiyan et al. [10] reviewed software cost estimation in agile software development. They examined the settings in which different estimation methods are required to be successful and discussed the difficulties of the methods.

Shekhar and Kumar [11] made a comprehensive review of software effort estimation methods. They explained the working principles of many methods. In addition, by listing the advantages and disadvantages of the methods, they shed light on situations that are desirable or to be avoided in the use of the methods.

Chirra and Reza [12] tabulated the methods in software cost estimation based on their type, the amount of data required, the validation methods they use, and their weaknesses and strengths. They discussed detailed results about the methods from several perspectives, including type (algorithmic method, learning-oriented method, etc.), strengths, weaknesses, accuracy, data (limited, extensive, etc.), and validation (cross-validation, jackknife, etc.).

There are also studies that associate SEE method selection with various criteria and aim to structure it. We overview these studies below.

In 2012, Sehra et al. [4] proposed a method for selecting an effort estimation method based on the environment and the project type using the Fuzzy Analytic Hierarchy Process. They used reliability, mean magnitude of relative error (MMRE), prediction (Pred), and uncertainty criteria as inputs for their method. Their selected decision alternatives are Expert Judgement, COCOMO, and Fuzzy Neural Network based effort estimation methods.
In 2017, Bansal et al. [5] proposed fuzzy weighted distance-based approximation (WDBA) to solve the problem of selecting an effort estimation method based on MCDA. They found WDBA more effective than other MCDA solutions because it avoids complex matrix operations. They used magnitude of relative error (MRE), root mean square (RMS), prediction (Pred), root mean square error (RMSE), mean absolute relative error (MARE), variance absolute relative error (VARE), value accounted for (VAF), accuracy, reliability, uncertainty, and mean absolute error (MAE) as inputs to their method. They selected eleven algorithmic effort estimation methods as decision alternatives.

In 2015, Nayebi et al. [3] proposed an approach for selecting a machine learning effort estimation method for specific datasets. They selected nine machine learning methods as decision alternatives and used prediction (Pred), correlation coefficient (CRR), and the Bayesian Information Criterion (BIC) as inputs to their approach. They compared SEE methods based on these criteria by evaluating nine different datasets.

Ozakinci and Tarhan [13] aimed to identify the software defect prediction method that would give the most accurate results in the early phases of a project. In their study, the authors determined the criteria based on the project, data, and method features, considering the related studies in the literature. Then, they sent a survey to experts and asked them to evaluate the criteria against the prediction methods. Finally, using an MCDA tool, they prepared a questionnaire for users to choose the appropriate method for software defect prediction. Our study employs a similar approach, specific to software effort estimation.

2.2 Multi-Criteria Decision Analysis (MCDA)

Multi-Criteria Decision Analysis is a framework used to resolve important and complex decision-making situations [14]. MCDA is an "umbrella term to describe a collection of formal approaches which seek to take explicit account of multiple criteria in helping individuals or groups explore decisions that matter" [15]. Many MCDA methods have been proposed in the literature. The most well known are AHP (Analytic Hierarchy Process) [16], TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) [17], PROMETHEE (Preference Ranking Organization METHod for Enrichment Evaluations) [18], and ELECTRE (ELimination and Choice Expressing Reality) [19].

An MCDA decision-making mechanism works with the following steps [20]: 1) define the decision opportunity, 2) identify stakeholder interests, 3) build a decision framework, 4) rate the alternatives, 5) weight stakeholder interests, 6) score the alternatives, 7) discuss results, re-score, discuss again, and decide.

3 Evaluation Approach

The aim of this study is to provide estimators with a vehicle for selecting the best-fitting software effort estimation method, to enable more accurate effort estimation of software projects, which is an important step in software project planning. Accordingly, an evaluation approach based on Multi-Criteria Decision Analysis was created to select the most suitable software effort estimation method.
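To make this mechanism concrete before we detail our own alternatives and criteria, the sketch below illustrates steps 3 to 6 of the generic MCDA process (build a decision framework, rate the alternatives, weight the interests, score). It is a minimal illustration in Python; all names and numbers are invented and are not values from our decision matrix.

```python
# Minimal, generic MCDA weighted-sum sketch (illustrative values only).

# Step 3/4: decision framework -- alternatives rated against criteria in [0, 1].
ratings = {
    "MethodA": {"data_dependency": 0.9, "interpretability": 0.4},
    "MethodB": {"data_dependency": 0.2, "interpretability": 0.8},
}

# Step 5: stakeholder weights per criterion (how much each interest matters).
weights = {"data_dependency": 1.0, "interpretability": 0.5}

# Step 6: score each alternative as the weighted sum over criteria.
scores = {
    alt: sum(weights[c] * r for c, r in crit.items())
    for alt, crit in ratings.items()
}

best = max(scores, key=scores.get)
print(scores, "->", best)
```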
While applying the MCDA, the core elements were determined as follows:

• Problem: Estimating software project effort accurately
• Requirements: Developing a software effort estimation model considering project requirements, data, and environmental dynamics
• Goal: Selecting a SEE method that can best meet the requirements
• Criteria: Various aspects required to develop a software effort estimation model in relation to the project requirements
• Alternatives: Software effort estimation methods that can meet the requirements in accordance with the determined criteria
• MCDA Tool: An Excel-based decision matrix prepared using expert opinions

3.1 Alternatives and Criteria

Alternatives. There are many different classifications of estimation methods in the literature [21]. In this study, we selected the most common effort estimation methods appearing in classification and review studies. Although many review studies examine the methods of only one classification, we selected our alternatives by choosing methods from different classifications, paying attention to the methods most applied according to literature reviews. The methods chosen as alternatives in our study are as follows:

• Neural Networks (NN) [6,8,11]
• Case-Based Reasoning (CBR) [6,8]
• Linear Regression (LR) [7,8,10]
• Analogy Based (AB) [7,11]
• Expert Judgement (EJ) [7,10,11]
• Support Vector Regression (SVR) [6,8]
• Decision Trees (DT) [6,8]
• Bayesian Networks (BN) [6,8]

Criteria. While preparing the questionnaire, we determined the criteria that distinguish the SEE methods under evaluation. These criteria play a role in determining how well the requirements match the methods. They also aim to capture the basic properties of the methods and their compatibility with the project dynamics [13]. The criteria and related questions are shown in Table 1. The headings of the criteria used in the evaluation are explained below.

a) Approach to construct the method: This criterion defines the method's dependency on data when configuring the SEE method. A method either estimates effort using historical data, or produces estimates from other inputs independently of data.

b) Data characteristics: When creating the SEE method, the characteristics of the data are decisive for the method to be successful. Addressing the limitations of the data helps in choosing the right method. The sub-criteria determined for data characteristics are: type of input data, dataset size, and number of parameters.

c) Data quality: This criterion indicates the quality features of the data that will be used to construct the SEE method: uncertainty, missing values, and outliers. Uncertain data means that the data may be inaccurate, imprecise, untrusted, or unknown. Missing data for certain variables leads to poor estimations in some sensitive methods. Outlier data can also affect the choice of a suitable method; an outlier is an observation that lies at an abnormal distance from the other values in a dataset.

d) Method characteristics: This criterion defines the characteristics expected of the SEE method. The method should be interpretable, easy to use (not complex), speedy, maintainable, and adaptive. Interpretability indicates that the user can understand the cause of any result. Ease of use (not being complex) is the degree to which the method is not complicated in design.
Speed is the degree to which the method is built in a short time and performs fast in general. Maintainability is the degree to which the method is easy to manage over time. Being adaptive means that the method can accept new data without re-running the SEE method.

e) Project context: This criterion indicates the factors related to the context information of the project subject to SEE: iteration, domain, size, and project data type. Whether the software development life cycle is iterative is a factor affecting how the SEE method is built. Domain information is the expertise in the project area. Project size information is considered under the size criterion. Project data type represents the cross-project or single-project options; there are differences between these types in terms of project management and obtaining project information, and project data type was added as a criterion to capture whether this affects method selection.

In addition, the experts who answered the expert-opinion survey were asked to add any criteria that they thought would affect the choice of method, and any further methods that should be considered. They suggested that personnel parameters and project parameters be added to the evaluation criteria, and fuzzy logic, soft computing methods, and sequential model optimization as additional methods that should be considered in the evaluation. The experts also advised studying criteria and methods from the perspective of industry users, not only that of researchers.

Table 1. Criteria and related estimator questionnaire

3.2 Expert-Opinion Survey and Estimator Questionnaire

A well-defined expert-opinion survey that collects the data necessary to specify the characteristics of the effort estimation methods was designed and conducted. The survey consisted of questions that allowed us to determine the weight of the criteria defined above for the estimation methods. A group of experts who have published studies on software effort estimation was selected and asked to participate in the survey. The experts have been conducting academic studies in the field of effort estimation for a long time, as seen in Fig. 1. The survey was answered by eight experts and contained three question types: list selection (QT1), rating on a Likert scale (QT2), and Yes/No selection (QT3). For list selection, the possible answers are A, B, or both A and B. The Likert scale has the answer options very low, low, average, high, and very high. As an example, the answers to these three question types for the Expert Judgement estimation method are given in Table 2.

Fig. 1. Years of expertise in SEE and organization types of the experts

Table 2. Answers to the three question types for the Expert Judgement estimation method
(QT1: "Please select the convenient option on 'Approach to construct the SEE method' with the below methods."; QT2: "To what extent do you think the following methods are 'interpretable' by their users in SEE?"; QT3: "Do you think that iteration in the software development life cycle is an affecting factor in SEE with the following methods?")

Expert  QT1                       QT2          QT3
E1      Based on human judgement  Low          Yes
E2      Based on human judgement  High         Yes
E3      Can address both          Very Low     Yes
E4      Based on human judgement  Average      Yes
E5      Based on human judgement  Very High    Yes
E6      Based on human judgement  (no answer)  (no answer)
E7      Based on human judgement  High         Yes
E8      Based on human judgement  High         No

The decision matrix in Table 3 was created using the answers to the expert-opinion survey from the eight experts. The estimator questions and the weights in the decision matrix were derived from the survey results. Below, we explain the steps for generating and weighting three sample estimator questions (EQ) with respect to the three types of survey questions.

QT1. The questions "Do you want your method to be dependent on data?" (EQ1) and "Do you want to address human judgement?" (EQ2) were derived from QT1 of the expert-opinion survey. The weight of EQ1 (W_EQ1) is the number of "Dependent on data" and "Can address both" answers divided by the number of all answers to EQ1; similarly, the weight of EQ2 (W_EQ2) is the number of "Based on human judgement" and "Can address both" answers divided by the number of all answers to EQ2.

• W_EQ1 = (Count(Dependent on data) + Count(Can address both)) / Count(All EQ1 answers) = (0 + 1) / 8 ≈ 0.13
• W_EQ2 = (Count(Based on human judgement) + Count(Can address both)) / Count(All EQ2 answers) = (7 + 1) / 8 = 1

QT2. The question "Is it important that the SEE method has high interpretability?" (EQ12) was derived from QT2 of the expert-opinion survey. To determine its weight, values in the range [1-5] were assigned to the answers in the range [Very Low-Very High]. The total weight of EQ12 (W_Total-EQ12) was calculated by summing, over the answer options, each option's value multiplied by the number of experts who gave that answer. The weight of EQ12 (W_EQ12) was then obtained by dividing W_Total-EQ12 by the number of all EQ12 responses multiplied by the maximum value of 5.

• W_Total-EQ12 = 1 × Count(Very Low) + 2 × Count(Low) + 3 × Count(Average) + 4 × Count(High) + 5 × Count(Very High) = 1×1 + 2×1 + 3×1 + 4×3 + 5×1 = 23
• W_EQ12 = W_Total-EQ12 / (Count(All EQ12 answers) × 5) = 23 / (7 × 5) ≈ 0.66

QT3. The question "Do you prefer iteration in the software development life cycle?" (EQ17) was derived from QT3 of the expert-opinion survey, and its weight was determined by dividing the number of "Yes" answers by the number of all EQ17 answers.

• W_EQ17 = Count(Yes) / Count(All EQ17 answers) = 6 / 7 ≈ 0.86

The weights of the estimator questions were normalized to the range [0, 1] to ensure that no criterion dominates the others during the selection of an estimation method. In the calculation, the total number of answers actually given to each question was used, to eliminate the effect of questions left unanswered by some experts. In this way, the method selection is intended to be driven not by weight differences between the criteria, but by weight differences between the key features of the methods.

Table 3. Decision Matrix

An estimator questionnaire was derived from the expert-opinion survey answers. The questionnaire is intended for use by a project staff member who holds the role of estimator and wants to carry out accurate effort estimation for his/her project. The expert-opinion survey was filled in once by the experts, and the decision matrix was prepared from it.
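The following minimal Python sketch reproduces the three weight calculations above, using the Expert Judgement answers from Table 2. The function and variable names are ours, chosen for illustration; the snippet is not part of the study's tooling.

```python
from collections import Counter

# Expert Judgement answers from Table 2 (E6 skipped QT2 and QT3).
qt1 = ["Based on human judgement"] * 7 + ["Can address both"]
qt2 = ["Very Low", "Low", "Average", "High", "High", "High", "Very High"]
qt3 = ["Yes"] * 6 + ["No"]

LIKERT = {"Very Low": 1, "Low": 2, "Average": 3, "High": 4, "Very High": 5}

def weight_qt1(answers, favoured):
    # Fraction of experts whose answer matches the estimator's preference;
    # "Can address both" counts toward either preference.
    c = Counter(answers)
    return (c[favoured] + c["Can address both"]) / len(answers)

def weight_qt2(answers):
    # Likert answers mapped to [1, 5], normalized by the maximum possible
    # total so that the weight falls in [0, 1].
    return sum(LIKERT[a] for a in answers) / (len(answers) * 5)

def weight_qt3(answers):
    # Share of "Yes" answers among the answers actually given.
    return answers.count("Yes") / len(answers)

print(weight_qt1(qt1, "Dependent on data"))        # 0.125 (W_EQ1, reported as 0.13)
print(weight_qt1(qt1, "Based on human judgement")) # 1.0   (W_EQ2)
print(weight_qt2(qt2))                             # 0.657... (W_EQ12, ~0.66)
print(weight_qt3(qt3))                             # 0.857... (W_EQ17, ~0.86)
```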
Having the decision matrix prepared, the estimator can use it to select the most suitable estimation method for his/her needs by answering a number of estimator questions (EQ).

In the decision matrix shown in Table 3, the first column (QID) gives the identifier of the estimator question, the second column (Estimator Question) its description, and the third column (Answer Type) the way the question is answered: 'Multiple' is used for criteria elicited by answering more than one question, and 'Single' for criteria elicited by answering only one question. The remaining columns (Rating) give, for each estimation method, the weights calculated from the expert-opinion survey as detailed above.

In the estimation process, an estimator answers each estimator question by giving a value of 1 or 0. The answers are multiplied by the relevant method ratings, and the resulting scores over all questions are summed for each method to obtain the method scores; the higher a method's score, the more suitable it is for the estimation. Details of using the decision matrix in the estimation process are explained in the next section.
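As a sketch of this scoring rule: each estimator answer (1 or 0) gates the corresponding row of ratings, and the gated ratings are summed per method. In the snippet below, the Expert Judgement (EJ) ratings for EQ1, EQ2, and EQ12 are the weights computed in Section 3.2; all other values are invented placeholders, not the real entries of Table 3.

```python
# Decision-matrix scoring: binary estimator answers gate the rating rows,
# and gated ratings are summed per method.
ratings = {  # estimator question -> {method: rating in [0, 1]}
    "EQ1":  {"NN": 0.88, "CBR": 0.75, "EJ": 0.13},  # EJ value from W_EQ1
    "EQ2":  {"NN": 0.25, "CBR": 0.25, "EJ": 1.00},  # EJ value from W_EQ2
    "EQ12": {"NN": 0.40, "CBR": 0.70, "EJ": 0.66},  # EJ value from W_EQ12
}
answers = {"EQ1": 1, "EQ2": 0, "EQ12": 0}  # estimator's 1/0 responses

methods = {m for row in ratings.values() for m in row}
scores = {m: sum(answers[q] * row[m] for q, row in ratings.items())
          for m in methods}
print(max(scores, key=scores.get), scores)  # highest score = suggested method
```

Applied to all 21 estimator questions with the full matrix of Table 3, this rule yields the totals reported in Section 4 (e.g., 6.26 for NN and 6.20 for CBR).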
4 Example Evaluation

4.1 Data Set

The proposed approach was exercised on the ISBSG [22] dataset as an example. According to [9], the ISBSG dataset is widely used for software project estimation. The International Software Benchmarking Standards Group (ISBSG) maintains a data repository containing software project data from many organizations, and aims to provide organizations with a wide range of project data from many sectors. These data can be used for awareness of trends, effort estimation, productivity benchmarking, and comparing platforms and languages. The version of the ISBSG dataset that we used is Release 2016 R1.1.

4.2 Evaluation

The decision matrix described in Table 3 is detailed with an example evaluation in Table 4. The estimator questions were answered using the ISBSG dataset and a number of assumptions regarding the example estimation. To answer the questions, a project of a company was selected from the ISBSG dataset and its information was examined. Some of the questions were answered directly from the dataset, while others were answered by the first author according to hypothetical estimation needs, considering the company's project information. This information is shown in the Reason column of Table 4. The questions were answered as 1 for yes and 0 for no, in accordance with the estimation context.

The reasons for answering the questions are as follows. The ISBSG dataset is used for the example estimation (EQ1), and the estimation model is preferred not to be dependent on human judgement (EQ2); that is why the answer to EQ1 is Yes while that of EQ2 is No. Since there are categorical and numerical inputs in the ISBSG dataset, the answers given to EQ3 and EQ4 are Yes. The portion of the dataset that can be used for training is large, so the answer to EQ7 is Yes while the answers to EQ5 and EQ6 are No. The answer to EQ8 is Yes since other projects' information can be used. Uncertainty information will not be addressed in the estimation, so the answer given to EQ9 is No. There are missing data in the dataset, and this will be handled in the estimation process (Yes for EQ10). Some values in the dataset lie at an abnormal distance from the others, so the preference is Yes for EQ11. The estimator does not need the estimation model to have high interpretability, low complexity, high maintainability, or a short build time; therefore, the preferences are No for EQ12, EQ13, EQ14, and EQ15. The estimator does not need the model to accept new data without being regenerated, so the answer for EQ16 is No. The iteration information in the dataset will not be handled in the estimation process (No for EQ17). Domain information will not be used in the estimation, so the answer for EQ18 is No. Size information can be found in the dataset (Yes for EQ19). Finally, the estimator considers the estimation a cross-project one (No for EQ20 and Yes for EQ21).

After entering the estimator responses, the method scores were calculated using the rating values in the decision matrix. Summing all the scores in the relevant method column, the total score for each method was obtained in the last (SUM) row of the table. The answers and the total scores for each estimation method in our example evaluation can be seen in Table 4. The answers were given with respect to the characteristics of the ISBSG dataset and the estimator's assumptions.

According to the decision matrix prepared with our approach, the most suitable effort estimation method, with a score of 6.26, is the Neural Network (NN) method, followed by the Case-Based Reasoning (CBR) method with a score of 6.20.

Table 4. Example Evaluation using the Decision Matrix

Wen et al. [6] showed that NN and CBR are the methods most frequently used with the ISBSG data set. They prepared a list of the "distribution of the studies over the types of ML techniques", and CBR and NN are at the top of that list. Research interest in CBR and NN has increased over the years compared to other methods. Also, these methods are more accurate than others when working with the ISBSG data set. According to the mean magnitude of relative error (MMRE) values examined in the study, NN performed better than all other methods.

Marco et al. [8] systematically gathered information from many studies that examined estimation methods in terms of accuracy. According to their results, the two best MMRE values obtained with the NN method on the ISBSG dataset were 9.5 and 49, while the two best MMRE values for the CBR method on the same dataset were 53 and 52.32. These results indicate that NN achieves better estimation performance on the ISBSG dataset. As in our study, NN is more accurate when a choice is made between NN and CBR.

Venkataiah et al. [23] examined which datasets and which methods are studied together in the literature. As a result of their analysis, they stated that NN is one of the methods most often studied with the ISBSG data set.

The above studies [6,8] review and list the MMRE values of estimation methods from multiple studies. Although some of these studies report that methods other than NN and CBR give more accurate results (e.g., [24,25]), there are also studies whose results support the selection of these methods as suggested by our approach (e.g., [26,27]). Therefore, we can say that the results obtained by our proposed approach in the sample evaluation are partially supported by the results and suggestions of the studies in the latter group.
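For reference, MMRE, the accuracy measure underlying these comparisons, is conventionally defined as the mean of the magnitudes of relative error over all estimated projects. This is the standard definition from the estimation literature, not a formula given in the cited studies; a minimal sketch:

```python
def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error: mean of |actual - estimate| / actual."""
    assert len(actuals) == len(estimates) and all(a > 0 for a in actuals)
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)

# Hypothetical effort values in person-hours; lower MMRE means more accurate.
print(mmre([100.0, 250.0], [90.0, 300.0]))  # (0.10 + 0.20) / 2 = 0.15
```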
Nevertheless, comparing the selection of estimation methods in these studies based only on the resulting MMRE values might remain incomplete, since the estimation process includes many requirements and assumptions other than those related to the dataset, as also considered in our evaluation approach. Accordingly, we need to create further estimation cases, or repeat past estimation cases by applying our questionnaire where possible, to make more meaningful comparisons and to assess the reliability of our evaluation approach. This is left for future work.

5 Discussion

The proposed approach addresses the problem of selecting a suitable software effort estimation method by structuring the information and suggestions of the studies in the literature into a decision matrix. It would be beneficial to expand the scope of the work with the experience gained in effort estimation in the software industry. For this purpose, the opinions of experts working in this field in industry should be added to the expert opinions analyzed within the scope of our study. In this way, in addition to the effects observed in academia, the effects experienced in industry can be reflected in the process of selecting a suitable estimation method. In addition, eight effort estimation methods that are widely referenced in the literature were analyzed within the scope of this study; the scope can similarly be expanded by analyzing the effort estimation methods and features that are commonly used in industry.

The most important factor affecting the selection of the appropriate estimation method is the answers to the estimator questions. Therefore, to answer the questionnaire, it is necessary to have sufficient knowledge of the characteristics of the project and the related data to be included in the estimation. Failure to reflect the project characteristics in the decision matrix through the estimator questions will negatively affect the selection of the estimation method; this may lead to poor estimation through the choice of an unsuitable method. Our approach does not control whether the project characteristics are accurately reflected in the decision matrix; the responsibility in this matter lies with the person who performs the estimation.

The majority of the experts who answered the expert-opinion survey are from academia. In future studies, reaching the experience of people working in this field in industry will increase the value of the expert-opinion survey results.

6 Conclusion

In this study, an approach has been proposed for selecting the most suitable software effort estimation method considering the project characteristics and the needs of the stakeholders. The approach aims to assist users in choosing the most suitable estimation method in the targeted estimation context.

We started this study by identifying the distinctive criteria for software effort estimation methods. To identify these criteria, we used the findings of several literature reviews and followed the approach of a similar study [13]. Then, using the criteria, we prepared an expert-opinion survey to collect the opinions of experts in SEE. The aim of the expert-opinion survey was to enable us to establish a relationship between methods and criteria, in the form of a decision matrix. We calculated the rating values for the questions derived from the criteria using the answers given by eight experts to the survey.
Then, a questionnaire was prepared to be answered by the user (estimator) who wants to perform the estimation. By answering the questionnaire, the user can see the suitability scores of the estimation methods on the decision matrix.

To make our work understandable, we explained it through an example evaluation based on the ISBSG data set, and found that Neural Network and Case-Based Reasoning are the most suitable methods in our estimation context. This method selection is partially supported by the results of studies in the literature, and further studies are needed to validate the results of the evaluation approach.

We think that one of the most important factors determining the success of our study is the number of experts who answer the expert-opinion survey. As future work, we plan to send our survey to experts in industry. We think that an expert-opinion survey answered by more experts will increase the reliability of the decision matrix.

References

1. Shepperd, M., Cartwright, M.: Predicting with sparse data. IEEE Transactions on Software Engineering 27(11), 987-998 (2001)
2. Idri, A., Mbarki, S., Abran, A.: Validating and understanding software cost estimation models based on neural networks. In: Proc. ICTTA 2004, pp. 433-434 (2004). doi:10.1109/ICTTA.2004.1307817
3. Nayebi, F., Abran, A., Desharnais, J.-M.: Automated selection of a software effort estimation model based on accuracy and uncertainty. Artificial Intelligence Research 4(2) (2015). doi:10.5430/air.v4n2p45
4. Sehra, S.K., Brar, Y., Kaur, N.: Multi criteria decision making approach for selecting effort estimation model. International Journal of Computer Applications 39 (2012). doi:10.5120/4783-6989
5. Bansal, A., Kumar, B., Garg, R.: Multi-criteria decision making approach for the selection of software effort estimation model. Management Science Letters 7, 285-296 (2017). doi:10.5267/j.msl.2017.3.003
6. Wen, J., Li, S., Lin, Z., Hu, Y., Huang, C.: Systematic literature review of machine learning based software development effort estimation models. Information and Software Technology 54(1), 41-59 (2012). doi:10.1016/j.infsof.2011.09.002
7. Jorgensen, M., Shepperd, M.: A systematic review of software development cost estimation studies. IEEE Transactions on Software Engineering 33(1), 33-53 (2007). doi:10.1109/TSE.2007.256943
8. Marco, R., Suryana, N., Ahmad, S.S.S.: A systematic literature review on methods for software effort estimation. Journal of Theoretical and Applied Information Technology 97, 434-464 (2019)
9. Idri, A., Amazal, F.A., Abran, A.: Analogy-based software development effort estimation: a systematic mapping and review. Information and Software Technology 58, 206-230 (2015)
10. Bilgaiyan, S., Sagnika, S., Mishra, S., Das, M.N.: A systematic review on software cost estimation in agile software development. Journal of Engineering Science and Technology Review 10(4), 51-64 (2017). doi:10.25103/jestr.104.08
11. Shekhar, S., Kumar, U.: Review of various software cost estimation techniques. International Journal of Computer Applications 141, 31-34 (2016). doi:10.5120/ijca2016909867
12. Chirra, S.M.R., Reza, H.: A survey on software cost estimation techniques. Journal of Software Engineering and Applications 12 (2019). doi:10.4236/jsea.2019.126014
13. Ozakinci, R., Tarhan, A.: An evaluation approach for selecting suitable defect prediction method at early phases. In: Proc. Euromicro Conference on Software Engineering and Advanced Applications (SEAA) (2019). doi:10.1109/SEAA.2019.00040
14. Figueira, J., Greco, S., Ehrgott, M. (eds.): Multiple Criteria Decision Analysis: State of the Art Surveys. International Series in Operations Research & Management Science, vol. 78. Springer (2005). doi:10.1007/b100605
15. Belton, V., Stewart, T.: Multiple Criteria Decision Analysis: An Integrated Approach. Springer (2002). doi:10.1007/978-1-4615-1495-4
16. Saaty, R.W.: The analytic hierarchy process: what it is and how it is used. Mathematical Modelling 9(3-5), 161-176 (1987). doi:10.1016/0270-0255(87)90473-8
17. Hwang, C.L., Yoon, K.: Multiple Attribute Decision Making: Methods and Applications. Springer-Verlag, New York (1981). doi:10.1007/978-3-642-48318-9
18. Brans, J.P., Mareschal, B.: PROMETHEE methods. In: Multiple Criteria Decision Analysis: State of the Art Surveys, pp. 164-189. Springer (2005)
19. Figueira, J., Mousseau, V., Roy, B.: ELECTRE methods. In: Multiple Criteria Decision Analysis: State of the Art Surveys. International Series in Operations Research & Management Science, vol. 78. Springer, New York (2005)
20. Multi-Criteria Decision Analysis, https://projects.ncsu.edu/nrli/decision-making/MCDA.php
21. Vera, T., Ochoa, S., Perovich, D.: Survey of software development effort estimation taxonomies (2018). doi:10.13140/RG.2.2.14599.29601
22. ISBSG, International Software Benchmarking Standards Group, http://www.isbsg.org
23. Venkataiah, V., Mohanty, R., Nagaratna, M.: Review on intelligent and soft computing techniques to predict software cost estimation. International Journal of Applied Engineering Research 12, 12665-12681 (2017)
24. Moosavi, S.H.S., Bardsiri, V.K.: Satin bowerbird optimizer: a new optimization algorithm to optimize ANFIS for software development effort estimation. Engineering Applications of Artificial Intelligence 60 (2017). doi:10.1016/j.engappai.2017.01.006
25. Pospieszny, P., Czarnacka-Chrobot, B., Kobyliński, A.: An effective approach for software project effort and duration estimation with machine learning algorithms. Journal of Systems and Software 137 (2017). doi:10.1016/j.jss.2017.11.066
26. Azzeh, M., Neagu, D., Cowling, P.: Analogy-based software effort estimation using fuzzy numbers. Journal of Systems and Software 84, 270-284 (2011). doi:10.1016/j.jss.2010.09.028
27. Azzeh, M., Neagu, D., Cowling, P.: Fuzzy grey relational analysis for software effort estimation. Empirical Software Engineering 15, 60-90 (2010). doi:10.1007/s10664-009-9113-0