Mathematical Modeling of Effort of Mobile Application Development in a Planning Phase

Sergiy Prykhodko1[0000-0002-2325-018X], Natalia Prykhodko1[0000-0002-3554-7183], Kateryna Knyrik1[0000-0001-8434-4035] and Andrii Pukhalevych1[0000-0002-8827-3251]

1 Admiral Makarov National University of Shipbuilding, Mykolaiv, 54025, Ukraine
sergiy.prykhodko@nuos.edu.ua

Abstract. Mathematical modeling of the effort of mobile application (app) development by a non-linear regression model using a multivariate normalizing transformation is performed. A three-factor non-linear regression model to estimate the effort (in man-hours) of developing mobile apps in a planning phase is constructed on the basis of the Johnson four-variate transformation for the SB family. This model is built around the Requirement Analysis Document (RAD) variables: number of screens, number of functions, and number of files. The constructed model is compared with the linear regression model and with non-linear regression models based on univariate normalizing transformations. In comparison with the other regression models, it has a larger multiple coefficient of determination, a smaller value of the mean magnitude of relative error, a larger value of the percentage of prediction, and smaller widths of the confidence and prediction intervals of regression. Such a good result for the constructed model may be explained by the better multivariate normalization of the non-Gaussian data set that was used to build the three-factor non-linear regression model based on the Johnson four-variate transformation for the SB family.

Keywords: Mathematical Modeling, Effort Estimation, Mobile Application, Non-linear Regression Model, Prediction Interval.

1 Introduction

The problem of estimating software development effort is one of the important ones in the planning phase, which is the first of the five phases of the software development lifecycle [1].
Today, this problem is solved, among other ways, by means of mathematical modeling. One of the best-known mathematical models for estimating software development effort is COCOMO II, but its use for mobile apps presents some difficulties. First, the main factor of this model is the size of the software, which is still unknown in the planning phase. Second, COCOMO II is a non-linear regression equation built on a univariate transformation in the form of a decimal logarithm, which does not always allow for proper normalization of the data. In addition, the regression equation does not include random variables [2-4], and neither does the effort estimation model based on the Function Points Analysis method [5]. And, as is known, the effort is a random variable. Third, although mobile app development is similar to web app development and has its roots in more traditional software development, one significant difference is that mobile apps are often written specifically to take advantage of the unique features that a particular mobile device offers [6].

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Therefore, over the last decade, various models for forecasting the effort of developing mobile apps in a planning phase, including regression ones [7, 8], were constructed. It is regression models that describe an effort as a random variable. And since the effort distribution is not Gaussian, it is necessary to use non-linear regression models, and their construction should be based on multivariate normalizing transformations [9].

2 Model construction

At first, the three-factor linear regression model to estimate the effort Y (in man-hours) of developing mobile apps in a planning phase is constructed for the four-dimensional data set from Table 1.
This model is built around the Requirement Analysis Document (RAD) variables: number of screens X1, number of functions X2, and number of files X3.

Table 1. The data set and MD2 values.

No   Y   X1  X2  X3   MD2  |  No   Y   X1  X2  X3   MD2
 1  192   5   4   3   0.66 |  20  198   6   5   4   0.50
 2  272   5   4   3   2.31 |  21  146   4   3   2   1.18
 3  288   3   2   2   6.43 |  22  191   6   6   5   0.96
 4  116   6   6   4   0.95 |  23   99   3   3   2   1.47
 5  372   5   5   4   6.82 |  24  382  11  12   9   8.35
 6  504   9   8   6  10.55 |  25  270   9  10   8   4.84
 7   28   6   7   2   7.11 |  26  282  12   7   3   7.16
 8  176   6   7   3   4.53 |  27  213  10   5   2   6.14
 9  364  10  11   9   6.90 |  28  322  11   7   5   4.32
10  120  10  10   5   6.76 |  29  290  10   6   4   3.67
11   22   6   5   4   6.72 |  30  223   7   7   6   1.69
12  224  11   6   2   7.08 |  31  241   5   5   6   4.95
13   24   2   2   1   3.05 |  32   87   5   5   2   1.53
14  200  11   7   4   4.88 |  33   36   3   3   1   2.24
15  160   6   6   7   9.41 |  34  216   8   7   5   0.54
16  120   2   2   1   2.86 |  35   67   5   6   2   4.26
17   96   4   4   1   2.60 |  36  115   7   7   3   2.59
18  202   6   5   4   0.49 |  37   36   2   2   1   2.84
19  145   4   3   2   1.17 |  38   98   3   3   2   1.47

The data set from Table 1 was obtained by combining two data sets: one for 17 mobile apps from [5] and one for 21 mobile apps (rows 18 to 38). Table 1 also contains the values of the squared Mahalanobis distance (MD2). We use the technique based on the squared Mahalanobis distance [10] for detecting outliers in the data from Table 1. There are no outliers in the data from Table 1 for the 0.005 significance level, since for all data rows the MD2 values are smaller than the quantile of the Chi-Square distribution, which equals 14.86.

Following [2-4], the three-factor linear regression model has the form

Y = b̂0 + b̂1·X1 + b̂2·X2 + b̂3·X3 + εx,   (1)

where εx is a Gaussian random variable which defines the residuals, εx ~ N(0, σx); the estimators for the parameters of model (1) are: b̂0 = 0.26513, b̂1 = 0.23116, b̂2 = -0.00082, b̂3 = 0.08374. The parameters of model (1) were estimated by the least squares method.
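The outlier screening described above can be sketched in Python; the function name is illustrative, and the 14.86 threshold is the 0.995 quantile of the chi-square distribution with 4 degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(data, alpha=0.005):
    """Squared Mahalanobis distance of each row of `data` and a boolean
    mask of rows exceeding the chi-square quantile (outlier candidates)."""
    X = np.asarray(data, dtype=float)
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))     # sample covariance
    md2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)  # MD^2 per row
    threshold = chi2.ppf(1 - alpha, df=X.shape[1])       # 14.86 for df = 4
    return md2, md2 > threshold
```

A known identity can be used as a sanity check: with the sample covariance (ddof = 1), the MD2 values of N rows of p-dimensional data sum to (N - 1)·p.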
To judge the prediction accuracy of the linear regression model (1), we first used the well-known standard metrics of prediction accuracy: the multiple coefficient of determination R2, the mean magnitude of relative error MMRE, and the percentage of prediction at the level of magnitude of relative error (MRE) equal to 0.25, PRED(0.25) [11, 12]. The values of R2, MMRE, and PRED(0.25) equal 0.5449, 0.5713, and 0.5789, respectively, for the linear regression model (1). These values indicate poor prediction results of the regression model (1).

Besides, the null hypothesis that the observed frequency distribution of the residuals for the linear regression model (1) is the same as the normal distribution was tested by Pearson's chi-squared test. There is reason to reject this null hypothesis, since the chi-squared test statistic value of 13.33 is higher than the critical value of the chi-square, which equals 7.81 for 3 degrees of freedom and the 0.05 significance level. Also, for the distribution of residuals in the linear regression model (1), the estimators of skewness and kurtosis equal 0.78 and 5.69, respectively, whereas for the Gaussian distribution the values of skewness and kurtosis equal 0 and 3, respectively.

As is known [2], one of the underlying assumptions that justify the use of linear regression models is the normality of the distribution of residuals. But this assumption is not valid for the linear regression model (1). This leads to the need to construct a multiple non-linear regression model to estimate the effort of developing mobile apps in a planning phase.

The three-factor non-linear regression model to estimate the effort of developing mobile apps in a planning phase was constructed based on the Johnson four-variate transformation for the SB family according to [9].
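The residual-normality check described above can be sketched as follows; the use of six equiprobable bins (giving 3 degrees of freedom after two estimated parameters) is our assumption, since the paper does not state its binning.

```python
import numpy as np
from scipy import stats

def chi2_normality(residuals, n_bins=6, alpha=0.05):
    """Pearson chi-squared goodness-of-fit test of residuals against a
    fitted normal distribution, using bins equiprobable under that fit."""
    r = np.asarray(residuals, dtype=float)
    mu, sigma = r.mean(), r.std(ddof=1)
    # Bin edges equiprobable under N(mu, sigma); outer edges made finite
    edges = stats.norm.ppf(np.linspace(0.0, 1.0, n_bins + 1), mu, sigma)
    edges[0], edges[-1] = r.min() - 1.0, r.max() + 1.0
    observed, _ = np.histogram(r, bins=edges)
    expected = len(r) / n_bins
    statistic = ((observed - expected) ** 2 / expected).sum()
    df = n_bins - 1 - 2                         # two parameters estimated
    critical = stats.chi2.ppf(1.0 - alpha, df)  # 7.81 for df = 3, alpha = 0.05
    return statistic, critical, statistic > critical  # True -> reject normality
```

With n_bins = 6 this reproduces the 3 degrees of freedom and the 7.81 critical value used in the text.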
The three-factor non-linear regression model has the form [9]

Y = φ̂Y + λ̂Y / {1 + exp[-(ẐY + ε - γ̂Y)/η̂Y]},   (2)

where ε is a Gaussian random variable which defines the residuals, ε ~ N(0, 1); ẐY is the prediction result of the linear regression equation for the normalized data, which were transformed using the Johnson four-variate transformation for the SB family:

ẐY = b̂0 + b̂1·Z1 + b̂2·Z2 + b̂3·Z3;   Zj = γ̂j + η̂j·ln[(Xj - φ̂j)/(φ̂j + λ̂j - Xj)],   j = 1, 2, 3;

the estimators for the parameters of the Johnson four-variate transformation for the SB family are: γ̂Y = 5.69898, γ̂1 = 0.524119, γ̂2 = 0.776179, γ̂3 = 0.540973, η̂Y = 2.40219, η̂1 = 0.743879, η̂2 = 0.79545, η̂3 = 0.534447, φ̂Y = -114.5452, φ̂1 = 1.7242, φ̂2 = 1.6885, φ̂3 = 0.90, λ̂Y = 3328.564, λ̂1 = 12.3743, λ̂2 = 12.091, λ̂3 = 8.30648; the estimators for the parameters of the linear regression equation for the normalized data are: b̂0 = 0, b̂1 = 0.808152, b̂2 = -0.928296, b̂3 = 0.854262. The parameters of the linear regression equation for the normalized data were estimated by the least squares method.

The values of R2, MMRE, and PRED(0.25) equal 0.5789, 0.4933, and 0.5263, respectively, for the non-linear regression model (2). These values indicate poor prediction results of the non-linear regression model (2), approximately the same as for the linear regression model (1).

Because of this, the method [13] for improving non-linear regression models was further used to construct a non-linear regression model to estimate the effort of developing mobile apps in a planning phase. The method [13] consists of four stages. In the first stage, a set of multivariate non-Gaussian data is normalized using a multivariate normalizing transformation. After that, the normalized data are checked for outliers, and, if any are detected, the outliers are cut off. The method based on the squared Mahalanobis distance [14] is used for outlier detection.
In the second stage, the non-linear regression model is constructed based on the multivariate normalizing transformation [9]. In the third stage, the prediction intervals of the non-linear regression are built according to [9]. And finally, in the fourth stage, it is checked whether, among the data for which the non-linear regression model was built, there are any that go beyond the found bounds of the prediction interval. If such outliers are detected, they are cut off, and all the stages are repeated, starting with the first, for the new data.

For the non-linear regression model (2) with the parameter estimators obtained from the data in Table 1 for the 38 mobile apps, it turned out that the Y values of three apps (5, 6, and 11) go beyond the prediction interval. In Table 2, the lower bound of the prediction interval obtained in the first iteration is denoted as LB1, and the upper bound as UB1. In the second iteration, the data of the three mobile apps (5, 6, and 11) were cut off, and the data of the remaining 35 apps were used for model construction. For model (2) with the parameter estimators obtained from the data in Table 1 for the 35 mobile apps, it turned out that the value of Y for app 17 goes beyond the prediction interval. There were four such iterations, after which 30 mobile apps remained (1, 3, 4, 7, 9, 10, 12-14, 18-38). In the fifth iteration, there were no outliers; the repetition of the stages was completed, and the non-linear regression model (2) was constructed using the data of the 30 apps. In Table 2, the lower bound of the prediction interval obtained in the fifth iteration is denoted as LB5, and the upper bound as UB5. A dash (-) marks the rows (i.e., mobile apps) whose data were detected as outliers and excluded before the relevant iteration (i.e., iteration 5).

Table 2. Lower and upper bounds of the non-linear regression prediction interval before and after outlier cutoff.
No    Y    LB1    UB1    LB5    UB5  |  No    Y    LB1    UB1    LB5    UB5
 1   192   60.5  377.3  131.4  228.5 |  20   198   70.8  402.2  148.4  248.2
 2   272   60.5  377.3    -      -   |  21   146   49.2  353.3  115.3  209.6
 3   288   88.6  524.1  218.1  332.6 |  22   191   66.2  392.5  139.8  238.7
 4   116   51.1  352.9  105.7  195.7 |  23    99   24.7  290.0   73.0  149.9
 5   372   54.5  362.4    -      -   |  24   382  140.1  624.9  317.2  402.2
 6   504   90.1  453.3    -      -   |  25   270   93.4  477.2  202.6  309.0
 7    28   -0.7  232.5   25.1   70.9 |  26   282  104.6  532.5  223.9  332.1
 8   176   18.9  277.8    -      -   |  27   213   78.5  452.7  158.6  265.3
 9   364  157.4  665.2  331.6  411.5 |  28   322  126.8  560.3  257.5  355.1
10   120   48.7  363.8   97.1  188.5 |  29   290  109.1  513.1  219.5  322.5
11    22   70.8  402.2    -      -   |  30   223   78.6  425.2  164.1  266.7
12   224   73.5  447.0  148.5  255.8 |  31   241   84.9  449.3  194.4  299.7
13    24  -23.9  170.9   15.3   51.1 |  32    87   17.1  267.3   49.2  111.2
14   200  106.5  511.6  214.3  318.8 |  33    36  -29.0  153.6   15.2   49.6
15   160  100.6  490.0    -      -   |  34   216   77.1  418.6  153.2  253.9
16   120  -23.9  170.9    -      -   |  35    67    1.4  233.2   29.0   77.0
17    96  -33.4  149.2    -      -   |  36   115   31.0  306.4   64.9  137.9
18   202   70.8  402.2  148.4  248.2 |  37    36  -23.9  170.9   15.3   51.1
19   145   49.2  353.3  115.3  209.6 |  38    98   24.7  290.0   73.0  149.9

In the fifth iteration, for the data of the 30 mobile apps, the estimators of the parameters of the Johnson four-variate transformation for the SB family are: γ̂Y = 0.58590, γ̂1 = 0.316749, γ̂2 = 0.86299, γ̂3 = 0.48606, η̂Y = 1.01714, η̂1 = 0.63606, η̂2 = 0.86557, η̂3 = 0.612856, φ̂Y = -12.7422, φ̂1 = 1.84255, φ̂2 = 1.5560, φ̂3 = 0.73913, λ̂Y = 500.266, λ̂1 = 11.3796, λ̂2 = 13.2488, λ̂3 = 8.52637; the estimators for the parameters of the linear regression equation for the normalized data are: b̂0 = 0, b̂1 = 1.1190, b̂2 = -1.3765, b̂3 = 1.2027.

The values of R2, MMRE, and PRED(0.25) equal 0.965, 0.117, and 0.867, respectively, for the non-linear regression model (2). These values indicate good prediction results of the non-linear regression model (2) with the parameter estimators obtained from the data in Table 1 for the 30 mobile apps.
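With the fifth-iteration parameter estimates listed above, a point estimate of effort by model (2) (taking ε = 0) can be sketched as follows; the dictionary layout and function names are ours, not the paper's.

```python
import numpy as np

# Fifth-iteration estimates from the text: (gamma, eta, phi, lam) per variable
PARAMS = {
    'Y':  (0.58590, 1.01714, -12.7422, 500.266),
    'X1': (0.316749, 0.63606, 1.84255, 11.3796),
    'X2': (0.86299, 0.86557, 1.5560, 13.2488),
    'X3': (0.48606, 0.612856, 0.73913, 8.52637),
}
B = (0.0, 1.1190, -1.3765, 1.2027)  # b0, b1, b2, b3 for the normalized data

def sb_forward(x, gamma, eta, phi, lam):
    """Johnson SB normalization: Z = gamma + eta*ln((X - phi)/(phi + lam - X))."""
    return gamma + eta * np.log((x - phi) / (phi + lam - x))

def sb_inverse(z, gamma, eta, phi, lam):
    """Inverse SB transformation, as in model (2)."""
    return phi + lam / (1.0 + np.exp(-(z - gamma) / eta))

def estimate_effort(x1, x2, x3):
    """Point estimate of effort (man-hours) by model (2), eps = 0."""
    z = [sb_forward(x, *PARAMS[k]) for x, k in ((x1, 'X1'), (x2, 'X2'), (x3, 'X3'))]
    z_y = B[0] + B[1] * z[0] + B[2] * z[1] + B[3] * z[2]
    return sb_inverse(z_y, *PARAMS['Y'])
```

By construction, any estimate lies strictly inside the SB support (φ̂Y, φ̂Y + λ̂Y), i.e., between about -12.7 and 487.5 man-hours.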
Following [9], appropriate equations were constructed to determine the lower and upper bounds of the non-linear regression prediction intervals:

YPI = ψY⁻¹{ẐY ± t(α/2, ν)·SZY·[1 + 1/N + zX^T·(ZX^T·ZX)⁻¹·zX]^(1/2)},   (3)

where ψY is the first component of the vector of the normalizing transformation ψ = {ψY, ψ1, ψ2, ..., ψk}^T; k is the number of factors (regressors or independent variables); t(α/2, ν) is the quantile of Student's t-distribution with α/2 significance level and ν degrees of freedom, ν = N - k - 1; ZX is the matrix of centered regressors that contains the values of the normalized data Z1i - Z̄1, Z2i - Z̄2, ..., Zki - Z̄k; zX is the vector with components Z1i - Z̄1, Z2i - Z̄2, ..., Zki - Z̄k for the i-th row; S²ZY = (1/ν)·Σ(ZYi - ẐYi)², i = 1, ..., N; and ZX^T·ZX is the k×k matrix

            | SZ1Z1  SZ1Z2  ...  SZ1Zk |
ZX^T·ZX =   | SZ1Z2  SZ2Z2  ...  SZ2Zk |
            |  ...    ...   ...   ...  |
            | SZ1Zk  SZ2Zk  ...  SZkZk |

where SZqZr = Σ(Zqi - Z̄q)(Zri - Z̄r), i = 1, ..., N, q, r = 1, 2, ..., k. In our case, k = 3.

In the fifth iteration, for the data of the 30 mobile apps normalized by the Johnson four-variate transformation for the SB family, the 3×3 matrix is

            | 29.8  25.5  19.1 |
ZX^T·ZX =   | 25.5  30.2  24.5 |
            | 19.1  24.5  29.7 |

3 Comparison of models

For comparison of model (2) with other models, a linear regression model and non-linear regression models on the basis of the univariate decimal logarithm transformation (Log10) and the Johnson univariate transformation for the SB family were also constructed for the data from Table 1 of the 30 mobile apps. The three-factor linear regression model for the data from Table 1 of the 30 apps has the form

Y = 40.250 + 28.973·X1 - 41.798·X2 + 50.665·X3 + εx
(4)

The three-factor non-linear regression model constructed on the basis of the decimal logarithm transformation for the data from Table 1 of the 30 apps is

Y = 10^(b̂0 + εx) · X1^b̂1 · X2^b̂2 · X3^b̂3,   (5)

where the estimators for the parameters are: b̂0 = 1.73898, b̂1 = 1.6687, b̂2 = -2.1116, b̂3 = 1.30125.

The three-factor non-linear regression model based on the Johnson univariate transformation for the SB family has the form (2), only with the following parameter estimators: γ̂Y = 0.25204, γ̂1 = 0.10255, γ̂2 = 0.49345, γ̂3 = 0.61963, η̂Y = 0.58192, η̂1 = 0.51359, η̂2 = 0.63352, η̂3 = 0.58967, φ̂Y = 19.9286, φ̂1 = 1.90, φ̂2 = 1.81688, φ̂3 = 0.90, λ̂Y = 370.175, λ̂1 = 10.20, λ̂2 = 10.6468, λ̂3 = 8.6277; b̂0 = 0, b̂1 = 0.60292, b̂2 = -0.80179, b̂3 = 1.1148. The parameters of the Johnson transformation for the SB family were estimated by the maximum likelihood method.

The values of R2, MMRE, and PRED(0.25) equal, respectively, 0.838, 0.237, and 0.733 for the linear regression model (4); 0.789, 0.206, and 0.733 for model (5); and 0.878, 0.190, and 0.767 for model (2) with the parameter estimators of the Johnson univariate transformation. The values of R2, MMRE, and PRED(0.25) for model (2) with the parameter estimators of the Johnson four-variate transformation, which equal 0.965, 0.117, and 0.867, respectively, are better in comparison with all the previous models.

The null hypothesis that the distribution of residuals for the linear regression model (4) is the same as the normal distribution was tested by Pearson's chi-squared test. There is reason to reject this null hypothesis, since the chi-squared test statistic value of 10.78 is higher than the critical value of the chi-square, which equals 7.81 for 3 degrees of freedom and the 0.05 significance level.
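For reference, the three accuracy criteria compared above (R2, MMRE, and PRED(0.25)) can be computed as in the following sketch; the function name is illustrative.

```python
import numpy as np

def prediction_metrics(y_actual, y_predicted, level=0.25):
    """Multiple coefficient of determination R^2, mean magnitude of
    relative error MMRE, and percentage of prediction PRED(level)."""
    y = np.asarray(y_actual, dtype=float)
    y_hat = np.asarray(y_predicted, dtype=float)
    r2 = 1.0 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    mre = np.abs(y - y_hat) / y       # magnitude of relative error per app
    mmre = mre.mean()
    pred = (mre <= level).mean()      # share of estimates with MRE <= level
    return r2, mmre, pred
```

For example, for actual efforts (100, 200, 300) and estimates (110, 190, 310), all three relative errors are below 0.25, so PRED(0.25) = 1.0.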
Also, for the distribution of residuals in the linear regression model (4), the estimators of skewness and kurtosis equal 1.52 and 7.73, respectively. There is no reason to reject the null hypothesis that the distribution of residuals for the non-linear regression models (2) and (5) is the same as the normal distribution, since the chi-squared test statistic values are less than the critical value of the chi-square, which equals 7.81. The chi-squared test statistic values equal 4.78, 2.91, and 2.30 for the distribution of residuals in the non-linear regression model (5), model (2) with the parameter estimators of the Johnson univariate transformation, and model (2) with the parameter estimators of the Johnson four-variate transformation, respectively. For the distribution of residuals in the non-linear regression models (2) and (5), the estimators of skewness and kurtosis are close to 0 and 3, respectively. Only the estimator of kurtosis equals 5.39 for the distribution of residuals in the non-linear regression model (2) with the parameter estimators of the Johnson univariate transformation for the SB family.

The lower (LB) and upper (UB) bounds of the linear regression and non-linear regression prediction intervals were also determined by (3) on the basis of the decimal logarithm transformation and Johnson's univariate and four-variate transformations for a significance level of 0.05. These bounds are shown in Table 3.

Table 3. Lower and upper bounds of prediction intervals for regressions.
            linear regression      Log10          Johnson univariate   Johnson four-variate
No    Y      LB      UB          LB      UB        LB       UB           LB       UB
 1   192    80.7   259.1       104.3   310.1      68.5    302.5        131.4    228.5
 3   288    52.8   237.0       108.0   354.1      95.9    352.6        218.1    332.6
 4   116    77.3   254.6        87.5   259.1      71.0    306.0        105.7    195.7
 7    28   -75.2   120.8        24.4    79.7      28.3    155.6         25.1     70.9
 9   364   229.5   422.9       159.5   499.1     269.4    383.7        331.6    411.5
10   120    69.9   260.7        91.9   280.6      70.5    312.2         97.1    188.5
12   224   113.5   305.5        93.0   303.6      60.0    299.9        148.5    255.8
13    24   -26.6   157.1        22.6    71.9      21.6     59.9         15.3     51.1
14   200   176.6   361.5       171.2   522.3     129.0    355.4        214.3    318.8
18   202   118.6   296.9       128.4   381.5      89.1    327.4        148.4    248.2
19   145    42.2   222.0        77.3   233.0      49.7    262.8        115.3    209.6
20   198   118.6   296.9       128.4   381.5      89.1    327.4        148.4    248.2
21   146    42.2   222.0        77.3   233.0      49.7    262.8        115.3    209.6
22   191   127.0   306.3       116.3   348.6      99.1    336.5        139.8    238.7
23    99    12.7   193.5        47.7   144.5      40.2    227.7         73.0    149.9
24   382   215.8   410.9       155.4   487.5     204.8    378.0        317.2    402.2
25   270   194.1   382.5       141.0   437.2     179.3    370.8        202.6    309.0
26   282   151.5   343.2       134.0   421.8     168.9    375.6        223.9    332.1
27   213   127.5   317.1       116.6   380.3      59.9    294.2        158.6    265.3
28   322   226.4   413.0       228.3   700.0     174.3    369.1        257.5    355.1
29   290   189.5   374.2       201.4   619.1     125.1    352.3        219.5    322.5
30   223   163.9   345.1       137.2   414.2     126.6    352.8        164.1    266.7
31   241   184.3   376.0       156.0   490.8     143.7    361.7        194.4    299.7
32    87   -13.1   168.0        38.1   115.2      33.9    191.9         49.2    111.2
33    36   -38.8   143.7        18.9    60.0      21.0     48.9         15.2     49.6
34   216   143.9   321.6       136.4   404.8     105.4    340.0        153.2    253.9
35    67   -58.8   130.2        25.3    80.4      29.4    162.2         29.0     77.0
36   115    10.6   194.3        55.7   168.0      45.5    249.8         64.9    137.9
37    36   -26.6   157.1        22.6    71.9      21.6     59.9         15.3     51.1
38    98    12.7   193.5        47.7   144.5      40.2    227.7         73.0    149.9

Note that the width of the non-linear regression prediction interval based on the Johnson four-variate transformation is less than after the Johnson univariate transformation for 29 of the 30 data rows (except the one with number 25), smaller than after the decimal log transformation, and less compared with the linear
regression prediction interval width for all 30 data rows. Approximately the same results were obtained for the confidence intervals of the regressions. Herewith, a confidence interval of the non-linear regression is defined by (3), with the only difference that the unity term (the 1 in the sum under the square root) is absent.

Such good prediction results for the constructed model may be explained by the better multivariate normalization of the non-Gaussian data set that was used to build the three-factor non-linear regression model based on the Johnson four-variate transformation for the SB family. The measures of multivariate skewness β1 and kurtosis β2 [15] allow one to test two hypotheses that are compatible with the assumption of multivariate normality; in our case of four variables, these are β1,4 and β2,4, whose values for a four-variate normal distribution equal 0 and 24, respectively. The estimators of multivariate skewness and kurtosis equal 8.42, 5.44, 12.86, 6.82 and 26.78, 23.08, 33.57, 25.71 for the data of the 30 apps from Table 1, the data normalized on the basis of the decimal logarithm transformation, the Johnson univariate transformation, and the Johnson four-variate transformation, respectively. The values of these estimators indicate that the necessary condition for multivariate normality is approximately fulfilled for the data normalized on the basis of the decimal logarithm and the Johnson four-variate transformations. Multivariate normality was also tested by MD2 [16]. The multivariate normality condition is fulfilled only for the data normalized on the basis of the decimal logarithm and the Johnson four-variate transformations, since for all 30 rows of the normalized data the MD2 values are smaller than the quantile of the Chi-Square distribution, which equals 14.86 for the 0.005 significance level.

4 Conclusions

Mathematical modeling of the effort of mobile app development by a non-linear regression model using a multivariate normalizing transformation is performed.
A three-factor non-linear regression model to estimate the effort of developing mobile apps in a planning phase is constructed for the first time on the basis of the Johnson four-variate transformation for the SB family. This model, in comparison with the other regression models (both linear and non-linear), has a larger multiple coefficient of determination, a smaller value of the mean magnitude of relative error, a larger value of the percentage of prediction, and smaller widths of the confidence and prediction intervals of regression. The example of the construction of the three-factor non-linear regression model confirms the efficiency of the method for improving non-linear regression models on the basis of multivariate normalizing transformations, the squared Mahalanobis distance, and prediction intervals. Prospects for further research may include the application of other data sets to construct multiple non-linear regression models for estimating the effort of developing mobile apps in a planning phase.

References

1. Zhu, H.: Software design methodology: From principles to architectural styles. Butterworth-Heinemann, Elsevier, Oxford (2005).
2. Ryan, T.P.: Modern regression methods. 2nd edn. John Wiley & Sons, New York (2008).
3. Chatterjee, S., Simonoff, J.S.: Handbook of regression analysis. John Wiley & Sons, New York (2013).
4. Draper, N.R., Smith, H.: Applied regression analysis. John Wiley & Sons, New York (1998).
5. Arnuphaptrairong, T., Suksawasd, W.: An empirical validation of mobile application effort estimation models. In: Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS 2017), pp. 697-701. Newswood Limited, Hong Kong (2017).
6. Rouse, M.: Mobile application development, https://searchmicroservices.techtarget.com/definition/mobile-application-development, last accessed 2019/10/12.
7.
Francese, R., Gravino, C., Risi, M., Scanniello, G., Tortora, G.: On the use of requirements measures to predict software project and product measures in the context of Android mobile apps: A preliminary study. In: Proceedings of the 41st Euromicro Conference on Software Engineering and Advanced Applications (SEAA 2015), pp. 357-364. IEEE Computer Society, Funchal (2015). doi: 10.1109/SEAA.2015.22
8. Shahwaiz, S.A., Malik, A.A., Sabahat, N.: A parametric effort estimation model for mobile apps. In: Proceedings of the 19th International Multi-Topic Conference (INMIC 2016), pp. 1-6. IEEE, Islamabad (2016). doi: 10.1109/INMIC.2016.7840114
9. Prykhodko, N.V., Prykhodko, S.B.: Constructing the non-linear regression models on the basis of multivariate normalizing transformations. Electronic Modeling 6(40), 101-110 (2018). doi: 10.15407/emodel.40.06.101
10. Johnson, R.A., Wichern, D.W.: Applied multivariate statistical analysis. Pearson Prentice Hall (2007).
11. Foss, T., Stensrud, E., Kitchenham, B., Myrtveit, I.: A simulation study of the model evaluation criterion MMRE. IEEE Transactions on Software Engineering 11(29), 985-995 (2003).
12. Port, D., Korte, M.: Comparative studies of the model evaluation criterions MMRE and PRED in software cost estimation research. In: Proceedings of the 2nd ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 51-60. ACM, New York (2008).
13. Prykhodko, S.B., Prykhodko, N.V.: A method for improving non-linear regression models based on multivariate normalizing transformations. In: Proceedings of the 3rd International Conference on Applied Scientific and Technical Research, p. 20. Symfoniya fortu, Ivano-Frankivsk (2019). (in Ukrainian).
14. Prykhodko, S., Prykhodko, N., Makarova, L., Pukhalevych, A.: Application of the squared Mahalanobis distance for detecting outliers in multivariate non-Gaussian data.
In: Proceedings of the 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), pp. 962-965. IEEE, Lviv-Slavske (2018). doi: 10.1109/TCSET.2018.8336353
15. Mardia, K.V.: Measures of multivariate skewness and kurtosis with applications. Biometrika 3(57), 519-530 (1970). doi: 10.1093/biomet/57.3.519
16. Olkin, I., Sampson, A.R.: Multivariate analysis: Overview. In: Smelser, N.J., Baltes, P.B. (eds.) International encyclopedia of the social & behavioral sciences. 1st edn. Elsevier, Pergamon (2001).