Information Technology and Nanotechnology (ITNT-2016)

Mathematical Modeling

MULTI-CRITERIA OPTIMIZATION BASED ON THE REGRESSION EQUATION SYSTEMS IDENTIFICATION

A.P. Kotenko¹˒², D.A. Pshenina²
¹ Samara National Research University, Samara, Russia
² Samara State Technical University, Samara, Russia

Abstract. We consider the problem of multi-criteria optimization with conflicting criteria. An example is complex chemical production with random parameters. The dependence of the targets on the control actions is described by regression equations. Investigating systems of interdependent regression equations requires describing all possible variants of their properties. The most productive case is exact identification of the system parameters: it makes it possible to find the optimal values of the control parameters for manufacturing products of assured quality.

Keywords: multivariate optimization, statistical parameters, system of linear regressions, conflicting criteria.

Citation: Kotenko AP, Pshenina DA. Multi-criteria optimization based on the identification systems of regression equations. CEUR Workshop Proceedings, 2016; 1638: 593-599. DOI: 10.18287/1613-0073-2016-1638-593-599

Basic notation system

Let the dependence of the optimization criteria $y_1, y_2, \dots, y_n$ on the control factors $x_1, x_2, \dots, x_m$ be expressed by a system of linear regressions. In the structural form of the model (SFM) [1,2]

$$
\begin{cases}
y_1 = a_{12} y_2 + a_{13} y_3 + \dots + a_{1n} y_n + b_{11} x_1 + b_{12} x_2 + \dots + b_{1m} x_m + \varepsilon_1, \\
y_2 = a_{21} y_1 + a_{23} y_3 + \dots + a_{2n} y_n + b_{21} x_1 + b_{22} x_2 + \dots + b_{2m} x_m + \varepsilon_2, \\
\dots \\
y_n = a_{n1} y_1 + a_{n2} y_2 + \dots + a_{n,n-1} y_{n-1} + b_{n1} x_1 + b_{n2} x_2 + \dots + b_{nm} x_m + \varepsilon_n,
\end{cases}
\qquad (1)
$$

we associate the endogenous variables with the observation vectors $y_i = (y_{i1}, y_{i2}, \dots, y_{iN})^T \in \mathbb{R}^N$, $i = 1, \dots, n$, and the exogenous variables with the observation vectors $x_j = (x_{j1}, x_{j2}, \dots, x_{jN})^T \in \mathbb{R}^N$, $j = 1, \dots, m$. The SFM corresponds to the reduced form of the model (RFM) [1,2]

$$
\tilde y_1 = \beta_{11} x_1 + \beta_{12} x_2 + \dots + \beta_{1m} x_m, \quad
\tilde y_2 = \beta_{21} x_1 + \beta_{22} x_2 + \dots + \beta_{2m} x_m, \quad \dots, \quad
\tilde y_n = \beta_{n1} x_1 + \beta_{n2} x_2 + \dots + \beta_{nm} x_m
\qquad (2)
$$

with regression values $\tilde y_i = (\tilde y_{i1}, \tilde y_{i2}, \dots, \tilde y_{iN})^T \in \mathbb{R}^N$ and reduced coefficients $\beta_{ij}$ found by the Ordinary Least Squares (OLS) method.

The vectors of regression values of the endogenous variables are obtained by the orthogonal projection Pr of the observation vectors of the endogenous variables onto the linear span [1] $L(x_1, x_2, \dots, x_m) \subset \mathbb{R}^N$, of dimension $\dim L(x_1, x_2, \dots, x_m) = m$, of the vectors of observed values of the predefined variables:

$$
\tilde y_i = \Pr\bigl(y_i;\ L(x_1, x_2, \dots, x_m)\bigr) \in L(x_1, x_2, \dots, x_m).
$$

Since the structural coefficients are functions of the solutions of the reduced system of linear algebraic equations (SLAE), the solvability of the identification problem for SFM (1) is determined by the ranks of the system matrix and of the augmented matrix of the system. Accordingly, the following cases are possible:

• exactly one solution (the equation of the system is exactly identifiable by indirect OLS);
• no solution (the equation is overidentified, and two-step OLS is applicable);
• infinitely many solutions (the equation of the SFM system is unidentifiable).

Replacing the sample values of the endogenous variables of SFM (1) by the regression values from RFM (2), we obtain, in the linear span $L(x_1, x_2, \dots, x_m)$ of the sample-value vectors of the predefined variables of system (1), the following independent linear equations, which make it possible to identify the structural coefficients:

$$
\begin{cases}
\tilde y_1 = a_{12} \tilde y_2 + a_{13} \tilde y_3 + \dots + a_{1n} \tilde y_n + b_{11} x_1 + b_{12} x_2 + \dots + b_{1m} x_m, \\
\tilde y_2 = a_{21} \tilde y_1 + a_{23} \tilde y_3 + \dots + a_{2n} \tilde y_n + b_{21} x_1 + b_{22} x_2 + \dots + b_{2m} x_m, \\
\dots \\
\tilde y_n = a_{n1} \tilde y_1 + a_{n2} \tilde y_2 + \dots + a_{n,n-1} \tilde y_{n-1} + b_{n1} x_1 + b_{n2} x_2 + \dots + b_{nm} x_m.
\end{cases}
\qquad (3)
$$
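The projection property of the reduced form (2) can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic data: the function name `reduced_form_ols`, the sizes `N`, `m`, `n` and the random observations are illustrative assumptions and are not taken from the paper. It fits the reduced coefficients by OLS and verifies that the residuals are orthogonal to every predefined-variable vector, which is what makes the fitted values an orthogonal projection onto the linear span.

```python
import numpy as np

def reduced_form_ols(X, Y):
    """Fit the reduced form Y ~ X @ B by OLS (hypothetical helper).

    X : (N, m) array, columns are the observation vectors x_1..x_m
    Y : (N, n) array, columns are the observation vectors y_1..y_n
    Returns B of shape (m, n), where B[k, i] is the coefficient of
    x_{k+1} in the equation for y_{i+1}, and the fitted values X @ B.
    """
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B, X @ B

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
N, m, n = 65, 3, 2
X = rng.normal(size=(N, m))
Y = X @ rng.normal(size=(m, n)) + 0.1 * rng.normal(size=(N, n))

B, Y_fit = reduced_form_ols(X, Y)
residuals = Y - Y_fit

# Orthogonal projection: residuals are orthogonal to each column of X,
# so every entry of X^T (Y - Y_fit) vanishes up to rounding error.
print(np.abs(X.T @ residuals).max())
```

Projecting the fitted values a second time leaves them unchanged, which is the idempotence expected of a projection.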
From (1)-(3), we obtain a system of m linear equations for identifying the structural coefficients of the i-th equation:

$$
\tilde y_i = \sum_{k=1}^{m} \beta_{ik} x_k
= \sum_{j=1;\, j \ne i}^{n} a_{ij} \tilde y_j + \sum_{k=1}^{m} b_{ik} x_k
= \sum_{j=1;\, j \ne i}^{n} a_{ij} \left( \sum_{k=1}^{m} \beta_{jk} x_k \right) + \sum_{k=1}^{m} b_{ik} x_k
\ \Rightarrow
$$

$$
\beta_{i1} = b_{i1} + \sum_{j=1;\, j \ne i}^{n} \beta_{j1} a_{ij}, \quad
\beta_{i2} = b_{i2} + \sum_{j=1;\, j \ne i}^{n} \beta_{j2} a_{ij}, \quad \dots, \quad
\beta_{im} = b_{im} + \sum_{j=1;\, j \ne i}^{n} \beta_{jm} a_{ij}.
\qquad (4)
$$

Let us consider the linear span L formed by combining the vectors of regression values of the endogenous variables on the right-hand side of the i-th equation of system (4) with the set of vectors of sample values of the predefined variables.

Identifiability of the structural coefficients of the i-th equation is determined by a combination of two factors: whether the left-hand side of the equation belongs to the linear span L of the vectors on the right-hand side of the SFM, and whether those vectors are linearly independent.

In this setting, the two-step OLS result coincides with the indirect OLS result whenever the equation of system (1) under consideration is exactly identifiable.

Setting up a problem

Bitumen makes up only 5-7% of the physical volume of asphalt, but it has the most significant influence on the quality and durability of the road surface. Bitumen production is among the most energy-intensive processes. The energy costs of producing oxidized bitumen consist of steam, fuel and electricity consumption [3].

The technology for producing residual bitumen is based on concentrating the heavy petroleum residues of vacuum distillation of tar. The characteristics of the tar affect the quality of road asphalt; in turn, these characteristics can be influenced by changing the hydrocarbon composition of the feedstock.

The main factors of the residual bitumen production process are the depth of the vacuum, the steam flow and the distillation temperature.
The physical and chemical properties of bitumen depend on the hydrocarbon composition of the raw materials. Thus, an inverse problem arises: to obtain the desired properties of the product, we need to determine the characteristics of the raw material composition and the parameters of the production process.

Mathematical model

For this purpose, it is proposed to use methods of mathematical modeling, in particular to set up a system of regression equations in which the input parameters are the parameters of the approved standards and the results of the calculations are the characteristics of the bitumen.

The input variables are as follows:
$x_1$ – needle penetration depth at 25 °C,
$x_2$ – needle penetration depth at 0 °C,
$x_3$ – extensibility at 25 °C,
$x_4$ – dynamic viscosity at 60 °C,
$x_5$ – softening temperature change after heating,
$x_6$ – extensibility after warming up.

The calculations make it possible to determine the following values:
$y_1$ – sulfur, %,
$y_2$ – paraffins and naphtha of tar, %,
$y_3$ – heating temperature,
$y_4$ – air consumption,
$y_5$ – warming-up duration,
$y_6$ – oil tar, %,
$y_7$ – VU80 tar index,
$y_8$ – light aromatics of tar, %.

In total, 65 observations were processed [4].

Each regression equation was set up within the a posteriori approach, which includes all feasible variables at first and then sequentially excludes the insignificant ones.

At the first step, we set up the regression equation with all six variables:

$y_1 = 4.05 + 0.014x_1 - 0.08x_2 + 0.003x_3 - 0.005x_4 - 0.11x_5 - 0.0014x_6$, $R^2 = 0.745399$.

Analysis of the Student's t-statistics of the coefficients left the following significant variables:

$y_1 = 5.17 - 0.075x_2 - 0.0054x_4 - 0.117x_5$, $R^2 = 0.72951$.
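The a posteriori selection described above can be sketched as a backward elimination loop. This is a minimal illustration under stated assumptions, not the authors' software [4]: the helper names `t_statistics` and `backward_elimination`, the fixed threshold `t_crit=2.0` (roughly the two-sided 5% critical value of Student's distribution for about 60 degrees of freedom) and the synthetic data are all assumptions introduced here.

```python
import numpy as np

def t_statistics(X, y):
    """OLS with an intercept; returns coefficients, t-statistics and R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance
    t = coef / np.sqrt(np.diag(cov))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coef, t, r2

def backward_elimination(X, y, t_crit=2.0):
    """Start from all regressors; repeatedly drop the least significant
    one until every remaining |t| is at least t_crit (assumed threshold)."""
    kept = list(range(X.shape[1]))
    while True:
        coef, t, r2 = t_statistics(X[:, np.array(kept, dtype=int)], y)
        if not kept:
            break
        abs_t = np.abs(t[1:])                     # skip the intercept
        worst = int(np.argmin(abs_t))
        if abs_t[worst] >= t_crit:
            break
        del kept[worst]
    return kept, coef, r2

# Synthetic data (not the paper's observations): y depends only on the
# columns with zero-based indices 1, 3 and 4.
rng = np.random.default_rng(1)
X = rng.normal(size=(65, 6))
y = 5.0 - 0.8 * X[:, 1] - 0.5 * X[:, 3] - 1.2 * X[:, 4] + 0.1 * rng.normal(size=65)

kept, coef, r2 = backward_elimination(X, y)
print(kept, r2)
```

Dropping one variable at a time and refitting matters because the t-statistics of the remaining coefficients change after each exclusion.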
Similarly, for the remaining criteria (the full equation first, then the equation with the significant variables only):

$y_2 = 2.61 + 0.17x_1 + 0.27x_2 - 0.043x_3 + 0.002x_4 + 0.89x_5 - 0.011x_6$, $R^2 = 0.759902$ ⇒ $y_2 = 3.34 + 0.23x_1 - 0.05x_3 + 1.35x_5$, $R^2 = 0.74412$;

$y_3 = 251.98 - 0.16x_1 - 0.58x_2 - 0.076x_3 + 0.001x_4 + 4.07x_5 - 0.03x_6$, $R^2 = 0.219015$ ⇒ $y_3 = 222.17 + 3.69x_5$, $R^2 = 0.191535$;

$y_4 = 7.15 - 0.02x_1 - 0.02x_2 - 0.003x_3 + 0.0004x_4 - 0.02x_5 - 0.006x_6$, $R^2 = 0.234903$ ⇒ $y_4 = 5.24 - 0.002x_6$, $R^2 = 0.05344$;

$y_5 = -6.73 + 0.13x_1 + 0.13x_2 - 0.01x_3 + 0.0075x_4 + 0.38x_5 - 0.0075x_6$, $R^2 = 0.743188$ ⇒ $y_5 = -10.73 + 0.18x_1 + 0.0064x_4 + 0.764x_5$, $R^2 = 0.705288$;

$y_6 = 11.72 + 0.0096x_1 - 0.16x_2 - 0.009x_3 - 0.0008x_4 + 0.18x_5 + 0.009x_6$, $R^2 = 0.121453$ ⇒ $y_6 = 12.12 - 0.16x_2 + 0.18x_5$, $R^2 = 0.061457$;

$y_7 = 543.24 - 4.75x_1 - 2.35x_2 - 0.51x_3 - 0.1x_4 - 12.16x_5 + 0.94x_6$, $R^2 = 0.677759$ ⇒ $y_7 = 321.05 - 4.59x_1 + 1.024x_6$, $R^2 = 0.61394$;

$y_8 = 13.21 - 0.066x_1 + 0.1x_2 + 0.0079x_3 - 0.024x_4 + 0.597x_5 - 0.0088x_6$, $R^2 = 0.502146$ ⇒ $y_8 = 14.96 - 0.0235x_4$, $R^2 = 0.393982$.

Let us set up a system of equations from the significant regression equations:

$y_1 = 5.17 - 0.075x_2 - 0.0054x_4 - 0.117x_5$;
$y_2 = 3.34 + 0.23x_1 - 0.05x_3 + 1.35x_5$;
$y_3 = 222.17 + 3.69x_5$;
$y_4 = 5.24 - 0.002x_6$;
$y_5 = -10.73 + 0.18x_1 + 0.0064x_4 + 0.764x_5$;
$y_7 = 321.05 - 4.59x_1 + 1.024x_6$;
$y_8 = 14.96 - 0.0235x_4$.

The parameters provided by the RF Standard are taken as the given variables. They are as close as possible to the European requirements that establish quality standards for EU road bitumen:

$x_1 = 60$; $x_2 = 13$; $x_3 = 80$; $x_4 = 320$; $x_5 = 3$; $x_6 = 40$.

Hence, $y_1 = 1.104$; $y_2 = 21.85$; $y_3 = 11.07$; $y_4 = -0.08$; $y_5 = 15.14$; $y_7 = -234.4$; $y_8 = 7.52$.

The results show that production outcomes are statistically unreliable when the production process is carried out with current technologies under the specified Standard values.

Forecast confidence intervals

Correlations calculated on the basis of 10 samples (Table 1) indicate a strong linear relationship between the following parameters [5]:
X – softening temperature,
Y1 – dynamic viscosity at 60 °C,
Y2 – needle penetration depth at 25 °C,
Y3 – extensibility after warming up.
This makes it possible to fit linear regressions between all pairs of the indices. The softening temperature X, which can be measured most accurately, was chosen as the common regressor.

Table 1. Linear correlation coefficient matrix

        X           Y1          Y2          Y3
X       1
Y1      0.983569    1
Y2     -0.977430    0.975450    1
Y3     -0.972410    0.961180    0.988793    1

The following significant regressions are obtained:

$Y_1 = -2038.63 + 45.29X$, $R^2 = 0.9674$;
$Y_3 = 868.37 - 15.97X$, $R^2 = 0.9456$.

The high values of the determination coefficients indicate that the proposed linear models are adequate to the experimental data. The parameters of the equations are reliably significant by Fisher's test at the significance levels α = 0.01, 0.05 and 0.10.

The link between Y2 and X has not been studied, as it is co-directional with the link between Y3 and X.

Forecast confidence intervals were found for Yregr1 (Table 2) and Yregr3 (Table 3). The values of the regressor X and the boundaries of the forecast confidence intervals for Y1 and Y3 that satisfy the Standard ($X \ge 49$, $Y_1 \ge 250$, $Y_3 \ge 80$) are marked in italics in Tables 2 and 3.

Thus, in Table 2 only the rows with $X \ge 50$ allow the values of Y1 to satisfy the Standard, while in Table 3 these are only the rows with $X \le 50.4$. However, at a significance level of $\alpha \ge 0.05$, the only row suitable in both tables is the one with X = 50.

The observed value of Y3 is substantially shifted toward the left (lower) boundaries of the confidence intervals (both at α = 0.05 and at α = 0.01), which do not meet the Standard.

Table 2. Forecast confidence intervals for dynamic viscosity Y1 according to the regression on softening point X

Variables     α = 0.01              α = 0.05              α = 0.10
X     Y1      Yregr1-Δ  Yregr1+Δ    Yregr1-Δ  Yregr1+Δ    Yregr1-Δ  Yregr1+Δ
48    138        79.68    191.18       97.11    173.74      104.53    166.32
48.5  159       104.14    212.00      121.01    195.14      128.18    187.96
49    171       128.20    233.23      144.63    216.81      151.62    209.82
49.3  180       142.43    246.18      158.66    229.96      165.56    223.05
50    230       175.00    277.02      190.95    261.07      197.74    254.28
50.4  241       193.19    295.07      209.12    279.14      215.90    272.36
51    300       219.91    322.70      235.99    306.62      242.82    299.79
51.8  316       254.51    360.57      271.10    343.99      278.15    336.93
52    320       262.98    370.21      279.75    353.45      286.89    346.31
53    341       304.42    419.36      322.40    401.39      330.04    393.74

Table 3. Forecast confidence intervals for extensibility after warming up Y3 according to the regression on softening point X

Variables     α = 0.01              α = 0.05              α = 0.10
X     Y3      Yregr3-Δ  Yregr3+Δ    Yregr3-Δ  Yregr3+Δ    Yregr3-Δ  Yregr3+Δ
48    110        75.94    127.34       83.98    119.30       87.40    115.88
48.5  100        68.79    118.51       76.57    110.74       79.88    107.43
49    80         61.46    109.87       69.03    102.30       72.25     99.08
49.3  79         56.96    104.78       64.44     97.31       67.62     94.12
50    64         46.18     93.20       53.53     85.85       56.66     82.72
50.4  55         39.82     86.78       47.17     79.44       50.29     76.31
51    47         30.03     77.41       37.44     70.00       40.59     66.85
51.8  46         16.50     65.38       24.14     57.74       27.39     54.49
52    43         13.03     62.46       20.76     54.73       24.05     51.44
53    25         -4.72     48.26        3.57     39.98        7.09     36.45

Conclusion

The Standard's bounds on the indices X, Y1 and Y3 are mutually contradictory, which means that the production process must be technologically upgraded if it is to meet European bitumen standards.

This example illustrates how identifiable systems of regression equations can be used. They help to determine the bounds on the production control factors that ensure product quality under the given conflicting criteria and random perturbations.

References

1. Kotenko AP. Geometry of Systems of Econometric Equations. Control of organizational and economic systems. Samara State Aerospace University, 2012; 9: 35-41. [In Russian]
2. Kotenko AP, Bukarenko MB. Geometry of Systems of Linear Regression Equations. News of Samara Science Center of Russian Academy of Science, 2013; Vol. 15; 6(3): 820-823. [In Russian]
3. Kotenko AP, Kuznetsova OA. The Use of Multivariate Regression Analysis for Optimizing of the Production of Bitumen of Standardized Specifications. Modern Information Technology and IT-Education, Proc. of 10th Conf., Moscow State University, 2015: 356-359. [In Russian]
4. Dokuchaev AV, Kotenko AA. Software for Multi-criteria Optimization of Production with the Help of Systems of Regression Equations. Rospatent. Certificate of registration No 2016612606. [In Russian]
5. Tukilina PM, Melnikov VN, Tyschenko VA, Ermakov VV, Pimenov AA. The Use of Multivariate Data Analysis Method in the Development of Production Technology of High-quality Road Bitumen. Chemistry and Technology of Fuels and Oils, 2015; 5: 18-23. [In Russian]