=Paper= {{Paper |id=Vol-1638/Paper72 |storemode=property |title=Multi-criteria optimization based on the regression equation systems identification |pdfUrl=https://ceur-ws.org/Vol-1638/Paper72.pdf |volume=Vol-1638 |authors=Andrey P. Kotenko,Darya A. Pshenina }} ==Multi-criteria optimization based on the regression equation systems identification == https://ceur-ws.org/Vol-1638/Paper72.pdf
Mathematical Modeling



MULTI-CRITERIA OPTIMIZATION BASED ON THE
      REGRESSION EQUATION SYSTEMS
             IDENTIFICATION

                         A.P. Kotenko (1,2), D.A. Pshenina (2)
                   (1) Samara National Research University, Samara, Russia
                   (2) Samara State Technical University, Samara, Russia



       Abstract. We consider the problem of multi-criteria optimization with conflicting
       criteria, for example a complex chemical production process with random parameters.
       The dependence of the target criteria on the control actions is described by
       regression equations. Investigating systems of interdependent regression equations
       requires a description of all possible variants of their properties. The most
       productive case is exact identification of the system parameters, which makes it
       possible to find the optimal values of the control parameters for manufacturing
       products of assured quality.


       Keywords: multivariate optimization, statistical parameters, system of linear
       regressions, conflicting criteria.


       Citation: Kotenko AP, Pshenina DA. Multi-criteria optimization based on the
       identification systems of regression equations. CEUR Workshop Proceedings,
       2016; 1638: 593-599. DOI: 10.18287/1613-0073-2016-1638-593-599


Basic notation system
Let the dependence of the optimization criteria y1, y2,…, yn on the controlling factors x1,
x2,…, xm be expressed by a system of linear regressions. In the structural form of the
model (SFM) [1,2]

$$
\begin{cases}
y_1 = a_{12} y_2 + a_{13} y_3 + \dots + a_{1n} y_n + b_{11} x_1 + b_{12} x_2 + \dots + b_{1m} x_m + \varepsilon_1,\\
y_2 = a_{21} y_1 + a_{23} y_3 + \dots + a_{2n} y_n + b_{21} x_1 + b_{22} x_2 + \dots + b_{2m} x_m + \varepsilon_2,\\
\dots\\
y_n = a_{n1} y_1 + a_{n2} y_2 + \dots + a_{n,n-1} y_{n-1} + b_{n1} x_1 + b_{n2} x_2 + \dots + b_{nm} x_m + \varepsilon_n,
\end{cases}
\tag{1}
$$

we associate the endogenous variables with the observation vectors
$\vec y_i = (y_{i1}, y_{i2}, \dots, y_{iN})^T \in \mathbb{R}^N$, $i = \overline{1,n}$, and the
exogenous variables with the observation vectors
$\vec x_j = (x_{j1}, x_{j2}, \dots, x_{jN})^T \in \mathbb{R}^N$, $j = \overline{1,m}$.
It corresponds to the reduced form of the model (RFM) [1,2]


Information Technology and Nanotechnology (ITNT-2016)                                      593
Mathematical Modeling                                       Kotenko A.P, Pshenina D.A. Multi-criteria…


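$$
\tilde y_1 = \beta_{11}\vec x_1 + \beta_{12}\vec x_2 + \dots + \beta_{1m}\vec x_m,\quad
\tilde y_2 = \beta_{21}\vec x_1 + \beta_{22}\vec x_2 + \dots + \beta_{2m}\vec x_m,\quad \dots,\quad
\tilde y_n = \beta_{n1}\vec x_1 + \beta_{n2}\vec x_2 + \dots + \beta_{nm}\vec x_m
\tag{2}
$$

with regression values $\tilde y_i = (\tilde y_{i1}, \tilde y_{i2}, \dots, \tilde y_{iN})^T \in \mathbb{R}^N$
and the reduced coefficients $\beta_{ij}$ found by the Ordinary Least Squares (OLS) method.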
The vectors of regression values of the endogenous variables are obtained by orthogonal
projection Pr of the observation vectors of the endogenous variables onto the linear span [1]
$L(\vec x_1, \vec x_2, \dots, \vec x_m) \subset \mathbb{R}^N$ of the observation vectors of the
predefined variables, where $\dim L(\vec x_1, \vec x_2, \dots, \vec x_m) = m$:

$$
\tilde y_i = \operatorname{Pr}\bigl(\vec y_i;\ L(\vec x_1, \vec x_2, \dots, \vec x_m)\bigr) \in L(\vec x_1, \vec x_2, \dots, \vec x_m).
$$
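
The projection property can be checked numerically. The following is a minimal sketch on synthetic data (the sample size, the number of regressors and the random seed are illustrative assumptions, not values from the paper): the OLS fitted values coincide with the orthogonal projection of the observation vector onto the span of the regressor vectors, and the residual is orthogonal to that span.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 20, 3                     # hypothetical sample size and number of regressors
X = rng.normal(size=(N, m))      # columns play the role of the vectors x_1, ..., x_m
y = rng.normal(size=N)           # one observed endogenous vector y_i

# OLS coefficients minimise ||y - X beta||; X @ beta are the regression values.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ beta

# Orthogonal projector onto the column span of X (X has full column rank here).
P = X @ np.linalg.inv(X.T @ X) @ X.T
y_proj = P @ y

# The residual must be orthogonal to every regressor vector.
residual = y - y_fit
print(np.allclose(y_fit, y_proj), np.allclose(X.T @ residual, 0.0, atol=1e-9))
```

The explicit projector is formed here only for illustration; for larger problems `np.linalg.lstsq` alone is the more stable choice.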

Since the structural coefficients are functions of the solutions of the reduced system of
linear algebraic equations (SLAE), the solvability of the SFM identification problem (1)
is determined by the ranks of the system's matrix and of its augmented matrix.
Accordingly, the following cases are possible:
     • one solution (the equation is exactly identifiable by indirect OLS);
     • no solution (an overidentified equation, to which two-step OLS is applicable);
     • infinitely many solutions (an unidentifiable equation of the SFM system).
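
The three cases follow from comparing the rank of the system's matrix with the rank of the augmented matrix. A minimal sketch of this rank test is given below; the function name `classify_slae` and the small example systems are hypothetical and serve only as an illustration.

```python
import numpy as np

def classify_slae(A, b, tol=1e-10):
    """Classify the SLAE A z = b by the ranks of A and of the augmented matrix [A | b]."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    rank_A = np.linalg.matrix_rank(A, tol=tol)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]), tol=tol)
    if rank_A < rank_Ab:
        return "no solution"                  # overidentified equation
    if rank_A == A.shape[1]:
        return "one solution"                 # exactly identifiable equation
    return "infinitely many solutions"        # unidentifiable equation

print(classify_slae([[1, 0], [0, 1]], [1, 2]))
print(classify_slae([[1, 0], [0, 1], [1, 1]], [1, 2, 4]))
print(classify_slae([[1, 1]], [2]))
```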
Replacing the sample values of the endogenous variables of SFM (1) by the regression
values from RFM (2), we obtain, in the linear span $L(\vec x_1, \vec x_2, \dots, \vec x_m)$ of the
sample values of the predefined variables of system (1), the following independent linear
equations, which enable us to identify the structural coefficients:

$$
\begin{cases}
\tilde y_1 = a_{12}\tilde y_2 + a_{13}\tilde y_3 + \dots + a_{1n}\tilde y_n + b_{11}\vec x_1 + b_{12}\vec x_2 + \dots + b_{1m}\vec x_m,\\
\tilde y_2 = a_{21}\tilde y_1 + a_{23}\tilde y_3 + \dots + a_{2n}\tilde y_n + b_{21}\vec x_1 + b_{22}\vec x_2 + \dots + b_{2m}\vec x_m,\\
\dots\\
\tilde y_n = a_{n1}\tilde y_1 + a_{n2}\tilde y_2 + \dots + a_{n,n-1}\tilde y_{n-1} + b_{n1}\vec x_1 + b_{n2}\vec x_2 + \dots + b_{nm}\vec x_m.
\end{cases}
\tag{3}
$$

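From (1)-(3), we obtain the system of m linear equations that identifies the structural
coefficients of the i-th equation:

$$
\tilde y_i = \sum_{k=1}^{m}\beta_{ik}\vec x_k
= \sum_{j=1;\ j\ne i}^{n} a_{ij}\tilde y_j + \sum_{k=1}^{m} b_{ik}\vec x_k
= \sum_{j=1;\ j\ne i}^{n} a_{ij}\sum_{k=1}^{m}\beta_{jk}\vec x_k + \sum_{k=1}^{m} b_{ik}\vec x_k
\ \Rightarrow
$$

$$
\beta_{i1} = b_{i1} + \sum_{j=1;\ j\ne i}^{n}\beta_{j1}a_{ij},\quad
\beta_{i2} = b_{i2} + \sum_{j=1;\ j\ne i}^{n}\beta_{j2}a_{ij},\quad \dots,\quad
\beta_{im} = b_{im} + \sum_{j=1;\ j\ne i}^{n}\beta_{jm}a_{ij}.
\tag{4}
$$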
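
For an exactly identifiable equation, system (4) can be solved directly for the structural coefficients (indirect OLS). The sketch below illustrates this on a synthetic two-equation model with n = m = 2, in which x2 is excluded from the first equation (b12 = 0) and x1 from the second (b21 = 0). All numerical values, the noise level and the variable names are assumptions made for the illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
a12, a21, b11, b22 = 0.5, 0.3, 2.0, -1.0   # hypothetical structural coefficients

A = np.array([[0.0, a12], [a21, 0.0]])     # coefficients of endogenous variables
B = np.array([[b11, 0.0], [0.0, b22]])     # coefficients of predefined variables

X = rng.normal(size=(N, 2))                # columns: observations of x_1, x_2
E = 0.01 * rng.normal(size=(N, 2))         # small random disturbances
# Each observation satisfies (I - A) y = B x + eps; rows of Y hold (y_1, y_2).
Y = (X @ B.T + E) @ np.linalg.inv(np.eye(2) - A).T

# Reduced form (2): beta[i, k] is the OLS coefficient of x_k in the equation for y_i.
beta = np.linalg.lstsq(X, Y, rcond=None)[0].T

# System (4) for i = 1 with b_12 = 0:
#   beta_11 = b_11 + beta_21 * a_12,   beta_12 = beta_22 * a_12.
a12_hat = beta[0, 1] / beta[1, 1]
b11_hat = beta[0, 0] - beta[1, 0] * a12_hat
print(a12_hat, b11_hat)
```

With only small disturbances the recovered values stay close to the assumed a12 and b11; with larger noise the estimates become correspondingly less precise.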


Let us consider the linear span L generated by the set of regression-value vectors of the
endogenous variables on the right-hand side of the i-th equation of system (4), together
with the set of vectors of sample values of the predefined variables.



The identifiability of the structural coefficients of the i-th equation is determined by a
combination of two factors: whether its left-hand side belongs to the linear span L of the
right-hand-side vectors of the SFM, and whether these vectors are linearly (in)dependent.
If the considered equation of system (1) is exactly identifiable, the two-step OLS result
coincides with the indirect OLS result.


Setting up a problem

Bitumen makes up only 5-7% of the physical volume of asphalt, but it has the most
significant influence on the quality and durability of the road surface. Bitumen
production is one of the most energy-intensive processes. The energy costs of producing
oxidized bitumen consist of steam, fuel and electricity consumption [3].
Residual bitumen production technology is based on the concentration of heavy petroleum
residues of tar vacuum distillation. The characteristics of the tar affect the quality of
road asphalts; in turn, these characteristics can be influenced by changing the
hydrocarbon composition of the feedstock.
The main factors of the residual bitumen production process are the depth of the vacuum,
the steam flow and the distillation temperature. The physical and chemical properties of
bitumen depend on the hydrocarbon composition of the raw materials. Thus, an inverse
problem arises: to obtain the desired properties of the product, we need to determine the
characteristics of the raw material composition and the production process parameters.


Mathematical model
For this purpose, it is proposed to use methods of mathematical modeling, in particular
to set up a system of regression equations in which the input parameters are the
parameters of the approved standards and the results of the calculations are the
characteristics of the bitumen.
The input variables are as follows:
x1 – penetration depth of the needle at 25 °C,
x2 – penetration depth of the needle at 0 °C,
x3 – extensibility at 25 °C,
x4 – dynamic viscosity at 60 °C,
x5 – softening temperature change after heating,
x6 – extensibility after warming up.
The calculations enable us to determine the following values:
y1 – sulfur, %,
y2 – paraffins and naphtha of tar, %,
y3 – heating temperature,
y4 – air consumption,
y5 – warming-up duration,
y6 – oil tar, %,




y7 – VU80 tar index,
y8 – light aromatics of tar, %.
In total, 65 observations have been processed [4].
Each regression equation was set up within the a posteriori approach: all feasible
variables are included first, and the insignificant ones are then excluded sequentially.
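
The sequential exclusion of insignificant variables can be sketched as a backward elimination loop. The code below is only an illustration under stated assumptions: the data are synthetic (the 65 observations of the paper are not reproduced here), the function name `backward_elimination` is hypothetical, and the fixed threshold |t| ≥ 2 is a rough stand-in for the Student critical value at the 5% level rather than the authors' exact criterion.

```python
import numpy as np

def backward_elimination(X, y, t_min=2.0):
    """Drop the regressor with the smallest |t| until all remaining |t| >= t_min."""
    n_obs = X.shape[0]
    keep = list(range(X.shape[1]))
    while keep:
        Z = np.column_stack([np.ones(n_obs), X[:, keep]])  # intercept + kept regressors
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        dof = n_obs - Z.shape[1]
        sigma2 = resid @ resid / dof                        # residual variance estimate
        cov = sigma2 * np.linalg.inv(Z.T @ Z)               # covariance of the estimates
        t_abs = np.abs(coef / np.sqrt(np.diag(cov)))[1:]    # skip the intercept
        worst = int(np.argmin(t_abs))
        if t_abs[worst] >= t_min:
            break
        del keep[worst]
    return keep

# Synthetic example: 65 observations, 6 candidate variables, 2 of them relevant.
rng = np.random.default_rng(2)
X = rng.normal(size=(65, 6))
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=65)
selected = backward_elimination(X, y)
print(selected)
```

The returned list holds the column indices of the variables that survive the elimination.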
At the first step, let us set up regression equations with all six variables:
y1=4,05+0,014x1–0,08x2+0,003x3–0,005x4–0,11x5–0,0014x6, R²=0,745399.
Analysis of the Student's t-statistics of the coefficients leaves the following
significant variables:
y1=5,17–0,075x2–0,0054x4–0,117x5, R²=0,72951.
Similarly (each full equation is followed by its version with significant variables only):
y2=2,61+0,17x1+0,27x2–0,043x3+0,002x4+0,89x5–0,011x6, R²=0,759902;
y2=3,34+0,23x1–0,05x3+1,35x5, R²=0,74412;
y3=251,98–0,16x1–0,58x2–0,076x3+0,001x4+4,07x5–0,03x6, R²=0,219015;
y3=222,17+3,69x5, R²=0,191535;
y4=7,15–0,02x1–0,02x2–0,003x3+0,0004x4–0,02x5–0,006x6, R²=0,234903;
y4=5,24–0,002x6, R²=0,05344;
y5=–6,73+0,13x1+0,13x2–0,01x3+0,0075x4+0,38x5–0,0075x6, R²=0,743188;
y5=–10,73+0,18x1+0,0064x4+0,764x5, R²=0,705288;
y6=11,72+0,0096x1–0,16x2–0,009x3–0,0008x4+0,18x5+0,009x6, R²=0,121453;
y6=12,12–0,16x2+0,18x5, R²=0,061457;
y7=543,24–4,75x1–2,35x2–0,51x3–0,1x4–12,16x5+0,94x6, R²=0,677759;
y7=321,05–4,59x1+1,024x6, R²=0,61394;
y8=13,21–0,066x1+0,1x2+0,0079x3–0,024x4+0,597x5–0,0088x6, R²=0,502146;
y8=14,96–0,0235x4, R²=0,393982.
Let us assemble a system from the significant regression equations:
y1=5,17–0,075x2–0,0054x4–0,117x5;
y2=3,34+0,23x1–0,05x3+1,35x5;
y3=222,17+3,69x5;
y4=5,24–0,002x6;
y5=–10,73+0,18x1+0,0064x4+0,764x5;
y7=321,05–4,59x1+1,024x6;
y8=14,96–0,0235x4.
The parameters prescribed by the RF Standard are taken as the given variables.
They are as close as possible to the European requirements that establish quality
standards for EU road bitumen:
x1=60; x2=13; x3=80; x4=320; x5=3; x6=40.
Hence,
y1=1,104; y2=21,85; y3=11,07; y4=–0,08; y5=15,14; y7=–234,4; y8=7,52.
The results indicate that the production outcomes are statistically unreliable when the
production process is carried out in accordance with current technologies under the
specified Standard values.
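
Evaluating the system at the Standard values is a plain substitution, sketched below with the coefficients and x values copied from the text (the dictionary layout and the function name `evaluate` are illustrative assumptions). The sketch reports the full value of each equation and, separately, the contribution of the slope terms alone, so that either can be compared with the values quoted above.

```python
# Significant equations as printed: name -> (intercept, {index of x: coefficient}).
EQUATIONS = {
    "y1": (5.17, {2: -0.075, 4: -0.0054, 5: -0.117}),
    "y2": (3.34, {1: 0.23, 3: -0.05, 5: 1.35}),
    "y3": (222.17, {5: 3.69}),
    "y4": (5.24, {6: -0.002}),
    "y5": (-10.73, {1: 0.18, 4: 0.0064, 5: 0.764}),
    "y7": (321.05, {1: -4.59, 6: 1.024}),
    "y8": (14.96, {4: -0.0235}),
}
# Standard values of the input variables x1..x6.
STANDARD_X = {1: 60, 2: 13, 3: 80, 4: 320, 5: 3, 6: 40}

def evaluate(intercept, slopes, x):
    """Value of one linear regression equation at the point x."""
    return intercept + sum(coef * x[j] for j, coef in slopes.items())

full_value = {name: evaluate(b0, s, STANDARD_X) for name, (b0, s) in EQUATIONS.items()}
slope_part = {name: evaluate(0.0, s, STANDARD_X) for name, (_, s) in EQUATIONS.items()}

for name in EQUATIONS:
    print(f"{name}: full = {full_value[name]:.3f}, slope terms only = {slope_part[name]:.3f}")
```

For y3, for example, the slope term alone is 3,69·3 = 11,07, while adding the intercept 222,17 gives 233,24.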







Forecast confidence intervals
Correlations calculated from 10 samples (Table 1) indicate a strong linear relationship
between the following parameters [5]:
X – softening temperature,
Y1 – dynamic viscosity at 60 °C,
Y2 – needle penetration depth at 25 °C,
Y3 – extensibility after warming up.
This makes it possible to build linear regressions between all pairs of these indices.
The softening temperature X, which can be measured most accurately, was chosen as the
common regressor.

                        Table 1. Linear correlation coefficient matrix
         X               Y1              Y2            Y3
 X       1
 Y1      0,983569        1
 Y2      –0,977430       –0,975450       1
 Y3      –0,972410       –0,961180       0,988793      1
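
The coefficients involving X, Y1 and Y3 can be recomputed from the ten observations listed in Tables 2 and 3 (observations of Y2 are not tabulated in the paper, so that row is not covered). A minimal sketch, with a hypothetical helper `pearson`:

```python
from math import sqrt

# Observed values copied from the first two columns of Tables 2 and 3.
X = [48, 48.5, 49, 49.3, 50, 50.4, 51, 51.8, 52, 53]
Y1 = [138, 159, 171, 180, 230, 241, 300, 316, 320, 341]
Y3 = [110, 100, 80, 79, 64, 55, 47, 46, 43, 25]

def pearson(u, v):
    """Sample linear correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)

print(f"r(X, Y1)  = {pearson(X, Y1):.6f}")
print(f"r(X, Y3)  = {pearson(X, Y3):.6f}")
print(f"r(Y1, Y3) = {pearson(Y1, Y3):.6f}")
```

Since Y1 grows with X while Y3 falls with X, the coefficient between Y1 and Y3 comes out negative.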

The following significant regressions are obtained:
Y1=–2038,63+45,29X, R²=0,9674; Y3=868,37–15,97X, R²=0,9456.
The high coefficients of determination indicate that the proposed linear models are
adequate to the experimental data. The parameters of the equations remain significant
by Fisher's test at each of the significance levels α=0,01; 0,05; 0,10.
The link between Y2 and X has not been studied, as it has the same direction as the link
between Y3 and X. Forecast confidence intervals were found for Yregr1 (Table 2)
and Yregr3 (Table 3).
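
The regression for Y1 and its forecast intervals can be reproduced from the observations in Table 2. The sketch below assumes the standard interval for an individual forecast in pairwise linear regression, Yregr ± t·s·sqrt(1 + 1/n + (x0 – mean(X))²/Sxx); this formula is an assumption about how the tables were built. The Student quantiles for n – 2 = 8 degrees of freedom are hard-coded from standard tables, and the function names are hypothetical.

```python
from math import sqrt

# Observed values copied from the first two columns of Table 2.
X = [48, 48.5, 49, 49.3, 50, 50.4, 51, 51.8, 52, 53]
Y1 = [138, 159, 171, 180, 230, 241, 300, 316, 320, 341]

# Two-sided Student quantiles t(1 - alpha/2) for 8 degrees of freedom (table values).
T_CRIT = {0.01: 3.355, 0.05: 2.306, 0.10: 1.860}

def fit_line(x, y):
    """OLS fit y = intercept + slope * x; also returns mean(x), Sxx and the residual SS."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return intercept, slope, mean_x, sxx, sse

def forecast_interval(x, y, x0, alpha):
    """Interval for an individual forecast at x0 (assumed construction of the tables)."""
    n = len(x)
    intercept, slope, mean_x, sxx, sse = fit_line(x, y)
    s = sqrt(sse / (n - 2))                      # residual standard error
    half = T_CRIT[alpha] * s * sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    center = intercept + slope * x0
    return center - half, center + half

intercept, slope, *_ = fit_line(X, Y1)
low, high = forecast_interval(X, Y1, 48, 0.05)
print(f"Y1 = {intercept:.2f} + {slope:.2f} X; interval at X=48, alpha=0.05: ({low:.2f}, {high:.2f})")
```

The same functions applied to the Y3 column of Table 3 give the second regression and its intervals.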
The values of the regressor X and the boundaries of the forecast confidence intervals
for Y1 and Y3 that satisfy the Standard (X ≥ 49, Y1 ≥ 250, Y3 ≥ 80) are marked in
Tables 2 and 3 by italics. Thus, in Table 2 only the rows with X ≥ 50 allow the values
of Y1 to satisfy the Standard, while in Table 3 these are only the rows with X ≤ 50,4.
However, at the significance level α=0,05 only the row with X=50 is suitable in both
tables. The observed values of Y3 are noticeably shifted toward the left boundaries of
the confidence intervals (both at α=0,05 and at α=0,01), which do not meet the Standard.
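
The row selection can be sketched as a simple filter over the upper interval boundaries of Tables 2 and 3. Two assumptions are made here and should be read as such: the Standard thresholds are treated as "not less than" limits (X ≥ 49, Y1 ≥ 250, Y3 ≥ 80), and a row is counted as suitable when the upper boundary of each forecast interval reaches the corresponding threshold. The names `feasible_x`, `Y1_UPPER` and `Y3_UPPER` are hypothetical.

```python
X_VALUES = [48, 48.5, 49, 49.3, 50, 50.4, 51, 51.8, 52, 53]

# Upper interval boundaries Yregr + delta copied from Tables 2 and 3, keyed by alpha.
Y1_UPPER = {
    0.01: [191.18, 212.00, 233.23, 246.18, 277.02, 295.07, 322.70, 360.57, 370.21, 419.36],
    0.05: [173.74, 195.14, 216.81, 229.96, 261.07, 279.14, 306.62, 343.99, 353.45, 401.39],
    0.10: [166.32, 187.96, 209.82, 223.05, 254.28, 272.36, 299.79, 336.93, 346.31, 393.74],
}
Y3_UPPER = {
    0.01: [127.34, 118.51, 109.87, 104.78, 93.20, 86.78, 77.41, 65.38, 62.46, 48.26],
    0.05: [119.30, 110.74, 102.30, 97.31, 85.85, 79.44, 70.00, 57.74, 54.73, 39.98],
    0.10: [115.88, 107.43, 99.08, 94.12, 82.72, 76.31, 66.85, 54.49, 51.44, 36.45],
}

# Assumed "not less than" Standard thresholds for X, Y1 and Y3.
X_MIN, Y1_MIN, Y3_MIN = 49, 250, 80

def feasible_x(alpha):
    """X values for which both forecast intervals can reach the Standard thresholds."""
    return [
        x
        for x, upper1, upper3 in zip(X_VALUES, Y1_UPPER[alpha], Y3_UPPER[alpha])
        if x >= X_MIN and upper1 >= Y1_MIN and upper3 >= Y3_MIN
    ]

for alpha in (0.01, 0.05, 0.10):
    print(alpha, feasible_x(alpha))
```

Under these assumptions the filter leaves a very narrow range of softening temperatures, which illustrates how the requirements on Y1 and Y3 pull X in opposite directions.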






Table 2. Forecast confidence intervals for dynamic viscosity Y1 from its regression on
                                   softening point X
   Variables             α=0,01                     α=0,05                    α=0,10
   X       Y1     Yregr1–Δ     Yregr1+Δ    Yregr1–Δ     Yregr1+Δ     Yregr1–Δ     Yregr1+Δ
  48      138      79,68       191,18       97,11        173,74       104,53       166,32
 48,5     159      104,14      212,00       121,01       195,14       128,18       187,96
  49      171      128,20      233,23       144,63       216,81       151,62       209,82
 49,3     180      142,43      246,18       158,66       229,96       165,56       223,05
  50      230      175,00      277,02       190,95       261,07       197,74       254,28
 50,4     241      193,19      295,07       209,12       279,14       215,90       272,36
  51      300      219,91      322,70       235,99       306,62       242,82       299,79
 51,8     316      254,51      360,57       271,10       343,99       278,15       336,93
  52      320      262,98      370,21       279,75       353,45       286,89       346,31
  53      341      304,42      419,36       322,40       401,39       330,04       393,74


  Table 3. Forecast confidence intervals for extensibility after warming up Y3 from its
                             regression on softening point X
   Variables             α=0,01                     α=0,05                    α=0,10
   X       Y3     Yregr3–Δ     Yregr3+Δ    Yregr3–Δ     Yregr3+Δ     Yregr3–Δ     Yregr3+Δ
  48      110      75,94       127,34       83,98        119,30       87,40        115,88
 48,5     100      68,79       118,51       76,57        110,74       79,88        107,43
  49       80      61,46       109,87       69,03        102,30       72,25        99,08
 49,3      79      56,96       104,78       64,44        97,31        67,62        94,12
  50       64      46,18        93,20       53,53        85,85        56,66        82,72
 50,4      55      39,82        86,78       47,17        79,44        50,29        76,31
  51       47      30,03        77,41       37,44        70,00        40,59        66,85
 51,8      46      16,50        65,38       24,14        57,74        27,39        54,49
  52       43      13,03        62,46       20,76        54,73        24,05        51,44
  53       25      –4,72        48,26        3,57        39,98         7,09        36,45







Conclusion
The Standard boundaries for the indices X, Y1 and Y3 are mutually contradictory, which
means that the production process needs a technological upgrade if it is aimed at
European bitumen standards.
This example illustrates the possibility of using identifiable systems of regression
equations. They help to determine the boundaries of the control factors of production
that ensure product quality under given conflicting criteria and random perturbations.


References
1. Kotenko AP. Geometry of Systems of Econometric Equations. Control of organizational and
   economic systems. Samara State Aerospace University, 2012; 9: 35-41. [In Russian]
2. Kotenko AP, Bukarenko MB. Geometry of Systems of Linear Regression Equations. News
   of Samara Science Center of Russian Academy of Science, 2013; 15(6(3)): 820-823. [In
   Russian]
3. Kotenko AP, Kuznetsova OA. The Use of Multivariate Regression Analysis for Optimizing
   of the Production of Bitumen of Standardized Specifications. Modern Information
   Technology and IT-Education, Proc. of 10th Conf., Moscow State University, 2015:
   356-359. [In Russian]
4. Dokuchaev AV, Kotenko AA. Software for Multi-criteria Optimization of Production with
   the Help of Systems of Regression Equations. Rospatent. Certificate of registration No
   2016612606. [In Russian]
5. Tukilina PM, Melnikov VN, Tyschenko VA, Ermakov VV, Pimenov AA. The Use of
   Multivariate Data Analysis Method in the Development of Production Technology of
   High-quality Road Bitumen. Chemistry and Technology of Fuels and Oils, 2015; 5: 18-23.
   [In Russian]



