=Paper=
{{Paper
|id=Vol-1623/paperapp7
|storemode=property
|title=Application of Active Set Method for Soft Sensor Model Identification of Crude Oil Distillation Process
|pdfUrl=https://ceur-ws.org/Vol-1623/paperapp7.pdf
|volume=Vol-1623
|authors=Anton Goncharov,Andrey Torgashov
|dblpUrl=https://dblp.org/rec/conf/door/GoncharovT16
}}
==Application of Active Set Method for Soft Sensor Model Identification of Crude Oil Distillation Process==
Application of Active Set Method for Soft Sensor Model Identification of Crude Oil Distillation Process Anton Goncharov 1, Andrei Torgashov 1,2 1 Institute of Automation and Control Processes FEB RAS, Vladivostok, Russia, antalg@mail.ru 2 Far Eastern Federal University, Vladivostok, Russia, torgashov@iacp.dvo.ru Abstract. The problem of identification of the soft sensors is considered. Using the active set method the problem of least squares with inequality constraints on the variables has been solved. As a result of using soft sensors, obtained taking into account constraints on the model coefficients, their efficiency as compared with soft sensors obtained without constraints is shown. Keywords: parametric identification, optimization, soft sensor, constraints, dis- tillation column. 1 Introduction and statement of the problem The approach to solve the soft sensor model obtaining problem, taking into account inequality constraints on the model coefficients, is described [1]. The obtained results are tested on industrial oil fractionation process for atmospheric distillation column. The soft sensor model coefficients for prediction of the key product quality are identified. The technological plant with several measured inputs u1 , u 2 ,, u N and one output y ( ) is considered. The measured technological parameters (pressure, temperature, flow) are utilized as inputs. In practice, the amount of available process measured parameters to predict the quality of a product is much more than the required number of parameters. A priori knowledge of technological process allows to select necessary parameters. The problem of the soft sensor (SS) evaluation which is best predicting quality of products of crude distillation technological process is considered. The model of the soft sensor is obtained in the form of linear regression model for solution of the problem [2]: y( ) b0 b1u1 b2u2 ... bN uN , (1) Copyright © by the paper's authors. Copying permitted for private and academic purposes. In: A. Kononov et al. (eds.): DOOR 2016, Vladivostok, Russia, published at http://ceur-ws.org 724 A. Goncharov and A. Torgashov where b j – j-th model coefficient, j 0,1,..., N , b0 – constant term, N – the number of input variables, - irregular timepoints of output measurement: 1 , 2 , 3 ,... , i i 1 0 , i 2 ; 1 0 ; 0 - constant term; - random component is limited by specific range. The determination coefficient (the part of explained deviations variance of the dependent variable from its mean value): R 2 1 i( yi yi ) 2 i( yi y a ) 2 (2) and root mean squared error (RMSE): RMSE i 1 ( y i y i ) 2 / M M , 1/ 2 (3) are used as criteria of identification on a given time interval, where y i - the measured value of the output variable, y i - the value is obtained based on the SS, y a - the mean value of the measured output variable, M - the number of output measurement. The model is more consistent if the closer to unity the value of the coefficient of determination R 2 , or the closer to zero the value of the RMSE. 2 The proposed algorithm problem solution Let u [1, u1 ( ), u2 ( ),..., uN ( )]T be a combined vector of measured input variables, b = [b0 , b1 ,..., bN ]T - vector of coefficients of the same dimension, the components of which reflect the contributions of the respective input variables. Then the equation (1) takes the form: y uT b . We form the vector Y of dimension q from the output value y: Y [ y( 1 ), y( 2 ),..., y( q )]T and the matrix U, containing the measured inputs uj, corresponding to output value y from (1): 1 u1 ( 1 ) u2 ( 1 ) ....uN ( 1 ) 1 u1 ( 2 ) u2 ( 2 ) ....uN ( 2 ) U , 1 u1 ( q ) u2 ( q ) ...u N ( q ) Application of Active Set Method 725 and we write the matrix equation: Y Ub . We introduce error function: E Y Y Y Ub , where Y is the actual measurement of output, and minimize the objective function: Ψ E 2 (Y Ub ) 2 . (4) We obtain estimates of the parameters b by least squares method: b (UT U)1 UT Y . (5) The multicollinearity case, which occurs when there is almost a linear relationship between inputs, is considered. In this case the matrix C U T U is close to singular, so it is the smallest eigenvalue min 0 and the condition number is infinitely increased and causing the instability of the solution (5). If the min 0 then it corresponds the strict multicollinearity [3]. In order to obtain a stable solution of the equation (5) it is necessary to reduce the condition number of the matrix C, for example, by adding thereto a diagonal matrix B kI (k> 0). Then the solution is found in a class of ridge parameter estimates: b (UT U kI)1 UT Y . (6) The quality, obtained by (5-6) models, depends on the number of available output measurements. The length of training sample is often insufficient to obtain reliable results. Also, the available data contain significant measurement error of inputs and outputs, unmeasured influences. Taking into account constraints on the model coefficients b j allows to avoid these problems. When taking into account constraints on coefficients at input, the problem of least squares with simple constraints on the variables is solved: min Y Ub 2 (7) b min b b max . The solution of the problem (7) is obtained by the active set numerical method [4]. The given constraints are reduced to the form: Ab bˆ , 726 A. Goncharov and A. Torgashov 1 0 0 0 1 0 bˆ b min where 0 0 1 , max . A b 1 0 0 0 1 0 0 0 1 Constraint aTi b bˆi call active in acceptable point b if aTi b bˆi , and inactive if aTi b bˆi , aTi - i-th row of A . The sufficient minimum conditions for simple constraints are as follows: FR b FR b FR 1. b min b * b max , b min * max 2. UTFR (Y Ub* ) 0 3. λ min UTmin (Y Ub * ) , λ imin 0 , i 1,, t min , (8) λ max UTmax (Y Ub* ) , λ max 0 , i 1,, t max i 4. U TFR U FR is positive definite, where b* - the minimum point of the solution of problem (7); subscript FR indicates that in the vector and matrices the elements and columns with index numbers corresponding to the index numbers of b elements, that have not met the boundary values (7), are used; subscript min, max, indicates that the in matrix only the columns with index numbers corresponding to the index numbers of b elements, taking the appropriate minimum or maximum boundary value, are used. tmin,, tmax.- number of active upper and lower limits respectively; λ min , λ max - vectors of Lagrange multipliers corresponding to the lower and upper active constraints. To start the method of the active set it is necessary to determine the starting point using (6). The minimum point b* for the search algorithm for iteration k is: 1. Performance verification of the stop conditions. (Reaching the performance errors of conditions (8), constraints on the number of iterations). 2. Selection of a logic branch. Does it make sense to remove any constraint of the set of active constraints list. The condition of performance of a condition 3 in (8) is checked. If the condition is not satisfied for some of the vector element, constraint is excluded from the list of active constraints. 3. The calculation of the search direction p k . Like (4) solves the problem min Y Ub k U FRp FR . Calculate the non-zero N 1 tk - dimensional vector 2 Application of Active Set Method 727 p FR and the direction of search p k AT FR p FR , where t k - the number of active constraints on k iteration. b FR p 4. Calculate the step length k . From Ψ FR bˆ FR diagonal matrix b FR p FR Ψ is calculated, b̂ FR - consists of the elements b̂ which aren't active constraints, elements b̂ , opposite boundary values in (7) for constraints in the active set, are excluded from b̂ FR . The k min Ψii is an available minimum positive step from b k along p k . The index j of minimum positive diagonal element Ψ is remembered. If k 1 , then k 1 , otherwise k k . 5. Constraint is added in the list of active constraints. If k , then j constraint bˆ FR becomes active, it is necessary to add to the list. 6. Recalculation approximation. bk 1 bk k pk is calculated, and return to step 1 of the algorithm is carried out. The influence of the process dynamics on the quality of the products is taken into account by the dynamic SS. The predictive model is represented as a sum of convolutions of plant inputs and a finite impulse response (FIR) hi (discrete analogues of the first degree Volterra kernels): y( ) h0 k10 h1 (k 1)u1 ( k ) k 20 h2 (k 1)u 2 ( k ) ... k N0 hN (k 1)u N ( k ), (9) n 1 n 1 n 1 where h0 – constant term, - irregular timepoints of output measurement: 1 , 2 , 3 ,... , i i 1 0 , i 2 ; 1 0 ; 0 - constant component; - random component is limited by the specific range. Let u [1, u1 ( ),..., u1 ( n1 1),..., u N ( ),..., u N ( n N 1)]T - the combined vector of measured input variables of dynamic SS (DSS) with dimensionality q 1 k 1 nk where nk - is a number of values of k-th input, N h h0 , h1 (1), ..., h1 (n1 ), ..., hN (1), ..., hN (nN ) - vector FIR of the same dimension, the т components of which reflect the contributions of the respective input variables of DSS. Then the equation (9) takes the form: y uT h . We form the vector Y of dimension q from the output value y: Y [ y( 1 ), y( 2 ),..., y( q )]T and the matrix U, containing the measured inputs uj, corresponding to output value y from (9): 728 A. Goncharov and A. Torgashov 1 u1 ( 1 ) u1 ( 1 n1 1) u N ( 1 ) u N ( 1 n N 1) 1 u ( ) u ( n 1) u ( ) u ( n 1) U . 1 2 1 2 1 N 2 N 2 N 1 u1 ( q ) u1 ( q n1 1) u N ( q ) u N ( q n N 1) Then write the matrix equation: Y Uh . We introduce the error function: E Y Y Y Uh , where Y is the actual measurement of output, and minimize the objective function: Ψ E2 (Y Uh ) 2 . (10) The constraints on transient response components are written as: s min s s max , (11) where s s1 (1), ..., s1 (n1 ), ..., s N (1), ..., s N (nN ) т , s min s1min , ..., s min N т , s max s1max , ..., s max N . т The transient response components s are related with the components of the impulse response h by the relations: s j k i 1 h j (i), j 1, 2, , N , k 1, , n j . k (12) The constraints (11) are reduced to: ~ Ah sˆ , (13) where 1 0 0 1 1 0 h1 (1) , , ˆ s min h ( n ) 0 0 1 1 1 s . A ~ s max 1 0 0 h hN (1) 1 1 0 h (n ) N N 0 0 1 The sufficient minimum conditions are as follows: ~ ~ 1. Ah sˆ , A ACT h sˆ ACT Application of Active Set Method 729 2. Z T * UT ( Y Uh * ) 0 (14) 3. λ A ACT A TACT A 1 ACT U T ( Y Uh * ) , λ i 0 , i 1,, t 4. Z T UT U Z is positive definite, where h* - is the minimum point, the solution of problem (10) with constraints (13); subscript ACT indicates that in vector, matrix only the elements, rows with index numbers corresponding to the elements index numbers of active constraint in (13) are used; t - number of active constraints, λ - the vectors of Lagrange multipliers corresponding to the active constraints, Z - matrix the columns of which is basis of the feasible direction of search for equality constraints (13). The matrix Z is formed by the variable-reduction technique. [4]. In order to start the method of the active set it is necessary to determine the starting point (using a solution of the problem (10) without any constraints, with subsequent correction of coefficients h i that does not fall under the constraints (13)). The search algorithm of minimum point h* for iteration k is: 1. Performance verification of the stop conditions (reaching the performance errors of conditions (14), constraints on the number of iterations). 2. Selection of a logic branch. Does it make sense to remove any constraint of the set of active constraints list. The condition of performance of a condition 3 in (14) is checked. If the condition is not satisfied for some of the vector element, constraint is excluded from the list of active constraints and it is need to recalculate Z k . 3. The calculation of the search direction p k . Like (4) solves the problem min Y Uh k UZ k p Z . Calculate the non-zero 1 k 1 nk t k - dimensional 2 N vector p Z and the direction of search p k Z k p Z , where t k - the number of active constraints on k iteration. 4. Calculate the step length k . From A h Ψp ~ ~ sˆ diagonal matrix Ψ is k calculated. Calculated k min Ψ ii - a minimum non-negative available step from h k along p k , where i is the index number of element, which is not active constraints in (13) and is not element of opposite boundary values (11) for ~ consists from elements of p without first element. constraints in the active set, p k k The index j of minimum positive diagonal element of Ψ ii is remembered. If k 1 , then k 1 , otherwise k k . 5. Constraint is added to the list of active constraints. If k , then j constraint sˆ FR becomes active and recalculate Z k . 6. Recalculation approximation. The hk 1 hk k pk is calculated and return to step 1. 730 A. Goncharov and A. Torgashov 3 The influence of constraints of parameters of soft sensors model A priori knowledge about the process and industrial step test (when the value of the one control variables is changed at fixed others) allow to define the value of constraints. In order to investigate the influence of constraints on the quality of the obtained static soft sensor model we compare solutions of the equations (6) and constrained optimization problem (7). One and the same value of the ridge coefficient is used. The Fig. 1 and Table 1 show the results of the performance of the static models on the test sample for the bubble-point temperatures of the target product (BP) when a model obtained on the training sample, consisting of a number of measurements specified in the Table 1. In the verification sample of models the number of measurements is equal to 420. Table 1. Results of the performance of the static models The number of Without use With use measurements in the constraints constraints training sample R2 RMSE R2 RMSE 30 0,188 1,790 0,604 1,249 Fig. 1. Comparative study of static soft sensor models performance Application of Active Set Method 731 In order to investigate the influence of constraints on the quality of the obtained dynamic soft sensor model we compare solutions of the optimization problem (10) and optimization problem (10) with constrains (13). One and the same value of the ridge coefficient is used. The Fig. 2 and Table 2 show the results of the performance of the dynamic models on the test sample for the dew-point temperatures of the target product (DP) when a model obtained on the training sample, consisting of a number of measurements specified in the Table 2. In the verification sample of models the number of measurements is equal to 650. Table 2. Results of the performance of the dynamic models The number of Without use con- With use con- measurements in the straints straints training sample R2 RMSE R2 RMSE 700 0,327 1,456 0,634 1,074 Fig. 2. Comparative study of dynamic soft sensor models performance The estimation of improvements of the prediction quality by the criterion RMSE of static model obtained with the constraints on the parameters of SS is 100(1,79 - 1,249) / 1,79 30% compared to the model without constraints. The estimation of improvements of the prediction quality by the criterion RMSE of dynamic model obtained with the constraints on the parameters of SS is 100 · (1,456– 1,074)/ 1,456 25 % compared to the model without constraints. 732 A. Goncharov and A. Torgashov Conclusion The using the method of the active set, taking into account constraints on the model coefficients can improve quality of the evaluated SS models. The test of the proposed approach to solving the problem of obtaining a soft sensor model for industrial crude oil distillation unit is showed that the decrease root mean square error on the test sample can be not less than 25%. References 1. Bakhtagze. N.N.: Virtual Analyzers: Identification Approach. In: Automation and Remote Control. Vol. 65, issue 11, pp.1691-1709. (2004) 2. Draper N., Smith H.: Applied regression analysis. M .: Finance and Statistics (1986) 3. Bolshakov A.A., Karimov R.N.: Methods of processing of the multidimensional given and time numbers. M .: Hotline Telecom (2007) 4. Gill, P.E., Murray, W., and Wright, M.H.: Practical Optimization, London: Academic, 1981. Translated under the title Prakticheskaya optimizatsiya, Moscow: Mir (1985)