Identification of Parameters of the Basic Hydrophysical Characteristics of Soil Elena S. Zasukhina Dorodnicyn Computing Centre, FRC CSC RAS Vavilov st. 40, 119333 Moscow, Russia. elenazet2017@yandex.ru Sergey V. Zasukhin Moscow Institute of Physics and Technology 9 Institutskiy per., 141701, Dolgoprudny, Moscow Region, Russia s.zasukhin@yandex.ru Abstract The problem of determining parameters of the basic hydrophysical characteristics is studied. These parameters are defined by the type of the soil. To determine these parameters, a model of unsaturated water flow in porous media is considered. The modeled values of soil moisture at various depths are obtained as a result of solution of the initial boundary values problem for Richards equation. The parame- ters identification problem is stated as an optimal control problem. The objective function is mean-square deviation of simulated values of soil moisture at various points from some prescribed values. Discretized problem is proposed to solve by Marquardt-Levenberg method. 1 Introduction Models of water transfer in soils play an important role in modeling runoff in the catchment area.The hy- drophysical characteristics of soil included in these models are calculated, as a rule, by van Genuchten formulas [van Genuchten, 1980]. These formulas contain some parameters (VG-parameters). Their determination is not an easy task. The specification of the exact values of these parameters is of critical importance in modeling and predicting water flow and transfer of dissolved substances in the aeration zone. This problem has been studied by many authors. Various optimization methods were applied to obtain values of these parameters. Computer programs have been developed to determine these parameters. Here it should be noted the computer program RETC [van Genuchten et al., 1991], allowing among other things to determine them from the mea- sured function of water retention, as well as the program Rosetta [Scaap et al., 2001] which allows in particular to determine hydrological parameters with the help of pedotransfer functions obtained by neural networks. In [Pan & Wu, 1998], a hybrid algorithm based on simulated annealing was used in searching values of hydraulic Copyright ⃝ c by the paper’s authors. Copying permitted for private and academic purposes. In: Yu. G. Evtushenko, M. Yu. Khachay, O. V. Khamisov, Yu. A. Kochetov, V.U. Malkova, M.A. Posypkin (eds.): Proceedings of the OPTIMA-2017 Conference, Petrovac, Montenegro, 02-Oct-2017, published at http://ceur-ws.org 584 parameters. In [Takeshita, 1999], [Vrugt et al., 2001] the parameters were obtained by genetic algorithm. Over the past decades, to determine the parameters, many authors have applied the methods imitating the behavior of biological populations in conditions of lack of vital resources and migrating in order to find a place with favorable living conditions, algorithms imitating social behavior, see, for example, [Abbaspourt et al., 2001], [Yang & You, 2013]. In the present paper the parameter identification problem is stated as an optimal control problem in which the control is unknown parameters, and the objective function is mean-square deviation of calculated values of soil moisture at various depths from some prescribed values. Calculation of soil moisture is performed in according to the model of water transfer in soil. As a result of finite difference approximation, the problem is reduced to nonlinear programming problem. Numerical solution is obtained by Marquardt-Levenberg method. Jacobian of the moisture function is calculated by formulas of fast automatic differentiation [Aida-Zade & Evtushenko, 1989], [Griewank & Corliss, 1999], [Evtushenko, 1991], [Evtushenko, 1998]. 2 Problem Formulation Consider an one-dimensional model of vertical water transfer in soil. Suppose that soil is homogeneous isothermal non-deformable porous media. Under these assumptions vertical water transfer in soil is well described by one- dimensional nonlinear second order parabolic partial differential equation. Consider following initial boundary value problem: ( ) ∂θ ∂ ∂θ ∂K(θ) = D(θ) − , (z, t) ∈ Q, ∂t ∂z ∂z ∂z θ(z, 0) = φ(z), z ∈ (0, L), ( t) = ψ(t), t ∈)(0, T ), θ(L, (1) ∂θ − D(θ) − K(θ) = R(t) − E(t), t ∈ (0, T ), ∂z z=0 θmin ≤ θ(0, t) ≤ θmax , t ∈ (0, T ), where z is space variable; t is time; θ(z, t) is soil moisture at the point (z, t); Q = (0, L) × (0, T ); φ(z) and ψ(t) are given functions; D(θ) and K(θ) are diffusion coefficient and hydraulic conductivity – the hydrophysical characteristics of the soil; θmin = θr + ε and θmax = θs − ε, where θr and θs are, respectively, the residual moisture and the saturation moisture depending on the soil type, and ε is a constant such that 0 < ε ≪ θr ; R(t) is precipitation; E(t) is evaporation, 0 ≤ E(t) ≤ M , t ∈ (0, T ), M is a constant such that M > 0. The diffusion coefficient D(θ) and the hydraulic conductivity K(θ) appearing in this equation are found by the widely used van Genuchten formulas [van Genuchten, 1980] K(θ) = K0 S 0.5 [1 − (1 − S 1/m )m ]2 , 1−m (2) D(θ) = K0 S 0.5−1/m × [(1 − S 1/m )−m + (1 − S 1/m )m − 2], αm(θs − θr ) θ − θr where S = ; and K0 , α, m are some parameters. Described problem (1)-(2) will be called the direct θs − θr problem. Formulate the parameters identification problem. Let a function θ̂(z, t) be defined on some set Q0 ⊆ Q. Call this function θ̂(z, t) ”experimental data”. Introduce a set U = {u : u ∈ R3 ; 0 ≤ a[i] ≤ u[i] ≤ b[i], i = 1, 3}. Denote [K0 , α, m]T by u. The problem is to pick up the parameters K0 , α and m in such a way that corresponding solution of the direct problem (1)-(2) is as close as possible to the function θ̂(z, t) on the set Q0 . More precisely, the problem is to find uopt , uopt ∈ U , and corresponding solution θopt (z, t) of the direct problem (1)-(2) which minimize functional ∫ 1 J= (θ − θ̂)2 dzdt. 2 Q0 3 Discretization of the Direct Problem To approximate the direct problem (1)-(2 by finite differences we divide the intervals (0, T ) and (0, L) into N and I equal subintervals with the endpoints tn = τ n, 0 ≤ n ≤ N , and zi = hi, 0 ≤ i ≤ I, correspondingly, where τ = T /N h = L/I. Approximate the direct problem (1)-(2 by following finite differences scheme: 585 ( ) n+1 θi+1 − θi − θi−1 n+1 n+1 n+1 n+1 θin+1 − θin 1 n+1 θi = Di+1/2 − Ki+1/2 n+1 − Di−1/2 n+1 + Ki−1/2 , τ h h h 1 ≤ i < I; 0 ≤ n < N, θi0 = φi , 0 ≤ i ≤ I, θIn = ψ n , 1 ≤ n ≤ N. Here θin , Di+1/2 n n , Ki−1/2 are values of the functions θ(z, t), D(θ(z, t)), K(θ(z, t)) at the points (ih, nτ ), ((i + 1/2)h, nτ ), ((i − 1/2)h, nτ ), correspondingly. Approximate the left boundary condition in the form ( ) θ0n+1 − θ0n 2 n+1 n+1 θ1 − θ0n+1 = D1/2 − K1/2 + R n+1 n+1 −E n+1 , 0 ≤ n < N, τ h h where Rn+1 , E n+1 are values of functions R(t) and E(t) at the points t = (n + 1)τ . Thus, the discrete analog of the direct problem (1)-(2 has a form ( ) 1 2 n 2 n n 1 n−1 2 ( ) Φ0 = − n + D1/2 θ0n + D1/2 θ1 + θ0 + −K1/2 n + Rn − E n = 0, τ h h τ h θmin ≤ θ0n ≤ θmax , 1 ≤ n { ≤ N, 1 n 1 n 1 1 ( n )} Φi = 2 Di−1/2 θi−1 + 2 Di+1/2 θi+1 − n n n n + 2 Di+1/2 + Di−1/2 θin + { n−1h h } τ h θi 1( n ) (3) + + Ki−1/2 − Ki+1/2 n = 0, 1 ≤ i ≤ I − 1, 1 ≤ n ≤ N, τ h ΦnI = θIn − ψ n = 0, 1 ≤ n ≤ N, θi0 = φi , 0 ≤ i ≤ I. The diffusion coefficient and the hydraulic conductivity at the intermediate points appearing in formulas (3) are calculated by the formulas Din−1 + Di+1 n−1 Kin−1 + Ki+1 n−1 n Di+1/2 = , n Ki+1/2 = , 1 ≤ n ≤ N, 0 ≤ i < I. (4) 2 2 4 Discrete Optimal Control Problem Introduce a set Q0 = {(z, t) : z = ih, t = lτ, (i, l) ∈ A}, where A = {(i, l) : i = 0, 1, . . . , I, l = 1, . . . , d}, where 0 < d ≤ N , d is some natural number. Denote vector of desirable parameters by u, u ∈ U , where U = {u : u ∈ R3 , 0 < a[i] ≤ u[i] ≤ b[i], i = 1, 3}. Define the objective in the form 1 ∑ ( n )2 W (θ, u) = θj − θ̂jn τ h. (5) 2 (j,n)∈A The optimal control problem is to find optimal control uopt ∈ U and corresponding optimal solution θopt (z, t) of the direct problem (3)-(4) that minimize the objective function W (θ, u) (5). Earlier in ([Zasukhina & Zasukhin, 2017]), the problem of the identification of two parameters α and m was investigated. The problem was considered in the same formulation. In particular, the optimal control problem with the objective function (5) with various d was studied. Numerical solution of the problem was curried out using steepest descent method. Exact gradient of the objective function was calculated by formulas of fast automatic differentiation (FAD). As calculation experiments showed, for the values of d from 8 to 10, the obtained parameters α and m differ from their true values by 0.4% and 0.1% respectively. Here we tried to determine three parameters using steepest descent method. But these attempts have encoun- tered difficulties. The obtained parameters differed from the true ones very significantly. For this reason another algorithm of numerical optimization was applied. 586 Due to the form of the objective function, Levenberg–Marquardt algorithm [Levenberg, 1944], [Marquardt, 1963] of numerical optimization can be applied to the solution of the considered optimal control problem. This method is a combination of the Gauss-Newton algorithm with gradient descent method. And in this case, the exact values of the Jacobian of the soil moisture function [θ01 , θ11 , . . . , θI1 , . . . , θ0N , θ1N , . . . , θIN ]T is proposed also to be calculated using FAD method. 4.1 Optimization by Levenberg-Marquardt Algorithm We rewrite the objective function in the form ∑ ( )2 W (θ, u) = θjn − θ̂jn , (6) (j,n)∈A where A = {(i, l) : i = 0, 1, . . . , I, l = 1, . . . , d}, d is some natural number, d ≤ N . Denote [θ01 , θ11 , . . . , θI1 , . . . , θ0N , θ1N , . . . , θIN ]T and [θˆ01 , θˆ11 , . . . , θˆI1 , . . . , θˆ0N , θˆ1N , . . . , θˆIN ]T by Θ and Θ̂ respectively. According to the Levenberg–Marquardt optimization algorithm at each iteration step k, the displacement vector ∆(uk ) is determined from following system of equations: (J(uk )T J(uk ) + λdiag(J T (uk )J(uk ))∆uk = −J T (uk )(Θ − Θ̂), where J(uk ) is the Jacobian of the function Θ(uk ):  1 T ∂θ0 ∂θI1 ∂θ0N ∂θIN  ∂u1 . . . ... ...   ∂u1 ∂u1 ∂u1  J =  ... ... ... ... ... ... ...  . (7)  1  ∂θ0 ∂θI1 ∂θ0N ∂θIN 3 ... ... ... ∂u ∂u3 ∂u3 ∂u3 The parameter λ is positive and may be adjusted at the each iteration. The Jacobian of Θ(u) is proposed to be calculated using FAD formulas. 5 Fast Automatic Differentiation Method Fast automatic differentiation method allows to compute derivatives of complex functions whose variables are related by functional relationships. Briefly describe the essence of this method. Let for vectors z ∈ Rn and u ∈ Rr continuously differentiable functions F (z, u) and G(z, u) define mappings F : Rn × Rr −→ R1 and G : Rn × Rr −→ Rn . Let z and u satisfy the system of n scalar algebraic equations G(z, u) = 0n , (8) where 0n is zero n-dimensional vector. Suppose that the matrix GTz (z, u) is not singular. We denote the matrix transposed to the matrix Gz (z, u) by GTz (z, u). Then according to the implicit function theorem, the relations (8) define continuously differentiable function z = z(u). And according to FAD method, the gradient of the function F (z(u), u) is calculated by following formula: dF (z(u), u)/du = Fu (z(u), u) + GTu (z(u), u)p. (9) The vector p ∈ R from this formula is Lagrange multiplier which is determined as a result of the solution of n following linear system of equations : Fz (z(u), u) + GTz (z(u), u)p = 0n . (10) The system (10) is linear with respect to p and adjoint to the initial system (8). Thus, in accordance with the formulas (8)-(10), we obtain the relations for computing gradient of V = θin , i = 0, I, n = 1, N dV (θ(u), u)/du = Vu (θ(u), u) + ΦTu (θ(u), u)p, (11) Vθ (θ(u), u) + ΦTθ (θ(u), u)p = 0L , (12) where Φ = [Φ0 , Φ1 , . . . , ΦI , Φ0 , Φ1 , . . . , ΦI , . . . , Φ0 , Φ1 , . . . , ΦI ], p ∈ R is Lagrange multiplier, u ∈ U ⊂ R3 , T 1 1 1 2 2 2 N N N L θT = [θ01 , θ11 , . . . , θI1 , θ02 , θ12 , . . . , θI2 , . . . , θ0N , θ1N , . . . , θIN ], L = (I + 1)N . 587 6 Numerical Results Described approach was applied to finding the numerical solution of the discrete optimal control problem. The problem was solved with following values of input parameters: L = 100(cm), T = 1(d), θmin = 0.05(cm3 /cm3 ), θmax = 0.5(cm3 /cm3 ), φ(z) = 0.3, z ∈ (0, L), ψ(t) = 0.3, t ∈ (0, T ), a = [0, 0.005, 0.01]T , b = [300, 0.1, 0.5]T . The grid with I = 100 and N = 96 was used. The calculations were curried out in three stages. 6.1 The First Stage of Calculations At this stage, the direct problem (3)-(4) with the parameters K0true = 100(cm/d), αtrue = 0.01 and mtrue = 0.2 was solved. It is clear from the form of the system (3) and formulas for diffusion coefficient and hydraulic conductivity at intermediate points (4) that the system (3) can be split into N subsystems which of them corresponds to certain time layer. Each subsystem can be solved independently from others subsystems. For each such subsystem, the basic matrix is tridiagonal. Therefore, each subsystem was solved by tridiagonal matrix algorithm. Obtained solution was taken as a prescribed function θ̂(z, t). 6.2 The Second Stage of Calculations At the second stage the numerical solution of the optimal control problem was searched by steepest descent method. Exact gradient of the objective function (5) was calculated by formulas of FAD. Step value along the chosen direction was determined as a result of procedure of one-dimensional optimization along this direction of the function interpolating the objective function by means of splines constructed on 40 points. We considered the optimal control problems with the objective function (5), where d varied from 1 to 10. For all problems numerical optimization was curried out with various initial approximations. Each initial approximation differed one from another by value of the first component, namely initial approximations for α and m were equal to 0.03 and 0.13 correspondingly and K0init was equal to 102, 105, 110, 120, 150, 180, 200. Numerical calculations showed that for each initial approximation, the results improve slightly with increasing d from 1 to 10. At the same time, the deviation of the obtained values of the parameters αopt and mopt from the true values αtrue and mtrue depends on the initial approximation. So, with the change of K0init from 102 to 200, this deviation varies from 1.1% to 44.7% for α and – from 0.25% to 8.2% for m. As to the parameter K0 , its value does not practically change during the optimization process and remains very close to the initial value. And, the further the initial approximation of the parameter K0 from its true value, the smaller the difference between the initial value and the obtained value of the parameter K0 . This difference does not exceed 1.44·10−5 . Presumably, this inability of K0 to be optimized is due to the fact that the corresponding component of the gradient of the objective function differs from other components by 3-4 orders of magnitude. The values of the parameters K0opt , αopt and mopt obtained for various initial approximations are presented in Figure 1. Under various initial approximations, we mean that the value of K0init changes, while the values of αinit and minit remain unchanged. The graphs, following from left to right in Figure 1, refer to K0opt , αopt and mopt correspondingly. These graphs are designated by solid line. Dashed line shows true values of the parameters. 200 0.202 0.014 0.2 180 0.0135 0.198 0.013 0.196 160 0.194 0.0125 140 0.192 0.012 0.19 0.0115 120 0.188 0.011 0.186 100 0.0105 0.184 0.01 0.182 110 120 130 140 150 160 170 180 190 200 110 120 130 140 150 160 170 180 190 200 100 120 140 160 180 200 Figure 1: Dependencies of Obtained Parameters on Initial Approximation 588 Thus, these numerical calculations showed that the application of the steepest descent method to solving K0 , α, m identification problem does not lead to satisfactory results. 6.3 The Third Stage of Calculations As the numerical experiments at the second stage showed, to identify parameters K0 , α and m with good accuracy, another (not the steepest descent method) algorithm should be applied. Therefore, in order to solve the discrete optimal control problem, the Levenberg-Marquardt algorithm was applied. The objective function was defined by formula (6), where A = {(j, n) : j = 0, I, n = 1, . . . , d}. The cases of d = 2, 3, 4, 5, 6, 7, 8, 9, 10 were considered. At each iteration in determining the direction of the search, the exact derivatives of θjn , n = 1, N , j = 0, I, were calculated using FAD formulas (11)-(12). The process of the numerical optimization started with initial approximation K0init = 200, αinit = 0.03 and minit = 0.13. The parameter λ changed during the process of numerical optimization. At the beginning it was equal, as a rule, to 10−4 and then it was adjusted at each iteration. The process of the numerical optimization continued until the Chebyshev norm of the gradient of the objective function became less than 1.1·10−18 and value of the objective function became less than 2·10−24 . The results of the calculations are presented in following table: Table 1: Obtained Values of Parameters K0 , α and m d K0opt Error αopt Error mopt Error gradient function 2 100.21 0.21% 1.0011 ·10−2 0.11% 1.9995 ·10−1 0.03% 9.16·10−20 1.92·10−24 3 100.09 0.09% 1.0005 ·10−2 0.05% 1.9998 ·10−1 0.01% 1.08 ·10−18 1.02 ·10−25 4 100.09 0.09% 1.0005 ·10−2 0.05% 1.9998 ·10−1 0.01% 8.22 ·10−21 3.71 ·10−26 5 100.19 0.19% 1.0010 ·10−2 0.10% 1.9995 ·10−1 0.03% 6.41 ·10−21 6.78 ·10−26 6 100.06 0.06% 1.0003 ·10−2 0.03% 1.9998 ·10−1 0.01% 2.34 ·10−19 3.40 ·10−27 7 100.001 0.001% 1.00001 ·10−2 0.001% 1.999997 ·10−1 0.0002% 5.22 ·10−20 7.30 ·10−31 8 100.03 0.03% 1.0002 ·10−2 0.02% 1.99992 ·10−1 0.004% 2.33 ·10−19 2.21 ·10−28 9 100.008 0.008% 1.00004 ·10−2 0.004% 1.99998 ·10−1 0.001% 2.46 ·10−19 1.05 ·10−29 10 100.002 0.002% 1.000008 ·10−2 0.0008% 1.999996 ·10−1 0.0002% 1.38 ·10−19 2.25 ·10−31 It can be seen from Table 1 that optimal values of the parameters K0 , α and m are getting closer to their true values as d increases from 2 to 10. So, while d increases from 2 to 10, the deviation of K0opt from K0true decreases from 0.21% to 0.002%, the deviation of αopt from αtrue decreases from 0.11% to 0.0008%, and the deviation of mopt from mtrue decreases from 0.03% to 0.0002%. The time required to find the solution turned out to be approximately the same for all problems considered. The value of d defines the set where measured and calculated values of soil moisture are compared. Therefore, the choice of d defines initial data required for determining the parameters. Thus, analyzing results of the numerical calculation, we can estimate how choice of one or another set of initial data will influence on the accuracy of the solution obtained. Conclusion Analysis of the results of the numerical calculations leads us to following conclusion. • The application of the Levenberg-Marquardt method to solving parameters identification problem allows to determine these parameters with good accuracy. So, we can determine the parameter K0 with accuracy up to 0.002%, the parameter α – up to 0.0008% and the parameter mopt – up to 0.0002%. • The gradient method turned out to be ineffective in determining three parameters K0 , α and m. In partic- ular, the difficulties in solving this problem are due to the fact that one component of the gradient of the objective function differs from the other components by 3-4 orders of magnitude. It should be noted the disadvantage of the proposed approach: the Levenberg-Maquardt algorithm used in the process of numerical optimization is one of the local methods. And therefore, there is the question of the possibility of applying in this situation a method of global optimization, for example, of the well-known uneven coating method [Evtushenko & Posypkin, 2013]. 589 Acknowledgements This work was supported by Russian Fund of Fundamental Researches, Project 15-07-08952. References [van Genuchten, 1980] Van Genuchten, M. Th. (1980). A closed form equation for predicting the hydraulic conductivity of unsaturated soils. Soil. Sci. Soc. Am. J., 44, 892-898. [van Genuchten et al., 1991] Van Genuchten, M. Th., Leij, F. J., & Yates, S. R. (1991). The RETC code for quantifying the hydraulic functions of unsaturated soils,. U.S. Salinity Lab., USDA, ARS, Riverside, California. EPA Report 600/2-91/065. [Scaap et al., 2001] Scaap, M. G., Leij F .J., & van Genuchten, M. Th. (2001). Rosetta: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. J. Hydrol, 251, 163-176. [Pan & Wu, 1998] Pan, L. H., & Wu, L. S. (1998). A hybrid global optimization method for inverse estimation of hydraulic parameters: annealing-simplex method. Water Resour. Res, 34, 2261-2269. [Takeshita, 1999] Takeshita, Y. (1999). Parameter estimation of unsaturated soil hydraulic properties from tran- sient outflow experiments using genetic algorithms. In Proc. Int. Worksh.,Riverside, CA. 22-24 Oct. 1997, U.S. Salinity Lab., Riverside, CA. (pp. 761-768). New York: U.S. Salinity Lab., Riverside. [Vrugt et al., 2001] Vrugt, J. A., Weerts A. H., & Bouten, W. (2001). Information content of data for identifying soil hydraulic parameters from outflow experiments. Soil. Sci. Soc. Am. J., 65, 19-27. [Abbaspourt et al., 2001] Abbaspour, K. C., Schulin R., & van Genuchten, M.Th . (2001). Estimating unsatu- rated soil hydraulic parameters using ant colony optimization. Adv. Water Resour., 24, 827-841. [Yang & You, 2013] Yang, X., & You, X. (2013). Estimating parameters of van Genuchten model for soil water retention curve by intelligent algorithms. Appl. Math. Inf. Sci., 7(5), 1977-1983. [Aida-Zade & Evtushenko, 1989] Aida-Zade, K. R., & Evtushenko, Yu. G. (1989). Fast automatic differentiation on a computer. Mathematical Modeling, 1(1), 121-139. [Griewank & Corliss, 1999] A. Griewank & G. F. Corliss (Ed.) (1991). Automatic Differentiation of Algorithms. Theory, Implementation and Application. Philadelphia: SIAM. [Evtushenko, 1991] Evtushenko, Yu. (1991). Automatic differentiation viewed from optimal control theory. In A. Griewank & G. F. Corliss (Ed.). Automatic Differentiation of Algorithms. Theory, Implementation and Application. (pp. 25-30). Philadelphia: SIAM. [Evtushenko, 1998] Evtushenko, Yu. (1998). Computation of exact gradients in distributed dynamic systems. Optimization Methods and Software, 9, 45-75. [Zasukhina & Zasukhin, 2017] Zasukhina, E., & Zasukhin S. (2017). Opredelenie parametrov hidrologuicheskoy modeli. International Journal of Open Informational Technonologies, 5(2), 20–28. [Levenberg, 1944] Levenberg, K. (1944). A method for the solution of certain nonlinear problems in least squares. Quarterly of Applied Mathematics, 2, 164-168. [Marquardt, 1963] Marquardt, D. (1963). An algorithm for least-squares estimation of nonlinear parameters. SIAM Journal on Applied Mathematics, 11 (2), 431-441. [Evtushenko & Posypkin, 2013] Evtushenko, Yu., & Posypkin, M. (2013). A deterministic approach to global box-constrained optimization,. Optimization Letters, 7(4), 819-829. 590