Application of Data Assimilation Algorithms
     Based on Kalman Ensemble Filters for the
                  Lorenz Attractor

     Yulia S. Timoshenkova, Nikolay T. Safiullin, and Sergey V. Porshnev

        Yeltsin Ural Federal University, Yekaterinburg, Russian Federation
 julia.timoshenkova@urfu.ru n.t.safiullin@urfu.ru s.v.porshnev@urfu.ru


      Abstract. The paper describes application of the Data Assimilation
      methods based on the Kalman ensemble filtering to forecast the Lorentz
      system. The quality of the forecast and the influence of the method for
      the parameter choise on the result of forecasting the system was assessed.
      The paper shows the advantages and disadvantages of the chosen meth-
      ods of Data Assimilation. The recommendations for increasing accuracy
      of those methods are presented.

      Keywords: EnKF, LEnKF, Data Assimilation, Lorenz attractor, Kalman
      filter, Forecast.


1   Introduction
Correction of the forecast based on mathematical models with using new ob-
servations is an actual problem. This approach avoids growth of the forecast
confidence interval as its duration increases.
    One of the instruments for solving this task is the Data Assimilation (DA)
technique.
    There are areas of science, in which the forecast models are built not on a
one-dimensional time series generated by the multidimensional one, but on a
multidimensional underlying process itself.
    The use of methods of assimilation and correction of data ensures correct
prediction and modeling of the complex systems.
    Since the DA methods are based on the mathematical model of the system,
the choice of different methods of data assimilation can significantly affect the
final results of the forecast correction.
    There are no recommendations for the choice of a particular family of DA
methods because of their novelty [1]. Therefore, the problem of comparative anal-
ysis of data assimilation methods is actual. In order to compare those methods
and to test their accuracy one needs a standard suitable system to forecast.
    The Lorentz system was designed to construct a simplified model of atmo-
spheric convection for the long-term weather forecasting. Nowadays, the Lorenz
attractor is a standard model for testing the DA methods; thus, it is used as a
model to forecast in this work.
                                                                                  83

   The mathematical model of the Lorenz attractor is described by a system
consisting of three nonlinear ordinary differential equations that represent a finite
amplitude of convection [2]


                                dx
                                   = σ(y − x),
                                dt
                                dx
                                   = x(r − y) − y,                               (1)
                                dt
                                dx
                                   = xy − bz,
                                dt
where σ = ν/k is the Prandtl number, r = Ra /Rac is the normalized Rayleigh
number (normalized), b = 4/(1 + a2 ) is the geometric factor x s the convection
intensity, y is the temperature difference between ascending and descending wind
flows, z is the deviation of the vertical temperature profile from the linear one.
    The peculiarity of this system is that when the parameters are σ = 10, b =
8/3, and r ≥24.06, the behavior of the system becomes random. In the phase
space, the attractor has the topology of the tangle of trajectories, in which two
regions can be distinguished [2]. At any time point, the solution is in one of these
regions, and the transition of the system state to another region is unpredictable.
    The purpose of the work is to carry out a comparative analysis of applying
two DA methods, such as the Ensemble Kalman Filter and Local Ensemble
Transform Kalman Filter to the Lorenz attractor.


2   Mterials and Methods

In the Data Assimilation method, the original model is formulated in accordance
with the following system of equations

                               xk+1 = M (xk ) + wk ,
                                 yk = H(xk ) + vk ,                              (2)

where M is an operator or transition function that determines the evolution of
the system in time, xk is the known model state at the time point tk , xk+1 is the
model state at the next time point tk+1 , yk is the observation vector, H is the
observation operator that describes the multi-dimensional state of the system
based on the observation vector, wk is the model error, vk is the observation
error.
     In the DA method, the analysis step is based on the system current state xfk .
The corrected system forecasting state xak for the next time point is based on
xfk . For calculation of xak , only observation vector for the current time point is
used [3].

                           xak = xfk − K(yk − H(xfk )),                          (3)
84

where K is the transmission coefficient, yk is the observation, e = yk − H(xfk ) is
the error between the known observations and values obtained using the model.
    Various DA methods minimize error e using different techniques described
in this paper. The transmission coefficient is also called the Kalman coefficient.
There is a family of DA methods based on the Kalman filter (KF). One of those
methods is the Ensemble Kalman Filter (EnKF) [4].
    The EnKF is a variant of the generalized Kalman filter, in which the covari-
ance of the prediction errors is estimated using the ensemble of forecasts [5]. The
Kalman ensemble filtration methods are widely used to assimilate observations
in dynamic models.
    The EnKF algorithm is implemented by the following sequence of steps.
    1. Forming an ensemble of the initial data at the initial time point with the
help of a mathematical model.
    2. Calculation of N predictions for ensemble of the initial data in order to
obtain the values of the observed parameters at the time point for the state
vector obtained on the previous step. Next, the method of successive corrections
for the model is used.
    3. Calculation of the covariance error matrices for the forecast correction.
The step consists of the following items: formation of a covariance error matrix
of the forecast vector; calculation of the Kalman weight operator K; update
of the ensemble model values in accordance with the weight operator and the
observation vector; update of the covariance matrix of the forecast errors in
accordance with the specified ensemble; calculation of value of the analyzed
components and the covariance matrix of the analysis errors.
    4. Steps 2 and 3 are executed sequentially for each time point with each new
observation obtained.
    The Local Ensemble Filters use localization of the observational error co-
variance, which takes into account only those observations that are located in a
certain region around the desired point. The main idea of the Local Ensemble
Transform Kalman Filter (LETKF or LEnKF) is to perform calculations not in
the physical space of the model, but in the space of the ensemble. Typically, the
ensemble space is less than physical one of the model [6].
    The LEnKF algorithm is implemented by the following sequence of steps.
    1. Formation of an ensemble of model data of size N at the initial time point
with the help of a mathematical model.
    2. Calculation of the ensemble forecast of the system state at the next instant
with the help of the model equation and the model state vector at the previous
time point.
    3. Creating a local plane in the state space, and constructing the projection
of each point of the ensemble of the system state onto this plane.
    4. For each state vector in the ensemble obtained on step 2, calculate the
distance from the ensemble average to the current state vector and project this
distance into the low-dimensional state subspace that represents the ensemble
in that region in the best way.
                                                                              85

    5. Perform the assimilation of data in each of the local low-dimensional sub-
spaces, obtaining the average analysis value and covariance of the state error in
each local region according to the EnKF.
    6. Obtaining the necessary local analysis of the state ensembles using the
mean value of the local analysis and covariance.
    7. Formation of a new global analysis ensemble using the local analyses ob-
tained on step 6.
    8. Repeat steps 1 6 until the observations are complete.


3   Results

During the study, the standard deviations of the Lorenz attractor were taken
equal to 0.5, 1, and 5, which corresponds to 4.5% 8.8% and 44% of the coordinate
amplitude of the Lorentz system. These values may seem small, but the standard
deviations greater than 50% significantly affect the system and generate very
inaccurate observations for all three components of the system. The deviation
values of the observed states are taken to be from 0.01 to 20, which correspond
to 0.08% to 176% of the Lorentz system coordinate amplitude, or approximately
from 60 dB to –5 dB in the ratio of the real coordinate to the error of its
observation.
    Evaluation of the noise influence on the result of the algorithms operation
was carried out for several simulation parameters. To analyse the quality of the
algorithm, the mean square error (MSE) parameter is calculated. Results of the
MSE parameter estimations are shown in Fig. 1.


Fig. 1. Dependence of the MSE prognosis on the variance of the observation error
dispersion
86

    Fig. 1 shows that the choice of the DA method mainly influences on the
MSE prognosis. The use of the EnKF gives the best result. For the experiment,
in which the highest values of the observation noise dispersion are taken, the
worst results of the simulation are obtained. This confirms the theory that the
observation error greatly influences the obtained forecast.
    It should be noted that the results of the experiment are also influenced by
the fact that all components of the attractor are related to each other; that is,
presence of errors in one of the coordinates affects the result of the forecast of
each other component.
    Fig. 2 shows the MSE dependence on the number of ensembles for the EnKF
and LEnKF filters. As the number of filter ensembles grows, the resulting value
of the MSE decreases.
    Under the minimum error in the covariance of observations, the effect of the
observation noise variance on the MSE is minimal, as seen in Fig. 1. However,
with an increase of the observation error variance and model error, the MSE value
becomes approximately equal for all DA methods. This dependence suggests that
with increasing the noise variance to achieve an acceptable MSE, it is necessary
to increase the number of ensembles in the KF.


Fig. 2. Dependence of the MSE prognosis on the number of filter ensembles for the
component x for the EnKF and LEnKF


   Based on the obtained results, it can be concluded that when the MSE of the
observation error is less than 11, the EnKF should be used to obtain a smaller
MSE. When the standard deviation of the observation error is greater than 11,
or more than 100% of the system coordinate amplitude, the best smaller MSE
are given by the ensemble filters. In the case of using the ensemble filters, the
value of the MSE ceases to change significantly with the number of ensembles
                                                                                      87

larger than 100. Thus, when using the EnKF, the optimum number of ensembles
is 100 for predicting the behavior of the Lorenz attractor.


4    Conclusion
During the research of the capabilities of under applying the ensemble Kalman
filter methods to perform prediction of the coordinates of the Lorenz attractor,
the following practical results were obtained.
     The increase in variance of the observation error (greater than 70%) signifi-
cantly affects the MSE prediction. Accuracy of the prediction falls also with the
increase in the observation error, as the obtained results shown.
     The result of forecast is also significantly influenced by the choice of the filter
type. To obtain the minimum MSE, it is recommended to use the LEnKF. A
small MSE (lower than 10) of the EnKF can be explained by the fact that the
system is modeled several times that averages the error of the modeled chaotic
system.
     A significant increase in MSE (up to 2–3 times) occurs when the value of
the standard deviation of system errors increases by larger than 50%. As a con-
sequence, for high deviation values (higher than 100% of the basic amplitude),
the largest possible number of ensembles in the EnKF and LEnKF should be
used. But after the value of 100, the value of the MSE ceases to change signifi-
cantly. Thus it is recommended to use the value of 100 ensembles in the EnKF
and LEnKF methods, as a balance between accuracy and speed of the Data
Assimilation.
     Compared to the simple Kalman Filtering, for higher observational errors, the
methods EnKF and LEnKF provide a better accuracy (up to 3 times for small
deviations). For observational errors below 40%, the LEnKF performs better in
accuracy than the EnKF. But with higher observational errors their accuracy is
on the same level. Thus, the choice between those two methods must be made
based on their performance and memory consumption.


References
1. P. Leeuwen, S. Vetra-Carvalho, SANGOMA: Stochastic Assimilation for the Next
   Generation Ocean Model Applications, Leeuwen Vetra-Carvalho, (2012)
2. W. Van den Bossche, Data assimilation toolbox for Matlab., (2012)
3. G. Evensen, Data assimilation: the ensemble Kalman filter. Springer Science &
   Business Media, 2009.
4. Y. Timoshenkova, N. Safiullin, On the possibility of the forecast correction for inac-
   curate observations based on data assimilation,, Dynamics of Systems, Mechanisms
   and Machines (Dynamics), 1–4, (2017). DOI: 10.1109/Dynamics.2017.8239519
5. M. Bocquet and . des P. P. CEREA, Introduction to the principles and methods of
   data assimilation in geosciences, Notes Cours c. Ponts ParisTech, (2014)
6. P. L. Houtekamer, H. L. Mitchell, Data assimilation using an ensemble Kalman filter
   technique, Mon. Weather Rev., Vol. 126, No. 3, 796–811, (1998)