<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The ensemble algorithm for estimation of model parameters in the problem of assessing greenhouse gases fluxes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ekaterina G. Klimova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal Research Center for Information and Computational Technologies</institution>
          ,
          <addr-line>Novosibirsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>288</fpage>
      <lpage>295</lpage>
      <abstract>
        <p>The problem of assessing the greenhouse gases fluxes from the Earth's surface based on observations is currently very urgent. To solve it, it is customary to use data assimilation systems (or a more general concept - inverse modeling), which include the observations on the concentration of greenhouse gases and models of the transport and difusion. Since such problems involve large volumes of satellite data and the global model of transport and difusion, it has a huge dimension. For this reason, the development of efective algorithms to enable the practical implementation of the task is required. The paper discusses data assimilation algorithms based on the ensemble Kalman filter and ensemble Kalman smoothing, which can be used to solve the problem of estimating greenhouse gases fluxes. Economical algorithms for estimating a parameter that is constant over a given time interval are proposed.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Data assimilation</kwd>
        <kwd>greenhouse gases</kwd>
        <kwd>fluxes</kwd>
        <kwd>ensemble Kalman smoother</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Assessment of the state of the environment based on observational data is one of the most
urgent tasks at the present time. This assessment is performed using forecast models based on
data assimilation systems. The data assimilation problem is the problem of the joint accounting
of data and a mathematical model for the most accurate assessment of the spatial-temporal
distribution of the used variables. The study of the changes in space and time of greenhouse
gases, such as CO2 and CH4, as well as the assessment of fluxes from the Earth’s surface of
these gases is one of the urgent tasks of monitoring the state of the environment. To solve this
problem, it is customary to use data assimilation systems that include observational data and
a mathematical model of the transport of gases in the atmosphere. One of the approaches to
assessing greenhouse gases fluxes is an approach based on the assumption that the fluxes are
constant in a given subdomain and on a given time interval (on the order of a week). This is
due to both the need for an eficient implementation of the algorithm and the properties of the
observational data used in such problems.</p>
      <p>
        This paper discusses an algorithm for estimating a constant in time and space greenhouse
gases fluxes by the observations from a given time interval. The proposed algorithm is a
variant of the ensemble Kalman smoothing [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. An approach based on the use of eficient
local algorithms based on previously developed ensemble filtering and smoothing algorithms
is considered [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
        ]. The results of model numerical experiments with a one-dimensional
transport and difusion model are presented.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Ensemble algorithms for estimating greenhouse gases fluxes</title>
      <p>
        There is a lot of papers devoted to estimating greenhouse gases fluxes using data assimilation
procedures. All these works use an approach called “inverse modeling” [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Let’s consider a series of works carried out by a team of authors in which the ensemble
Kalman filter is used [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ]. In these works, the estimated variable is the greenhouse
gases flux from the Earth’s surface. The Earth’s surface is divided into squares of equal area
(1000× 1000 km) and it is assumed that it is required to estimate the mean flux value over the
subdomain. In addition, the average value over a given time period is estimated. The estimation
of the values of the averaged fluxes  over the subdomains from the observational data 0 for
a given time interval and forecast  is carried out according to the standard formulas of the
Kalman filter [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
      </p>
      <p>=  +  [︁0 − ( )]︁ ,
 =    (︁    + ︁) − 1
where  ,  — vectors of analysis and forecast values at grid nodes; 0 — vector of observations;
 — operator that translates forecast values into observations;   ,  — covariance matrices
of forecast (preliminary estimate) and observations errors.</p>
      <p>It is assumed that the forecast step of the algorithm has the form</p>
      <p>+1 = ,
where  is the time step number.</p>
      <p>
        To implement the ensemble Kalman filter, an ensemble of perturbations of the estimated
parameter is specified [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ]:
      </p>
      <p>Then the matrix   is estimated by the ensemble
∆  = √1</p>
      <p>[∆ 1, . . . , ∆ ] ,
  = ∆  (∆  ) ,
 = ∆  (∆ ) [︀ ∆ (∆ ) + ]︀ − 1 ,</p>
      <p>∆  = ( − ∆  ) − ( ),
where operator  includes model forecast at the time of observation, interpolation from grid
nodes to observation points, and, in the case of satellite data, vertical averaging with known
coeficients (“average kernel”),  is the number of ensemble elements.</p>
      <p>
        The period of time over which observations are used is called the assimilation window. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
the assimilation window, consisting of 12 cycles of 8 days, is considered, and the average values
of the fluxes over 8 days are estimated.
      </p>
      <p>
        From the point of view of the mathematical formulation of the problem, we highlight the
following points:
1) the fluxes are assessed without specifying the concentrations; the solution of the problem
in this formulation will not be optimal;
2) since the operator  includes a mathematical model of impurity transport and difusion,
this problem is a smoothing problem, not filtering [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], an approach to the implementation of the algorithm based on the use of a
transformation matrix is considered. In this case, computational dificulties arise associated with
the large dimension of the matrices under consideration. From the point of view of practical
implementation, to solve this problem, it is required to specify a matrix  — the covariance
matrix of the observation errors and the initial values of the fluxes.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Statistical optimization</title>
      <p>Let’s consider the problem of estimating a parameter which is constant in time, from observations
on a given time interval. Let the concentration value 0 in a given region at the time 0 and
a preliminary estimate of the parameter at a given time interval  are known. There are
observations on the time interval [0, ,  ]. It is required to estimate the values according to
the observations.</p>
      <p>If we consider the finite-diference analogue of the transport-difusion equation in operator
form +1 =  +  , where  is the time step number, 0 is the concentration value, and
 are the greenhouse gases fluxes from the Earth’s surface, then</p>
      <p>− 1 
+1 = ∏︁ 0 + ∑︁ ∏︁  = 10 + 2 = ˜  ,
=1</p>
      <p>=1 =1
where ˜  = (1, 2). Let the observations at the time  be related to the “true” value  
using the operator  :</p>
      <p>0 =    + 0.</p>
      <p>Observations can be represented as</p>
      <p>0 =  (︀ 10 + 2 )︀ + 0 = 0˜   + 0,
where 0 — interpolation to the observation point and averaging along the vertical (in the
case of satellite data, this is the “average kernel”), 0 is the random observation error with zero
mean value and covariance matrix .</p>
      <p>
        Let  = [0, . . . ,  ] — observations over the entire time interval, ˜ = [0, . . . ,  ] —
“generalized” observation operator:  = ˜   + 0, 0 = [︀ 00, . . . , 0 ]︀ — random observation
errors. As in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we will seek an estimate based on the minimum of the functional
 [ ] = ( −   ) 1( −   ) + ( − ˜   ) 2( − ˜   ),
where   — preliminary estimate (forecast), −1 1 and −2 1 are the covariance matrixes of
forecast and observation errors.
      </p>
      <p>
        Let’s consider an ensemble approach to solving this problem. We will assume that an
ensemble of forecasts {  ,  = 1, . . . , } is given. Then the ensemble of analysis values
{  ,  = 1, . . . , } has the form [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
      </p>
      <p>1
  =  +− 1  ︁( ˜  
︁) [︂</p>
      <p>1
− 1
˜  ︁( ˜  
︁) 
+˜︂] − 1(︁ ˜ +0 − ˜   )︁ , (1)

where   = {︁ 1 , . . . ,  }︁,   =   −  ¯  ,  ¯  = ∑︀   /, {0 ,  = 1, . . . , } —
=1
ensemble of random perturbations of observations corresponding to the covariance matrix ˜.
The following relation is used for the estimation ˜   :</p>
      <p>˜ (  ) = ˜ (  +   ) − ˜ (  ).</p>
      <p>
        The implementation of the vector   estimation algorithm according to formula (1) requires
the calculation of the inverse matrix of high dimension. For more eficient computations,
an algorithm for the implementation of the stochastic ensemble Kalman filter (ensemble 
algorithm) [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] can be applied. The condition for applying the ensemble  -algorithm is the
fulfillment of the relation between the matrices of the Kalman filter  = ( − ˜ ) , where
 and  — covariance matrices of forecast (first guess) and analysis (estimate) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. From
formula (1), one can obtain the ratio for deviations from the mean value of the ensemble of
analyzes:   =   + (0 − ˜   ). Hence, we can conclude that the required relation
between the matrices is satisfied if the errors of observation and the forecast do not correlate. In
this case, you can use the relation  =  ˜  ˜− 1 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and the formulas of the  -algorithm
can be applied [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>The algorithm can be used to estimate only the parameter  (if it is assumed that 0 is given).
In the case when 0 is known, we get</p>
      <p>( ) =  [︁˜ (0,  +  ) − ˜ (0,  )]︁ .</p>
      <p>
        The ensemble  -algorithm is a stochastic Kalman filter [
        <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
        ] in which the analysis step is
performed only for the ensemble mean:
 ¯  =  ¯  +
      </p>
      <p>1
 − 1
  ︁( ˜   ︁)  ˜− 1 [︁ − ˜  ¯  ]︁ .
(2)</p>
      <p>
        The ensemble of analysis errors   — matrix of dimension ( × ), columns of which
are vectors of dimension  {  ,  = 1, . . . , }, is obtained by transforming the ensemble of
forecast errors   — a matrix with columns {  ,  = 1, . . . , } :   = ( + Π )− 1 
while
 =
  ˜  ˜− 1(˜   + ) = 1 + 2.
where  — matrix whose columns are equal to the vector 0 — the ensemble of observation
errors,  — the identity matrix. More detailed calculations are given in [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>
        The elements of matrix Π are calculated for given matrices ˜ and ˜ over an ensemble of
forecast errors   and do not depend on the grid node. This makes it possible to implement
the algorithm locally. An eficient method for implementing the ensemble  -algorithm was
proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>As can be seen from the formulas of the ensemble  -algorithm, in order to use it in the case
of the implementation of the above method, the following steps must be taken.
1. Forecast by values  = (0,  ) from time 0 to a time  .
2. Calculation of residuals ( − ˜   ).
3. Calculation of ˜   = {︁1  , . . . ,    }︁.
4. Calculation of matrix  of  -algorithm. Calculation can be carried out sequentially in time:

  ˜  ˜− 1˜   = ∑︀ ( ())  − 1 (),   = { 1 , . . . ,   }.</p>
      <p>=1
5. Calculation of ˜   .</p>
      <p>6. Estimation of  = (0,  ) by the formula (2).</p>
    </sec>
    <sec id="sec-4">
      <title>4. Numerical experiments with a one-dimensional transport and difusion model</title>
      <p>With the proposed algorithm based on the method of statistical optimization, numerical
experiments with a 1-dimensional model of the transport and difusion of a passive impurity were
carried out. The following equation was considered:
˜

+ 
˜ = 2 2˜
 2 + ˜(, ),
where ˜ — the predicted variable, ˜(, ) is an unknown source of passive impurity. To solve
the equation, the semi-Lagrangian method was used, with an implicit scheme in time and a
scheme of central diferences in space. To solve the finite-diference analogue of the difusion
equation, the cyclic sweep method was used. The equation was solved on the space interval
(0, 1), while the periodic boundary conditions were considered. 240 grid points were set,  = 1,
2 = 0.6 × 10− 3.</p>
      <p>Let consider a finite-diference analogue of this equation in the form</p>
      <p>+1 =  +  ,
where  — linear operator,  — the time step number.</p>
      <p>The following numerical experiments were carried out with model data. The given initial
values 0 ,  0 were considered “true”. To obtain the initial data 0,  0 for forecasting by
the model, a disturbance was added to the “true” initial data 0 = 0 +  ,  ∼  (0, 0),
 0 =  0 +   ,   ∼  (0, 0).  (, ) denotes a random variable distributed according to
the normal law with a mathematical expectation equal to  and variance equal to .</p>
      <p>
        To organize numerical experiments, the following were set: an ensemble of initial fields: 0 =
0 +  ,   ∼  (0, 0),  = 1, . . . , ;  0 =  0 +  ,   ∼  (0,  0),  = 1, . . . , ;
observations 0 = 0 +  0;  0 ∼  (0, 0); ensemble of observations with perturbations
0 = 0 +  0,  0 ∼  (0, 0),  = 1, . . . , . Through  denotes the number of
elements of the ensemble. The observations were considered to be known throughout the
integration area. In all numerical experiments  = 20 was considered. In the analysis at grid
node , observational data from the interval ( − ,  + ) were taken. In this case, in the
analysis at the grid node , instead of the matrix , we took the matrix ′ =  ∘ − 0.5(/)2 ,
where  — distance between grid node and observation, “∘ ” — element-wise multiplication
sign. In the experiments, we took the values  = 5,  = 5∆  (∆  — the grid step). This
algorithm is called -localization. It is commonly used in ensemble methods to suppress
spurious covariances at large distances due to the small size of the sample (ensemble) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        Numerical experiments were carried out to estimate the average fluxes over a given time
interval. For this, the “true” values of the fluxes were set, as well as the initial estimate of the
lfuxes (both equal to zero and nonzero). The forecast ensembles were modeled for a given time
period with perturbed flux values. In this case, the ensemble  -algorithm was implemented.
The application of the ensemble  -algorithm in the problem of transport and difusion of a
passive impurity is described in detail in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>The following numerical experiments were carried out. In numerical experiments, the
“assimilation window”  = 10 time steps was used. To estimate the parameter values on a time
interval {, . . . , +}, we used the observations of this time interval. Numerical experiments
were carried out for the values: 0 = 0 = 0.01,  0 = 0.1,  = 40.</p>
      <p>In the formulation of the parameter estimation problem, it is assumed that the parameter
does not change at the forecasting step. In the experiments, the “true” value of the parameter
was taken constant in time, namely, it was set in the form of a discrete analogue of the function
 0(), where
 0() =
︂{ 0.1, if 0.375 ≤  ≤ 0.625,</p>
      <p>0, else.</p>
      <p>Two series of numerical experiments were carried out. In the first, 0 was considered given,
the ensemble { } was modeled. In the second series, an ensemble of initial values was set {0 }.
a
b
In this case, the ensemble of deviations from the mean corresponds to the deviation of the
estimate from “true”. It should be noted that this did not afect the result, since the residual
values in (2) are kept the same in both cases. In both cases, the root-mean-square error of
the estimate (the diference between “true” value and estimated value) was 0.009. The results
of the first series of experiments are shown in Figure 1. Figure 1, a shows the “true” value
of the time-averaged emission, Figure 1, b shows the restoration of the value after applying
the procedure described in Section 3. In this case, at the initial moment of time, the value of
the estimated parameter was set equal to zero. As can be seen from the figures, the proposed
algorithm makes it possible to estimate the flux of a passive impurity based on observations
and forecast using the transport-difusion model even if the information about it at the initial
moment is absent.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>The task of assessing the fluxes of greenhouse gases from the Earth’s surface is currently being
solved using data assimilation systems. In this case, models of the transport and difusion of
a passive impurity in the atmosphere and meteorological fields of wind speed, temperature,
etc. are used. The ensemble Kalman filter and ensemble Kalman smoother are increasingly
used as a mathematical formulation of the problem. The article discusses an algorithm for
estimating time-averaged values of greenhouse gases fluxes based on the ensemble approach
and the theory of statistical optimization. The algorithm is economical and can be implemented
locally in each given subdomain.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Evensen</surname>
            <given-names>G.</given-names>
          </string-name>
          <article-title>Data assimilation. The ensemble Kalman filter</article-title>
          . Berlin Heideberg: SprigerVerlag,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Klimova</surname>
            <given-names>E.</given-names>
          </string-name>
          <article-title>A suboptimal data assimilation algorithm based on the ensemble Kalman filter //</article-title>
          <source>Quarterly Journal of the Royal Meteorological Society</source>
          .
          <year>2012</year>
          . Vol.
          <volume>138</volume>
          . P.
          <year>2079</year>
          -
          <fpage>2085</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Klimova</surname>
            <given-names>E.G.</given-names>
          </string-name>
          <article-title>The Kalman stochastic ensemble filter with transformation of perturbation ensemble // Numerical Analysis</article-title>
          and
          <source>Applications</source>
          .
          <year>2019</year>
          . Vol.
          <volume>12</volume>
          . No. 1. P.
          <volume>26</volume>
          -
          <fpage>36</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Klimova</surname>
            <given-names>E.G.</given-names>
          </string-name>
          <article-title>Bayesian approach to data assimilation based on ensembles of forecasts</article-title>
          and observations // IOP Conf. Series: Earth and
          <string-name>
            <given-names>Environmental</given-names>
            <surname>Science</surname>
          </string-name>
          .
          <year>2019</year>
          . DOI:
          <volume>10</volume>
          .1088/
          <fpage>1755</fpage>
          - 1315/386/1/012038.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Klimova</surname>
            <given-names>E.G.</given-names>
          </string-name>
          <article-title>An eficient algorithm for stochastic ensemble smoothing // Siberian J</article-title>
          . Num. Math.
          <year>2020</year>
          . Vol.
          <volume>23</volume>
          . No. 4. P.
          <volume>381</volume>
          -
          <fpage>393</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Nakamura</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            <given-names>R</given-names>
          </string-name>
          . Inverse modeling // IOP Publishing.
          <year>2015</year>
          . DOI:
          <volume>10</volume>
          .1088/978-0-
          <fpage>7503</fpage>
          -1218-9.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Feng</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmer</surname>
            <given-names>P.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosch</surname>
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dance</surname>
            <given-names>S.</given-names>
          </string-name>
          <article-title>Estimating surface CO2 fluxes from space-borne CO2 dry air mole fraction observations using an ensemble Kalman filter // Atmospheric Chemistry</article-title>
          and Physics.
          <year>2009</year>
          . Vol.
          <volume>9</volume>
          . P.
          <volume>2619</volume>
          -
          <fpage>2633</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Feng</surname>
            <given-names>L.</given-names>
          </string-name>
          et al.
          <article-title>Evaluating a 3-D transport model of atmospheric CO2 using ground-based, aircraft, and space-borne data // Atmospheric Chemistry</article-title>
          and Physics.
          <year>2011</year>
          . Vol.
          <volume>11</volume>
          . P.
          <volume>2789</volume>
          -
          <fpage>2803</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Feng</surname>
            <given-names>L.</given-names>
          </string-name>
          et al.
          <article-title>Estimates of European uptake of CO2 inferred from GOSAT X2 retrievals: Sensitivity to measurement bias inside and</article-title>
          outside Europe // Atmospheric Chemistry and Physics.
          <year>2016</year>
          . Vol.
          <volume>16</volume>
          . P.
          <volume>1289</volume>
          -
          <fpage>1302</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Feng</surname>
            <given-names>L.</given-names>
          </string-name>
          et al.
          <article-title>Consistent regional fluxes of CH 4 and CO2 inferred from GOSAT proxy XCH4:XCO2 retrievals,</article-title>
          <year>2010</year>
          -2014 // Atmospheric Chemistry and Physics.
          <year>2017</year>
          . Vol.
          <volume>17</volume>
          . P.
          <volume>4781</volume>
          -
          <fpage>4797</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Jazwinski</surname>
            <given-names>A.H.</given-names>
          </string-name>
          <article-title>Stochastic processes and filtering theory</article-title>
          . N.Y.: Academic Press,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Houtekamer</surname>
            <given-names>H.L.</given-names>
          </string-name>
          , Zhang F.
          <article-title>Review of the ensemble Kalman filter for atmospheric data assimilation // Monthly Weather Review</article-title>
          .
          <year>2016</year>
          . Vol.
          <volume>144</volume>
          . P.
          <volume>4489</volume>
          -
          <fpage>4532</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>