<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.18287/1613-0073-2016-1638-428-436</article-id>
      <title-group>
        <article-title>NDVI TIME SERIES MODELING IN THE PROBLEM OF CROP IDENTIFICATION BY SATELLITE IMAGES</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>N.S. Vorobiova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A.V. Chernov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Samara National Research University, Samara, Russia Image Processing Systems Institute - Branch of the Federal Scientific Research Centre “Crystallography and Photonics” of Russian Academy of Sciences</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>428</fpage>
      <lpage>436</lpage>
      <abstract>
        <p>The paper deals with NDVI time series modeling and the application of simulated data to the task of crop identification from satellite images. The simulation is performed for six types (classes) of crops in each agricultural zone of the Samara region. Simulation parameters for each class are calculated from the coefficients obtained by approximating the time series of real agricultural fields with a function of a specific form. The generated sets of simulated time series are used for crop recognition on real fields located in the Samara region.</p>
      </abstract>
      <kwd-group>
        <kwd>time series</kwd>
        <kwd>vegetation index</kwd>
        <kwd>NDVI</kwd>
        <kwd>satellite images</kwd>
        <kwd>crops identification</kwd>
        <kwd>crops recognition</kwd>
        <kwd>algorithm for calculating estimates</kwd>
        <kwd>time series approximation</kwd>
        <kwd>time series modeling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Crop identification from satellite images is an important application of remote
sensing data in agriculture [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. A wide variety of methods exists for solving this task [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ]. Here we consider the methods that use time series of vegetation indices
constructed from a set of satellite images [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Such methods most often use images of
low spatial resolution but high temporal resolution (acquired at least once a day), which
makes them suitable for operational monitoring of large areas. They are
used in regional geoinformation systems of the agroindustrial complex (hereinafter GIS
AIC) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] for the control of agricultural land.
      </p>
      <p>
        The quality of any recognition method is characterized by the probabilities of correct
classification [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. For crop identification from time series, these quality criteria
depend directly on the following factors: the date of
classification (the longer the time series, the higher the probability of correct classification),
the training sample size, and the absence of errors in the training sample.
      </p>
      <p>
        A typical situation for a regional GIS AIC is that farmers declare the
crops sown on their fields, and some of the declared data contain significant errors or
false reports; sometimes the information arrives too late, close to the end of the growing
season or after it. Therefore, such declared data cannot be used as a
sample for evaluating the quality of classification, but only for a general estimate of the
distribution in feature space. In this paper we propose a method for time series
modeling and investigate the possibility of using simulated time series as a training sample
for crop identification.
      </p>
      <p>
        This work uses the boundaries of real fields and information about the crops sown on them
for the years 2011-2015. All real fields are located in three agricultural
zones (hereinafter zones) of the Samara region: northern, central and southern.
Time series modeling and generation and the crop recognition procedures are carried out
separately for each agricultural zone, each of which is homogeneous in climatic conditions.
      </p>
      <p>
        Crops are partitioned into the following classes: winter crops,
early spring crops, late spring crops, fallows, perennial grasses, and unused lands. The
most popular vegetation index, NDVI [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], is used for time series calculation.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Time series modeling</title>
      <p>We represent a time series as a mixture of a useful (ideal) signal and additive noise, with
gaps in the daily observations due to cloud cover. To use the time series as a
training set, we simulate only the useful signal, since noise and gaps
degrade the classification quality. This raises the question of which shape to choose
for the useful signal.</p>
      <p>
        Commonly used methods of time series reconstruction approximate the time
series by functions such as the asymmetric Gaussian [
        <xref ref-type="bibr" rid="ref8 ref9">8,9</xref>
        ] or the double logistic [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
These functions and polynomials of different forms were compared in terms of approximation
quality, flexibility and stability. A detailed description of the comparison results
is beyond the scope of this paper. As a result, the following function is proposed
for modeling the useful component of a time series (we call it the ideal curve):
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math><![CDATA[F(x) = (x - a)(x - b)\,P_n(x) + k x + e,]]></tex-math>
        </disp-formula>
where <italic>P<sub>n</sub></italic>(<italic>x</italic>) is a linear combination of Legendre polynomials up to degree <italic>n</italic>,
<italic>a</italic> is the earliest date on which at least one object of the class in this zone has an observation, and
<italic>b</italic> is the latest date on which at least one object of the class in this zone still has an observation.
The coefficients <italic>k</italic>, <italic>e</italic> are obtained from the conditions
        <disp-formula id="eq2">
          <label>(2)</label>
          <tex-math><![CDATA[k \cdot a + e = y_a, \qquad k \cdot b + e = y_b,]]></tex-math>
        </disp-formula>
where <italic>y<sub>a</sub></italic>, <italic>y<sub>b</sub></italic> are the values of the averaged class profile, approximated by the polynomial
<italic>P<sub>n</sub></italic>(<italic>x</italic>), at the points <italic>a</italic>, <italic>b</italic> respectively. The averaged class profile is calculated from all
time series of a class and consists of the averaged values for each day of observation.
Thus, the parameters <italic>k</italic>, <italic>e</italic> are common to all curves of a given class, whereas
the coefficients of the polynomial <italic>P<sub>n</sub></italic>(<italic>x</italic>) differ from curve to curve. In other words, each ideal curve
of a class is characterized by a vector of coefficients
<italic>p</italic> = (<italic>p</italic><sub>1</sub>, <italic>p</italic><sub>2</sub>, ..., <italic>p</italic><sub><italic>n</italic>+1</sub>)<sup>T</sup>.
In what follows we take <italic>n</italic> = 4.
      </p>
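      <p>As an illustration, the ideal curve can be evaluated with a short Python sketch. It assumes that function (1) has the form F(x) = (x - a)(x - b)P_n(x) + kx + e, that P_n(x) is a sum of coefficients p_j multiplied by the Legendre polynomials of degree j = 0, ..., n, and that k and e follow from conditions (2). The function names and the sample numbers are hypothetical and are not taken from the paper.</p>
      <preformat><![CDATA[
```python
def legendre_values(x, n):
    """Return [L_0(x), ..., L_n(x)] using the three-term recurrence."""
    values = [1.0, x]
    for j in range(1, n):
        # (j + 1) L_{j+1}(x) = (2j + 1) x L_j(x) - j L_{j-1}(x)
        values.append(((2 * j + 1) * x * values[j] - j * values[j - 1]) / (j + 1))
    return values[: n + 1]


def line_through(a, y_a, b, y_b):
    """Solve k*a + e = y_a and k*b + e = y_b for (k, e)."""
    k = (y_b - y_a) / (b - a)
    return k, y_a - k * a


def ideal_curve(x, p, a, b, k, e):
    """Evaluate F(x) = (x - a)(x - b) P_n(x) + k x + e with n = len(p) - 1."""
    basis = legendre_values(x, len(p) - 1)
    poly = sum(c * v for c, v in zip(p, basis))
    return (x - a) * (x - b) * poly + k * x + e


# Hypothetical example: season mapped to [-1, 1], n = 4 (five coefficients).
a, b = -1.0, 1.0
k, e = line_through(a, 0.20, b, 0.30)
p = [-0.5, 0.1, 0.2, 0.0, -0.05]
curve = [ideal_curve(-1.0 + 0.5 * i, p, a, b, k, e) for i in range(5)]
```
]]></preformat>
      <p>Because the factor (x - a)(x - b) vanishes at both ends of the season, every curve built this way passes through the same end values y_a and y_b, whatever the coefficient vector p is.</p>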
      <p>The next stage is the calculation of modeling parameters for the ideal curves.</p>
    </sec>
    <sec id="sec-3">
      <title>Calculation of modeling parameters</title>
      <p>The modeling parameters are calculated from time series that correspond to real fields and are built from
Terra/MODIS satellite data processed to the MOD09GQ level. The total number of real time series is 20940, and
each time series is associated with a class of crops. The following table shows the total
number of real time series for each class.</p>
      <p>For each triple "class of crop – zone – year", the modeling parameters are calculated in two stages.</p>
      <p>1. Approximation of each real time series by function (1) using the method of
least squares. The result is a set of realizations of the coefficient vector <italic>p</italic>,
which has a multivariate normal distribution. The number of realizations is equal to the
number of real time series for the fixed triple "class of crop – zone – year".</p>
      <p>2. Calculation of the vector of mathematical expectations <italic>M</italic> and the covariance
matrix <italic>B</italic>, which characterize the distribution law of the vector <italic>p</italic>.</p>
      <p>In total, 90 different sets of parameters for simulating the ideal curves defined by formula (1)
were calculated for the years 2011-2015 for the Samara region (6 classes, 3 zones, 5 years).</p>
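      <p>The second stage, estimating the vector of mathematical expectations M and the covariance matrix B from the fitted coefficient vectors, can be sketched as follows. The unbiased normalization (division by the number of vectors minus one) is an assumption, since the paper does not state which estimator was used; the two-component sample vectors are hypothetical.</p>
      <preformat><![CDATA[
```python
def mean_vector(samples):
    """Component-wise mean M of a list of equal-length coefficient vectors."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def covariance_matrix(samples):
    """Sample covariance matrix B (assumed unbiased: divides by count - 1)."""
    count = len(samples)
    mean = mean_vector(samples)
    centered = [[v - m for v, m in zip(row, mean)] for row in samples]
    size = len(mean)
    return [
        [sum(r[i] * r[j] for r in centered) / (count - 1) for j in range(size)]
        for i in range(size)
    ]


# Hypothetical fitted coefficient vectors (two components to keep it short).
fits = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
M = mean_vector(fits)
B = covariance_matrix(fits)
```
]]></preformat>
      <p>In the setting of the paper each vector would have n + 1 = 5 components, and one pair (M, B) would be computed per triple "class of crop – zone – year".</p>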
      <p>Note that the residuals of the approximation obtained in the first stage are
the values of the noise separated from the useful ideal signal. Approximating the time series by
function (1) yields uncorrelated noise for each triple "class of
crop – zone – year". This conclusion follows from the autocorrelation
function (hereinafter ACF) built from the sequence of residuals for a selected triple "class
of crop – zone – year": all ACF values except the one at zero lag do not
exceed 0.5, so the residuals can be regarded as uncorrelated. This
means that the degree of the polynomial <italic>P<sub>n</sub></italic>(<italic>x</italic>) chosen in function (1) is sufficient,
and the residuals contain no remaining useful signal unaccounted for by function (1).
The average mean square error of time series approximation over the triples
"class of crop – zone – year" is 0.047.</p>
      <p>
        The figure below shows an example of time series approximation by function (1). The
X-axis represents the time coordinate, that is, the date of the time series observation. The time
coordinate has been mapped to the range [-1, 1] for convenience of
calculation. The Y-axis represents the NDVI values.
      </p>
      <p>A coefficient vector <italic>p</italic> for a new ideal curve is generated as follows:</p>
      <p>1. Generate a vector ξ = (ξ<sub>1</sub>, ..., ξ<sub><italic>n</italic>+1</sub>)<sup>T</sup> whose components are independent random
variables having the standard normal distribution.</p>
      <p>2. Calculate the matrix <italic>A</italic>, the Cholesky factor of the covariance
matrix <italic>B</italic> (<italic>B</italic> = <italic>AA</italic><sup>T</sup>).</p>
      <p>3. Calculate the vector of coefficients <italic>p</italic> through a linear transformation of the vector
ξ: <italic>p</italic> = <italic>A</italic>ξ + <italic>M</italic>.</p>
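      <p>Next, the daily values of the ideal curve are calculated over the admissible range [<italic>a</italic>, <italic>b</italic>] from the
generated coefficient vector <italic>p</italic> and the parameters <italic>k</italic>, <italic>e</italic>, which are common to the triple
"class of crop – zone – year".</p>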
      <p>Using this algorithm, ideal curves for the six crop classes were
generated for the three zones of the Samara region for the years 2011-2015. For each triple "class
of crop – zone – year", 4000 curves were generated.</p>
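      <p>The generation steps above (a standard normal vector, a Cholesky factor A of the covariance matrix B, and the linear transformation p = A ξ + M) can be sketched in plain Python as follows. The helper names, the seed and the two-component parameters are hypothetical; the paper does not describe its implementation.</p>
      <preformat><![CDATA[
```python
import math
import random


def cholesky(B):
    """Lower-triangular A with A * A^T = B, for a symmetric positive definite B."""
    size = len(B)
    A = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(A[i][m] * A[j][m] for m in range(j))
            if i == j:
                A[i][j] = math.sqrt(B[i][i] - partial)
            else:
                A[i][j] = (B[i][j] - partial) / A[j][j]
    return A


def transform(xi, A, M):
    """Linear transformation p = A * xi + M."""
    return [sum(a * x for a, x in zip(row, xi)) + m for row, m in zip(A, M)]


def sample_coefficients(M, B, rng):
    """Draw one coefficient vector p with mean M and covariance B."""
    xi = [rng.gauss(0.0, 1.0) for _ in M]  # independent standard normal values
    return transform(xi, cholesky(B), M)


# Hypothetical two-component parameters, for illustration only.
M = [3.0, 4.0]
B = [[4.0, 2.0], [2.0, 3.0]]
rng = random.Random(0)
vectors = [sample_coefficients(M, B, rng) for _ in range(5)]
```
]]></preformat>
      <p>In practice the Cholesky factor would be computed once per triple "class of crop – zone – year" and reused for all 4000 curves; it is recomputed on every call here only to keep the sketch short.</p>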
    </sec>
    <sec id="sec-4">
      <title>Crop recognition algorithm</title>
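      <p>
        Crop types are detected from time series by a method based on the algorithm for
calculating estimates (ACE) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The advantage of ACE is its
ability to classify objects with gaps in the features. Gaps in the time series values and,
as a consequence, in the feature set arise because of cloud cover. The ACE model
is specified for crop type recognition as follows:
      </p>
      <p>1. The features are the values of the time series.</p>
      <p>2. The system of reference feature sets consists of a single set that includes all features.</p>
      <p>3. The proximity measure for the recognizable object <italic>a</italic> and the reference object ω is calculated
as follows:
        <disp-formula id="eq3">
          <label>(3)</label>
          <tex-math><![CDATA[\rho(\omega, a) = \frac{1}{N} \sum_{n=0}^{N-1} (\omega_n - a_n)^2,]]></tex-math>
        </disp-formula>
where <italic>a<sub>n</sub></italic>, <italic>n</italic> = 0, ..., <italic>N</italic> − 1, are the features of object <italic>a</italic>
(the set of time series values) and ω<sub><italic>n</italic></sub>, <italic>n</italic> = 0, ..., <italic>N</italic> − 1, are the features of object ω.
The measure is calculated only for the days <italic>n</italic> on which both the object ω and the object <italic>a</italic> have time series values.</p>
      <p>4. The value of the proximity function <italic>f</italic>(ω, <italic>a</italic>) for the reference object ω and the
recognizable object <italic>a</italic> is calculated as
        <disp-formula>
          <tex-math><![CDATA[f(\omega, a) = \begin{cases} 1, & \rho(\omega, a) \le T, \\ 0, & \rho(\omega, a) > T, \end{cases}]]></tex-math>
        </disp-formula>
where <italic>T</italic> is the proximity threshold.</p>
      <p>5. The estimate of proximity of the object <italic>a</italic> to a class Ω<sub><italic>j</italic></sub> is calculated as
        <disp-formula>
          <tex-math><![CDATA[\Gamma_j(a) = \sum_{\omega \in \Omega_j} f(\omega, a).]]></tex-math>
        </disp-formula>
      </p>
      <p>6. The recognizable object <italic>a</italic> is assigned to class <italic>c</italic> according to the
decision rule
        <disp-formula>
          <tex-math><![CDATA[c = \arg\max_{m = 0, \ldots, M-1} \Gamma_m(a),]]></tex-math>
        </disp-formula>
where <italic>M</italic> is the number of classes.</p>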
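      <p>A minimal Python sketch of this voting scheme is given below. It assumes that the proximity measure is the mean squared difference over the days observed in both series, that a reference object votes for its class when this measure does not exceed the threshold T, and that gaps are stored as None. The function names, the threshold value and the NDVI numbers are hypothetical.</p>
      <preformat><![CDATA[
```python
def distance(reference, candidate):
    """Mean squared difference over days observed in both series (None = gap)."""
    pairs = [(r, c) for r, c in zip(reference, candidate)
             if r is not None and c is not None]
    if not pairs:
        return None  # no common observation days
    return sum((r - c) ** 2 for r, c in pairs) / len(pairs)


def vote(reference, candidate, threshold):
    """Proximity value f: 1 if the distance is within the threshold, else 0."""
    d = distance(reference, candidate)
    return 1 if d is not None and d <= threshold else 0


def classify(candidate, references_by_class, threshold):
    """Return the class with the most close reference objects, plus all estimates."""
    estimates = {
        label: sum(vote(ref, candidate, threshold) for ref in refs)
        for label, refs in references_by_class.items()
    }
    return max(estimates, key=estimates.get), estimates


# Hypothetical NDVI series; None marks a cloudy day.
references = {
    "winter crops": [[0.6, 0.7, 0.5, 0.3], [0.5, 0.7, 0.6, 0.3]],
    "fallows": [[0.2, 0.2, 0.3, 0.2], [0.1, 0.2, 0.2, 0.2]],
}
field = [0.55, None, 0.55, 0.3]
label, estimates = classify(field, references, threshold=0.01)
```
]]></preformat>
      <p>In this sketch a tie between classes is resolved in favour of the class listed first, which is a simplification; the paper does not say how ties are handled.</p>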
    </sec>
    <sec id="sec-5">
      <title>Experiments</title>
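      <p>The experiments were carried out to assess
the practicability of using simulated time series instead of time series built for real
fields as a training sample in recognition methods. The first three experiments used
simulated time series as the reference sample in the algorithm for calculating estimates
and identified the crops on real fields located in the Samara region
in 2014 in the northern, central and southern zones, respectively. In the fourth
experiment, time series built for real fields were used as the training sample, and crop
identification was carried out without separating the fields by zone, because the
amount of real data is too small to build both a training sample and a set of fields
for identification within each zone. The probability
matrices of classification from class to class for each experiment are given below in
Tables 2-5, respectively. The classification method in all experiments is ACE.</p>
      <p>1. The northern zone. The total number of recognized objects is 1341. The overall
probability of correct classification is 0.58.</p>
      <table-wrap id="tbl2">
        <label>Table 2</label>
        <caption><p>Probability matrix of classification, northern zone</p></caption>
        <table>
          <thead>
            <tr><th>Class</th><th>perennial grasses</th><th>unused lands</th><th>winter crops</th><th>fallows</th><th>early spring crops</th><th>late spring crops</th></tr>
          </thead>
          <tbody>
            <tr><td>perennial grasses</td><td>0.33</td><td>0.51</td><td>0.05</td><td>0.01</td><td>0.05</td><td>0.04</td></tr>
            <tr><td>unused lands</td><td>0.13</td><td>0.73</td><td>0.02</td><td>0.07</td><td>0.05</td><td>0.01</td></tr>
            <tr><td>winter crops</td><td>0.03</td><td>0.10</td><td>0.74</td><td>0.08</td><td>0.04</td><td>0.01</td></tr>
            <tr><td>fallows</td><td>0.07</td><td>0.30</td><td>0.04</td><td>0.44</td><td>0.09</td><td>0.05</td></tr>
            <tr><td>early spring crops</td><td>0.02</td><td>0.13</td><td>0.04</td><td>0.07</td><td>0.56</td><td>0.16</td></tr>
            <tr><td>late spring crops</td><td>0.03</td><td>0.11</td><td>0.02</td><td>0.04</td><td>0.14</td><td>0.66</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>2. The central zone. The total number of recognized objects is 2324. The overall
probability of correct classification is 0.73.</p>
      <table-wrap id="tbl3">
        <label>Table 3</label>
        <caption><p>Probability matrix of classification, central zone</p></caption>
        <table>
          <thead>
            <tr><th>Class</th><th>perennial grasses</th><th>unused lands</th><th>winter crops</th><th>fallows</th><th>early spring crops</th><th>late spring crops</th></tr>
          </thead>
          <tbody>
            <tr><td>perennial grasses</td><td>0.25</td><td>0.53</td><td>0.11</td><td>0.08</td><td>0.01</td><td>0.02</td></tr>
            <tr><td>unused lands</td><td>0.03</td><td>0.88</td><td>0.02</td><td>0.04</td><td>0.01</td><td>0.02</td></tr>
            <tr><td>winter crops</td><td>0.03</td><td>0.05</td><td>0.87</td><td>0.03</td><td>0.01</td><td>0.01</td></tr>
            <tr><td>fallows</td><td>0.03</td><td>0.25</td><td>0.05</td><td>0.55</td><td>0.06</td><td>0.05</td></tr>
            <tr><td>early spring crops</td><td>0.01</td><td>0.14</td><td>0.01</td><td>0.06</td><td>0.68</td><td>0.11</td></tr>
            <tr><td>late spring crops</td><td>0.00</td><td>0.12</td><td>0.01</td><td>0.03</td><td>0.07</td><td>0.76</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>3. The southern zone. The total number of recognized objects is 2766. The overall
probability of correct classification is 0.70.</p>
      <p>4. Classification of real data on real data. The classification quality was
evaluated by cross-validation. The sample of all real time series for 2014 was
divided five times into a training sample (reference objects) and a control sample
(recognizable objects) in a ratio of 2:1. The total number of objects in the sample is 6432, the number of
reference objects is 4288, and the number of recognizable objects is 2144. The average
value of the overall probability of correct classification is 0.69.</p>
      <table-wrap id="tbl5">
        <label>Table 5</label>
        <caption><p>Probability matrix of classification, real data on real data</p></caption>
        <table>
          <thead>
            <tr><th>Class</th><th>perennial grasses</th><th>unused lands</th><th>winter crops</th><th>fallows</th><th>early spring crops</th><th>late spring crops</th></tr>
          </thead>
          <tbody>
            <tr><td>perennial grasses</td><td>0.40</td><td>0.42</td><td>0.08</td><td>0.03</td><td>0.01</td><td>0.03</td></tr>
            <tr><td>unused lands</td><td>0.17</td><td>0.77</td><td>0.01</td><td>0.02</td><td>0.02</td><td>0.01</td></tr>
            <tr><td>winter crops</td><td>0.07</td><td>0.03</td><td>0.86</td><td>0.01</td><td>0.01</td><td>0.02</td></tr>
            <tr><td>fallows</td><td>0.12</td><td>0.19</td><td>0.03</td><td>0.53</td><td>0.06</td><td>0.07</td></tr>
            <tr><td>early spring crops</td><td>0.04</td><td>0.09</td><td>0.01</td><td>0.04</td><td>0.68</td><td>0.14</td></tr>
            <tr><td>late spring crops</td><td>0.03</td><td>0.07</td><td>0.01</td><td>0.07</td><td>0.07</td><td>0.75</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Thus, the results of the above experiments lead to the conclusion that simulated time series
can be used as a training sample in crop identification methods.</p>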
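      <p>The 2:1 division used in the fourth experiment can be illustrated with the sketch below. It assumes that each of the five divisions is an independent random shuffle of the whole sample; the paper does not say how the divisions were drawn, and the field identifiers and the seed are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def split_two_to_one(items, rng):
    """Shuffle a copy of items and split it 2:1 into (reference, control)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# Field identifiers are hypothetical; 6432 is the sample size quoted in the text.
fields = list(range(6432))
rng = random.Random(2014)
splits = [split_two_to_one(fields, rng) for _ in range(5)]
sizes = [(len(reference), len(control)) for reference, control in splits]
```
]]></preformat>
      <p>With 6432 objects every division yields 4288 reference objects and 2144 recognizable objects, which matches the counts reported for this experiment.</p>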
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>This paper demonstrated the idea of using simulated time series as a training sample in crop
type identification. The idea has the following advantages:</p>
      <p>1. A training sample of any size can be generated from the calculated modeling parameters.</p>
      <p>2. A set of simulated time series can serve as a basis for assessing the
quality of any classification algorithm.</p>
      <p>3. The classification quality can be improved by choosing different approximation
functions.</p>
      <p>4. Crops can be identified at the beginning of the season, when the
training sample is too small for classification but there is a set of fields with
reliable information about the sown crops (we call them support fields). This
becomes possible when time series are available for a number of past years, that is,
when historical statistics of crop development on
real fields have been accumulated. Modeling parameters are calculated from these historical time series, and a set
of simulated time series is generated from the obtained parameters. These
simulated time series, built from past years, are called templates of crop
development. At the beginning of the season, the template closest
to the development of crops on the support fields is selected from the set of historical
templates. The simulated time series of the selected template can then be used as the
training sample for crop identification in the current year.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work was financially supported by RFBR, project № 16-37-00043_mol_a
«Development of methods of using data from geoinformation systems in remote sensing
data processing» and project № 16-29-09494_ofi_m «Methods of computer
processing of multispectral remote sensing data for vegetation areas detection in special
forensics».</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Vorobiova</surname>
            <given-names>NS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Timbay</surname>
            <given-names>EI</given-names>
          </string-name>
          .
          <article-title>Geoinformation system of agricultural lands inventory and control development</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2009</year>
          ;
          <volume>33</volume>
          (
          <issue>3</issue>
          ):
          <fpage>340</fpage>
          -
          <lpage>344</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Denisova</surname>
            <given-names>AYu</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sergeyev</surname>
            <given-names>VV</given-names>
          </string-name>
          .
          <article-title>Impulse response identification for remote sensing images using GIS data</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2015</year>
          ;
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <fpage>557</fpage>
          -
          <lpage>563</lpage>
          . [in Russian]. DOI:
          <pub-id pub-id-type="doi">10.18287/0134-2452-2015-39-4-557-563</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Sergeyev</surname>
            <given-names>VV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denisova</surname>
            <given-names>AYu</given-names>
          </string-name>
          .
          <article-title>Iterational method for piecewise constant images restoration with an a priory knowledges of image objects boundaries</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2013</year>
          ;
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>239</fpage>
          -
          <lpage>243</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bartalev</surname>
            <given-names>SA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Egorov</surname>
            <given-names>VA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loupian</surname>
            <given-names>EA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plotnikov</surname>
            <given-names>DE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uvarov</surname>
            <given-names>IA</given-names>
          </string-name>
          .
          <article-title>Recognition of arable lands using multi-annual satellite data from spectroradiometer MODIS and locally adaptive supervised classification</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2011</year>
          ;
          <volume>35</volume>
          (
          <issue>1</issue>
          ):
          <fpage>103</fpage>
          -
          <lpage>116</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Vorobiova</surname>
            <given-names>NS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denisova</surname>
            <given-names>AYu</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belov</surname>
            <given-names>AM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chernov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>VV</given-names>
          </string-name>
          .
          <article-title>How to Use Geoinformation Technologies and Space Monitoring for Controlling the Agricultural Sector in Samara Region</article-title>
          .
          <source>Pattern Recognition and Image Analysis. Advances in Mathematical Theory and Applications</source>
          ,
          <year>2015</year>
          ;
          <volume>25</volume>
          (
          <issue>2</issue>
          ):
          <fpage>347</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>VV</given-names>
          </string-name>
          .
          <article-title>A comparison of algorithms for supervised classification using hyperspectral data</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2014</year>
          ;
          <volume>38</volume>
          (
          <issue>3</issue>
          ):
          <fpage>494</fpage>
          -
          <lpage>502</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Maiorova</surname>
            <given-names>VI</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bannikov</surname>
            <given-names>AM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grishko</surname>
            <given-names>DA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jarenov</surname>
            <given-names>IS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leonov</surname>
            <given-names>VV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toporkov</surname>
            <given-names>AG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harlan</surname>
            <given-names>AA</given-names>
          </string-name>
          .
          <article-title>Monitoring condition of agricultural fields based on prediction of NDVI with the use of multi-spectral and hyper-spectral data from space imagery</article-title>
          .
          <source>Science &amp; education: scientific edition of Bauman MSTU</source>
          ,
          <year>2013</year>
          ;
          <volume>07</volume>
          :
          <fpage>199</fpage>
          -
          <lpage>228</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Ozdogan</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>The spatial distribution of crop types from MODIS data: Temporal unmixing using Independent Component Analysis</article-title>
          .
          <source>Remote Sensing of Environment</source>
          ,
          <year>2010</year>
          ;
          <volume>114</volume>
          (
          <issue>6</issue>
          ):
          <fpage>1190</fpage>
          -
          <lpage>1204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fischer</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>A Model for the Seasonal Variations of Vegetation Indices in Coarse Resolution Data and Its Inversion to Extract Crop Parameters</article-title>
          .
          <source>Remote Sensing of Environment</source>
          ,
          <year>1994</year>
          ;
          <volume>48</volume>
          :
          <fpage>220</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wei</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            <given-names>Q</given-names>
          </string-name>
          .
          <article-title>Selecting the optimal NDVI time-series reconstruction technique for crop phenology detection</article-title>
          .
          <source>Intelligent Automation &amp; Soft Computing</source>
          . Special Issue: Intelligent Automation with Applications to Agriculture,
          <year>2016</year>
          ;
          <volume>2</volume>
          :
          <fpage>237</fpage>
          -
          <lpage>247</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kolomiets</surname>
            <given-names>EI</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>VV</given-names>
          </string-name>
          .
          <article-title>Simulation of experimental data for pattern recognition tasks. Guidelines for the laboratory work No 1,</article-title>
          <year>2010</year>
          : 20 p. [in Russian]
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Vorobiova</surname>
            <given-names>NS</given-names>
          </string-name>
          .
          <article-title>Crops identification by using satellite images and algorithm for calculating estimates</article-title>
          .
          <source>Proceedings of International conference and school for young scientists “Information technology and nanotechnology (ITNT-2015)”</source>
          ,
          <year>2015</year>
          :
          <fpage>83</fpage>
          -
          <lpage>88</lpage>
          . [in Russian]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>