<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Digital technologies for forest monitoring in the Baikal natural territory</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Igor V. Bychkov</string-name>
          <email>bychkov@icc.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gennady M. Ruzhnikov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman K. Fedorov</string-name>
          <email>fedorov@icc.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasia K. Popova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Matrosov Institute for System Dynamics and Control Theory, Siberian Branch of Russian Academy of Sciences</institution>
          ,
          <addr-line>Lermontov st. 134, Irkutsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper considers the problem of monitoring forest resources over large areas, using the Baikal natural territory as an example. Sentinel-2 remote sensing data serve as the main data source because of their regular acquisition, broad coverage, and multispectral imagery. The Random Forest and Support Vector Machines (SVM) machine learning algorithms were used to classify land cover from the Sentinel-2 products. The training was carried out with data labeled manually into 12 classes. Both methods achieved high overall accuracy (98.92% for Random Forest and 93.79% for SVM).</p>
      </abstract>
      <kwd-group>
        <kwd>Machine learning</kwd>
        <kwd>remote sensing</kwd>
        <kwd>forest monitoring</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Forest monitoring is a system for assessing and forecasting the state of the forest fund in space and
time. It serves the rational use, protection and reproduction of forests and the enhancement of their ecological functions.
Monitoring tracks the dynamics of forest resources caused by forest management and by natural and
anthropogenic impacts, and it supports predictive and analytical models for forest protection and use and for the
sustainable development of the forest economy. The effectiveness of forest monitoring depends directly
on the completeness and accuracy of observation data on the various elements of the environment.</p>
      <p>The Baikal Natural Territory (BNT) covers Lake Baikal, the water protection zone around it,
specially protected natural areas, and adjacent areas extending up to 200 km to the west and northwest of the lake.
The BNT has an area of 386 thousand km² and contains 31 specially protected natural areas, including 3
reserves, 2 national parks and 6 recreational areas, as well as more than 128 natural monuments.</p>
      <p>Forest resources are among the most important natural resources of the BNT: they ensure the
sustainability of the environment and perform water protection, soil protection and water regulation functions.
About 8350.73 thousand hectares of BNT land are covered with forest vegetation, and 92% of
this land is forest, represented by two groups of forest-forming species: coniferous
and deciduous trees. BNT forests are negatively affected by fires, forest diseases, insect pests and
unfavorable weather conditions, which can lead to a loss of the biological stability of forests.</p>
      <p>
        Forest monitoring of the BNT is inefficient and has limited access to in-situ data, which
complicates decision support and interdisciplinary research. Official
forest inventory information is not always up to date, and there is no unified system for storing and
processing forest monitoring data at the regional level [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>These shortcomings make digital forest monitoring relevant: it is essential for
sustainable forest management in the BNT and for meeting the requirements of continuous,
rational use of forests, their reproduction, and the conservation of their resource, recreational and ecological
potential and biological diversity.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Organization of digital monitoring</title>
      <p>The advantages of a digital forest monitoring system for the BNT are the large number of
participants and their information resources; the ability of participants to focus on their strengths and
to outsource non-core activities and services to others; the prompt updating of open
digital information resources; and the involvement of scientific knowledge. Planning and implementing
several services provided by different participants, including third parties, reduces the cost of
obtaining services and helps to make management decisions more comprehensive and better justified.</p>
      <p>
        Digital forest monitoring of the BNT is based on an information and analytical environment that
provides the collection, transmission, search and storage of spatio-temporal forest monitoring data, as well as the
ability to assess, model and forecast the state of the forest resources of the BNT [
        <xref ref-type="bibr" rid="ref2 ref3">2-3</xref>
        ]. Such an
environment should contain spatial and thematic forest monitoring data, including remote sensing
data, as well as unified reference books and classifiers. A catalog of services handles the processing of
monitoring data: providing data, assessing forest dynamics, machine learning, and publishing results as
maps and diagrams. The scheme of such a digital forest monitoring system is shown in
Figure 1.
      </p>
      <p>
        Remote sensing is an important source of data for digital forest monitoring [
        <xref ref-type="bibr" rid="ref4 ref5">4-5</xref>
        ]. Traditional
in-situ forest data cannot always provide up-to-date information for an area as large as the BNT: they are often
based on a sample of small plots or contain aggregated information without an accurate spatial
description. Remote sensing makes it possible to obtain forest data more efficiently and
provides information on the spatial distribution of species with wide temporal coverage and a higher
update frequency.
      </p>
      <p>Medium-resolution satellite imagery such as Landsat and Sentinel-2 makes it possible to map large areas
economically. The 10 m resolution of the main Sentinel-2 bands allows a number of
forest parameters to be detected quite accurately, which makes these images preferable to Landsat images with their 30 m
resolution. The Sentinel-2 satellites carry the MultiSpectral Instrument (MSI) with 13 spectral bands
ranging from blue to short-wave infrared (SWIR) at resolutions of 10 to 60 m. The mission provides
global coverage on average every 5 days.</p>
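      <p>As an illustration of the band structure described above, the following minimal Python sketch lists the native resolution of each of the 13 MSI bands and the integer factor needed to bring a band onto a 10 m grid. The band identifiers follow the usual Sentinel-2 naming; the helper names and the choice of a common 10 m grid are assumptions of this sketch and are not taken from the processing chain used in this study.</p>
      <preformat>
```python
# Native spatial resolution (in metres) of the 13 Sentinel-2 MSI bands.
BAND_RESOLUTION_M = {
    "B01": 60, "B02": 10, "B03": 10, "B04": 10,
    "B05": 20, "B06": 20, "B07": 20, "B08": 10,
    "B8A": 20, "B09": 60, "B10": 60, "B11": 20, "B12": 20,
}


def bands_by_resolution(resolutions):
    """Group band names by their native resolution (hypothetical helper)."""
    groups = {}
    for band, res in resolutions.items():
        groups.setdefault(res, []).append(band)
    return groups


def upsampling_factor(band, target_m=10):
    """Integer factor needed to bring a band onto the assumed target grid."""
    return BAND_RESOLUTION_M[band] // target_m


if __name__ == "__main__":
    for res, bands in sorted(bands_by_resolution(BAND_RESOLUTION_M).items()):
        print(f"{res} m: {', '.join(bands)}")
```
      </preformat>
      <p>When all 13 bands are used as features for one pixel, the 20 m and 60 m bands have to be resampled to a common grid first; the factor returned by the sketch is the simplest way to express that step.</p>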
      <p>Intelligent analysis of remote sensing data makes it possible to identify changes in the forest
fund caused by anthropogenic impacts and environmental disturbances such as fires, damage to forests by
pests, diseases and windthrow. As a first step, the satellite images of the
study area must be classified to compile land cover maps. In this study, we use machine learning methods for
automated land cover classification.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Results and discussion</title>
      <p>
        Machine learning algorithms are used to process large datasets consisting of multi-temporal
images with spectral metrics [
        <xref ref-type="bibr" rid="ref6 ref7">6-7</xref>
        ]. These methods are effective for classifying complex
multidimensional data, providing nonlinear and nonparametric classifications. The most popular
machine learning algorithms used in remote sensing research are Random Forest (RF) and Support
Vector Machines (SVM).
      </p>
      <p>RF is a nonparametric ensemble machine learning algorithm based on decision trees. It can process
various kinds of data, such as satellite images and numerical data. Each decision tree is trained on a random subset of the
training samples and can produce a classification result for the samples that were not selected for its training.
Each tree votes for a class, and the final
class is the one that receives the most votes.</p>
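      <p>The voting step described above can be sketched as follows. This is a minimal illustration of the principle only: the function name and the example votes are hypothetical. Note also that the scikit-learn implementation combines trees by averaging their predicted class probabilities rather than by counting hard votes, so the sketch does not describe that library's internals.</p>
      <preformat>
```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class that received the most votes from the trees.

    Ties are resolved in favour of the class that was voted for first.
    """
    if not tree_predictions:
        raise ValueError("at least one tree prediction is required")
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical votes of five trees for a single pixel.
votes = [
    "coniferous forest",
    "mixed forest",
    "coniferous forest",
    "deciduous forest",
    "coniferous forest",
]
print(majority_vote(votes))
```
      </preformat>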
      <p>SVM is a machine learning method based on statistical learning theory and the
principle of structural risk minimization. Compared with traditional learning methods, it offers high
accuracy, fast computation and strong generalization ability, and it is widely used in image mapping
and land classification.</p>
      <p>The study area covers the southern part of Lake Baikal. The Sentinel-2A MSI granules with low cloud cover
used in the study were acquired on 25 June 2017 and downloaded free of charge from the Copernicus
Scientific Data Hub as a Level-1C product. Figure 2 presents the Sentinel-2 RGB composite image of the
study area.</p>
      <p>To monitor the forest resources of the BNT, a part of the study area was first labeled
manually with 12 classes: felling, shrubs, coniferous forest, woodland, deciduous forest, mixed forest,
rocks, pastures, arable land, residential area, clouds and water. For this step, polygon-shaped samples
were generated based on visual interpretation of high-resolution satellite images and expert
knowledge.</p>
      <p>All 13 bands of the Sentinel-2 image were used for training. The labeled sample was randomly
divided into training and test parts in a 70/30 proportion. We used the implementations of the machine
learning algorithms from the "scikit-learn" Python library. The Random Forest method used 200
decision trees; the SVM parameters were kernel = "linear" and C = 1.0. The accuracy estimates of the
algorithms are given in Table 1. We used macro-averaged values of the precision, recall and F1-score
metrics.</p>
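      <p>A minimal sketch of this setup with scikit-learn is given below. It is not the code used in the study: the data are synthetic stand-ins for labeled pixels (13 band values per sample, 12 classes), and the stratified split, the random seeds and all variable names are assumptions. Only the 70/30 proportion, the 200 trees, the linear kernel with C = 1.0 and the macro averaging follow the text.</p>
      <preformat>
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for labeled pixels: 13 band values per sample, 12 classes.
# The study itself uses manually labeled Sentinel-2 pixels instead.
rng = np.random.default_rng(0)
n_classes, n_bands, per_class = 12, 13, 100
centers = rng.uniform(0.0, 1.0, size=(n_classes, n_bands))
X = np.vstack(
    [c + rng.normal(0.0, 0.05, size=(per_class, n_bands)) for c in centers]
)
y = np.repeat(np.arange(n_classes), per_class)

# 70/30 split; stratification and the seed are assumptions of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(kernel="linear", C=1.0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Macro averaging gives every class the same weight, whatever its size.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1),
    }
    print(name, {k: round(v, 4) for k, v in results[name].items()})
```
      </preformat>
      <p>Because the synthetic classes are generated around well-separated centers, the scores printed by this sketch say nothing about the accuracy obtained on real imagery.</p>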
      <p>Random Forest made the most mistakes in the woodland and deciduous forest classes. SVM
misclassified the pasture, woodland and deciduous forest classes. The errors are due to the similarity
of the spectral characteristics of the vegetation classes. In the future, the training dataset needs to be
expanded with a large number of samples of different forest species.</p>
      <p>The classification results are presented in Figures 3 and 4. The maps show that the Random Forest
method misclassifies the residential area class ("Living area", highlighted in gray), and the SVM method also misclassifies
the felling class ("Logging", lower left corner of the map, highlighted in red). At the same time, the
accuracy of both algorithms in these classes is quite high (96-97%), which can be explained by the
insufficient size of the test sample on which the accuracy was assessed.</p>
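      <p>One way a high accuracy figure can coexist with visible errors on the map is a class with very few test samples. The sketch below computes precision, recall and support per class from two label lists. The labels and the helper per_class_report are hypothetical; the example illustrates the general effect only and does not reproduce the values in Table 1.</p>
      <preformat>
```python
def per_class_report(y_true, y_pred):
    """Return {class: (precision, recall, support)} from two label lists."""
    report = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        support = sum(1 for t in y_true if t == cls)
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        report[cls] = (precision, recall, support)
    return report


# Hypothetical test set: 95 "forest" pixels and only 5 "felling" pixels.
y_true = ["forest"] * 95 + ["felling"] * 5
# All forest pixels are correct; four of the five felling pixels are missed.
y_pred = ["forest"] * 95 + ["forest"] * 4 + ["felling"] * 1

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"overall accuracy: {overall:.2f}")
for cls, (precision, recall, support) in per_class_report(y_true, y_pred).items():
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f} support={support}")
```
      </preformat>
      <p>In this toy example the overall accuracy is 0.96 although only one of the five felling pixels is recovered, which is why per-class recall and support are worth reporting alongside aggregate scores.</p>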
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>Effective management of forest resources is impossible without complete and timely information about
their condition. Remote sensing images are well suited for regular monitoring of large areas due to
their high revisit frequency, wide coverage and easy accessibility. The 13 spectral bands of Sentinel-2
imagery are sufficient to distinguish the main forest types.</p>
      <p>The work compares two machine learning algorithms, Random Forest and SVM, for
land cover classification. Training and test samples with 12 classes in the BNT area were
generated and labeled manually. Both algorithms achieved high overall accuracy (OAA): 98.92% for Random Forest
and 93.79% for SVM. The main classification errors are associated with an insufficient number of
samples, which prevents the methods from accurately separating classes with similar
spectral characteristics. We plan to expand the sample dataset to improve the classification results.</p>
      <p>The resulting land cover classification can be used for BNT forest monitoring. Fast tracking of
logging, burnt areas and reforestation will make it possible to assess the dynamics of forest resources and to
make management decisions.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Acknowledgements</title>
      <p>The results were obtained within the framework of the State Assignment of the Ministry of
Education and Science of the Russian Federation for the project "Methods and technologies of
cloud-based service-oriented platform for collecting, storing and processing large volumes of multi-format
interdisciplinary data and knowledge based upon the use of artificial intelligence, model-guided
approach and machine learning" (state registration number 121030500071-2). The results were achieved
using the Centre of Collective Usage "Integrated information network of Irkutsk scientific educational
complex".</p>
    </sec>
    <sec id="sec-6">
      <title>6. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Popova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Cherkasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. N.</given-names>
            <surname>Vladimirov</surname>
          </string-name>
          ,
          <article-title>Forest Resources of the Baikal Region: Vegetation Dynamics Under Anthropogenic Use</article-title>
          , in: Information Technologies in the Research of Biodiversity,
          <source>Springer Proceedings in Earth and Environmental Sciences</source>
          , Springer, Cham (
          <year>2019</year>
          )
          <fpage>96</fpage>
          -
          <lpage>106</lpage>
          . doi:10.1007/978-3-030-11720-7_14.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I. V.</given-names>
            <surname>Bychkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Ruzhnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Fedorov</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Popova</surname>
          </string-name>
          ,
          <article-title>Digital platform for forest resources monitoring in the Baikal natural territory</article-title>
          , in: 13th Multiconference on Control Problems (MCCP 2020), 6-8 October 2020, Saint Petersburg, Russia,
          <source>Journal of Physics: Conference Series</source>
          <volume>1864</volume>
          (
          <year>2020</year>
          ) 012111. doi:10.1088/1742-6596/1864/1/012111.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I. V.</given-names>
            <surname>Bychkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Ruzhnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Fedorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Khmelnov</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Popova</surname>
          </string-name>
          ,
          <article-title>Organization of digital monitoring of the Baikal natural territory</article-title>
          , in: Environmental transformation and sustainable development in Asian region, 8-10 September 2020, Irkutsk, Russian Federation,
          <source>IOP Conference Series: Earth and Environmental Science</source>
          <volume>629</volume>
          (
          <year>2021</year>
          ) 012067. doi:10.1088/1755-1315/629/1/012067.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lastovicka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Svec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Paluba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kobliuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Svoboda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hladky</surname>
          </string-name>
          , et al.,
          <article-title>Sentinel-2 data in an evaluation of the impact of the disturbances on forest vegetation</article-title>
          ,
          <source>Remote Sens</source>
          (
          <year>2020</year>
          )
          <volume>12</volume>
          :1914. doi:10.3390/rs12121914.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Soleimannejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ullah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Abedi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Koch</surname>
          </string-name>
          ,
          <article-title>Evaluating the potential of Sentinel-2, Landsat-8, and IRS satellite images in tree species classification of Hyrcanian forest of Iran using Random forest</article-title>
          ,
          <source>J Sustain Forest</source>
          (
          <year>2019</year>
          )
          <volume>38</volume>
          :
          <fpage>615</fpage>
          -
          <lpage>28</lpage>
          . doi:10.1080/10549811.2019.1598443.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Grabska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Frantz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ostapowicz</surname>
          </string-name>
          ,
          <article-title>Evaluation of machine learning algorithms for forest stand species mapping using Sentinel-2 imagery and environmental data in the Polish Carpathians</article-title>
          ,
          <source>Remote Sens Environ</source>
          (
          <year>2020</year>
          )
          <volume>251</volume>
          :112103. doi:10.1016/j.rse.2020.112103.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Musi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Anggoro</surname>
          </string-name>
          ,
          <string-name>
            <surname>Sunarsih</surname>
          </string-name>
          ,
          <article-title>System dynamic modelling and simulation for cultivation of forest land: Case study Perum Perhutani, Central Java, Indonesia</article-title>
          ,
          <source>J Ecol Eng</source>
          (
          <year>2017</year>
          )
          <volume>18</volume>
          :
          <fpage>25</fpage>
          -
          <lpage>34</lpage>
          . doi:10.12911/22998993/74307.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>