<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Development of a geographic information system for data collection and analysis based on microservice architecture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander A. Dontsov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor A. Sutorikhin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal Research Center for Information and Computational Technologies</institution>
          ,
          <addr-line>Novosibirsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Water and Environmental Problems SB RAS</institution>
          ,
          <addr-line>Barnaul</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>280</fpage>
      <lpage>287</lpage>
      <abstract>
        <p>The paper discusses the use of microservice architecture in the development of geographic information systems (GIS) for collecting, processing and analyzing data. Microservice architecture is typically used to build information systems that solve business problems and is not yet widespread in the development of GIS in the scientific field, although its application there is becoming increasingly important. Decomposing the software implementation and the GIS infrastructure for computation and data processing into microservices has a number of advantages, including increased fault tolerance, increased flexibility, reduced maintenance effort and simplified scaling. The paper presents the first results of applying the microservice approach to the development of a GIS for collecting and processing hydrological and hydrobiological data on the state of water bodies, and describes the architecture, main components and features of its information infrastructure.</p>
      </abstract>
      <kwd-group>
        <kwd>GIS</kwd>
        <kwd>microservices</kwd>
        <kwd>satellite data</kwd>
        <kwd>geoportal</kwd>
        <kwd>cloud technologies</kwd>
        <kwd>information systems</kwd>
        <kwd>lake</kwd>
        <kwd>reservoir</kwd>
        <kwd>measuring complexes</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Monitoring the parameters of natural objects is an urgent task of nature management.
There is a growing need to provide data on the state of natural objects to a wide
range of organizations and individuals, from government agencies to public organizations.
To collect, process and provide such data, web GIS in the form of geoportals are widely used;
they automate data processing and organize access to the results of calculations. The use of microservice architecture in
the development of such web GIS makes it possible to reuse GIS components and
to operate them independently [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this approach, the information system is implemented
as a set of small services, each of which runs as a separate process and communicates
with the others through interaction mechanisms, typically REST, gRPC,
RabbitMQ or Apache Kafka [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Our previous works [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] show the results of the development of a GIS for the collection and
      </p>
      <sec id="sec-1-1">
        <p>1. Services are simple and easy to maintain.
2. Services are deployed independently of each other.
3. Services scale independently of each other.
4. There is an opportunity to conduct technological experiments and it is relatively easy to
introduce new technologies.
5. Relatively high fault tolerance compared to monolithic architecture.</p>
        <p>
          However, the microservice architecture also has disadvantages:
1. Difficulty at the initial stage of development and creation of infrastructure. Distributed
systems are more difficult to develop, since each microservice must remain operational
when another component fails.
2. Development of distributed systems imposes additional costs on the exchange of data
between microservices: the communication protocols between
the components must be chosen so that the interaction is as efficient as possible [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ].
        </p>
        <p>Monitoring of water bodies, such as lakes and reservoirs, is a topical area of environmental
management. Inland water bodies play a very important role in natural and anthropogenic
processes.</p>
        <p>When developing a GIS designed for collecting and processing data on the state of water
bodies, it is necessary to take into account that complete and comprehensive information
can be obtained only by integrating different measurement methods, such as:</p>
      </sec>
      <sec id="sec-1-2">
        <p>1. Satellite monitoring.
2. Data of ground measuring complexes.
3. Data from expeditionary research and field observations.</p>
        <p>These are three unrelated sources of information that require different approaches, algorithms
and technological solutions. In this case, it is appropriate to use a microservice
architecture, which allows system components to be implemented as independent
software modules.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Geographic information system implementation</title>
      <sec id="sec-2-1">
        <title>2.1. Description of GIS infrastructure</title>
        <p>
          The developed GIS can be divided into components for data collection,
computational processing and analysis, and data storage and provision. Docker
technology is used to deploy and manage the application infrastructure [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This technology relies on
application containerization, that is, applications (GIS components) run in independent software
containers. Their interaction at the infrastructure level is defined using Docker Compose, a
tool included with Docker and intended for tasks related to project deployment.
Docker Compose uses YAML files to store the configuration of a group of containers. Containerization is
a lightweight type of virtualization and resource isolation at the OS kernel level. It makes it
possible to run applications with the minimum required libraries in a standardized environment.
Because each container is an isolated environment, it can be viewed as a small
service under the control of a programmer. Any container can be customized and updated
without affecting other containers, while preserving isolation and standardization. Since
containers are configured through text files, their configuration can be versioned in Git. This approach
is generally called Infrastructure as Code (IaC) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. General description of GIS</title>
        <p>Figure 1 shows the technological flows of data transfer from sources to the user. Data are received
through software interfaces and then verified. Verification is technologically most difficult for satellite data,
since it is necessary to make sure that the archives have been completely downloaded to the
GIS server before they are unpacked and saved. For the results of measurements
of automated complexes and field observations, the files are checked for integrity and format.
Satellite data then undergo atmospheric correction and thematic processing. Thematic processing means applying
a set of algorithms chosen for the task at hand, for example, extracting the water surface.
The water surface is extracted using spectral water indices, which enhance
the contrast between water and other objects.</p>
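        <p>As a minimal sketch of how a spectral water index separates water from other surfaces, the example below assumes the NDWI of McFeeters, (green - NIR) / (green + NIR). The paper does not state which index the GIS uses, so the index, the zero threshold, the function names and the sample reflectance values are all illustrative assumptions.</p>
        <preformat><![CDATA[
```python
from typing import List

Grid = List[List[float]]


def ndwi(green: Grid, nir: Grid) -> Grid:
    """Per-pixel NDWI = (green - nir) / (green + nir); 0.0 where the sum is zero."""
    result = []
    for g_row, n_row in zip(green, nir):
        row = []
        for g, n in zip(g_row, n_row):
            total = g + n
            row.append((g - n) / total if total != 0 else 0.0)
        result.append(row)
    return result


def water_mask(index: Grid, threshold: float = 0.0) -> List[List[bool]]:
    """Mark pixels whose index exceeds the (assumed) threshold as water."""
    return [[value > threshold for value in row] for row in index]


# Hypothetical reflectance for a 2x2 scene: left column water, right column land.
green = [[0.08, 0.10], [0.09, 0.12]]
nir = [[0.02, 0.30], [0.03, 0.35]]

mask = water_mask(ndwi(green, nir))
print(mask)
```
]]></preformat>
        <p>Water reflects little near-infrared radiation, so its index is positive, while vegetation and soil yield negative values. In practice the threshold would be tuned for each sensor and scene.</p>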
        <p>
          The results obtained in the form of vector polygons in GeoJSON or Shapefile format are
written to the database. After saving the results of calculations and importing data, they are
available to users in the form of maps, files, graphs and tables. The JavaScript library Leaflet [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
is used to generate maps; map generation on the server side is implemented using the MapServer
software utility [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
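        <p>The vector result mentioned above can be illustrated with a small sketch that builds a GeoJSON Feature for one polygon. The helper name, the coordinates and the property names are hypothetical; only the GeoJSON structure itself (closed linear rings, longitude before latitude) follows the format.</p>
        <preformat><![CDATA[
```python
import json
from typing import Dict, List, Tuple


def polygon_feature(ring: List[Tuple[float, float]], properties: Dict) -> Dict:
    """Build a GeoJSON Feature with a single-ring Polygon (longitude, latitude order)."""
    coords = [list(point) for point in ring]
    if coords[0] != coords[-1]:
        coords.append(coords[0])  # GeoJSON linear rings must be closed
    if len(coords) < 4:
        raise ValueError("a linear ring needs at least four positions")
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        "properties": properties,
    }


# Illustrative outline and attributes; not real measurements.
feature = polygon_feature(
    [(85.00, 53.00), (85.02, 53.00), (85.02, 53.01), (85.00, 53.01)],
    {"source": "Sentinel-2", "date": "2021-06-09"},
)
text = json.dumps(feature)
print(text)
```
]]></preformat>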
        <p>
          As you can see from the diagram, work with each of the data sources can be organized as an
independent process. When developing the GIS architecture, this technological scheme was used
to divide the information system into microservices. Decomposing systems into microservices
is a complex task, and there are several techniques for separating services from a monolithic
system. In industrial information systems, there is still no single approach to decomposition
of systems, and in each case it is necessary to choose a technique based on the peculiarities
of the subject area, the connectivity of subsystems and other parameters [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ]. In this case,
the sources of information and the technological stages of data processing were identified first,
and on this basis the list of microservices that should be
present in the GIS was determined.
        </p>
        <p>The GIS consists of the following main components (microservices):
1. Service for downloading satellite data from the open archives of ESA (European Space Agency)
and USGS (United States Geological Survey).
2. Service for working with data from ground measuring complexes.
3. Service for working with data from expeditionary work.
4. Service for atmospheric correction of satellite data.
5. Service for thematic processing of satellite data.
6. Service for converting files (from raster to vector formats).
7. Data cataloging service.</p>
        <p>
          All services use a common database, where, along with the processed information, various
system settings are stored, for example, the schedule of computing tasks, information about
system users and their rights. PostgreSQL with the PostGIS extension for storing spatial objects is
used as the DBMS [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>In addition to services, the GIS contains interfaces for interaction with other software systems
(Figure 2), for example the desktop packages QGIS and GRASS, as well as a web interface and an
administration panel.</p>
        <p>An important part of the GIS is the management of computing processes, which is
carried out through the administrator panel (web interface). The data
received from the user are processed and passed to the task manager, which forms tasks. The tasks
are then written to a queue implemented with RabbitMQ, and the computational
process is started. When the work is finished, the results are written to the database.</p>
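        <p>The flow from request to task manager to queue to computational process can be sketched as follows. This is only an in-process illustration: the standard-library queue stands in for RabbitMQ, a dictionary stands in for the database, and the task fields and function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import queue
import threading
from typing import Dict, List

task_queue: "queue.Queue" = queue.Queue()  # stand-in for a RabbitMQ queue
results_db: Dict[int, str] = {}            # stand-in for the GIS database


def form_tasks(request: Dict) -> List[Dict]:
    """Task manager: turn one administrator request into one task per scene."""
    return [
        {"id": i, "operation": request["operation"], "scene": scene}
        for i, scene in enumerate(request["scenes"], start=1)
    ]


def worker() -> None:
    """Computational process: take tasks from the queue until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        results_db[task["id"]] = f'{task["operation"]} done for {task["scene"]}'


request = {"operation": "atmospheric_correction", "scenes": ["scene_a", "scene_b"]}
for task in form_tasks(request):
    task_queue.put(task)
task_queue.put(None)  # tells the worker that no more tasks will arrive

thread = threading.Thread(target=worker)
thread.start()
thread.join()
print(results_db)
```
]]></preformat>
        <p>Because the producer only writes to the queue and the worker only reads from it, either side can be replaced or scaled without changing the other, which is the property the queue provides in the architecture described above.</p>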
        <p>When developing the system architecture, the API Gateway pattern was used. This pattern
is based on a gateway that sits between the client application and the microservices,
providing a single entry point for the client. The use of this pattern reduces the number of
client calls, makes the client independent of the protocols used by the services (REST, AMQP, gRPC), and
provides centralized management of cross-cutting functionality. Figure 3 shows the structure of
the API Gateway pattern.</p>
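        <p>A minimal sketch of the API Gateway idea is given below: one object receives every client request, applies a shared check once, and routes the request to a service by path prefix. The route prefixes, the token check and the two service functions are assumptions made for illustration and are not taken from the GIS source code.</p>
        <preformat><![CDATA[
```python
from typing import Callable, Dict, Tuple

Handler = Callable[[Dict], Dict]


class ApiGateway:
    """Single entry point that routes client requests to registered services."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        self._routes[prefix] = handler

    def handle(self, path: str, payload: Dict) -> Tuple[int, Dict]:
        # Shared check done once in the gateway instead of in every service.
        if "token" not in payload:
            return 401, {"error": "missing token"}
        for prefix, handler in self._routes.items():
            if path.startswith(prefix):
                return 200, handler(payload)
        return 404, {"error": "unknown route"}


# Hypothetical services; in a real deployment these would be REST, AMQP or gRPC calls.
def catalog_service(payload: Dict) -> Dict:
    return {"service": "catalog", "query": payload.get("query")}


def ground_data_service(payload: Dict) -> Dict:
    return {"service": "ground-data", "station": payload.get("station")}


gateway = ApiGateway()
gateway.register("/catalog", catalog_service)
gateway.register("/ground", ground_data_service)

print(gateway.handle("/catalog/search", {"token": "t", "query": "lake"}))
print(gateway.handle("/ground/latest", {"station": "raft"}))
```
]]></preformat>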
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Integration of GIS with ground measuring complexes</title>
        <p>The use of local automated systems for monitoring the parameters of natural objects is a
promising area of research. Such measurements are carried out in order to monitor various
natural processes, as a rule, with the further transfer of the measurement results to a wide range
of interested parties.</p>
        <p>
          Monitoring of inland water bodies is part of the monitoring of the natural environment as a
whole. According to modern international approaches, monitoring of any component of the
natural environment (including water bodies) should include a set of standardized observations
and methods of processing, analyzing and transmitting the results of these observations to
consumers [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>The work uses data from a measuring complex designed for systematic,
comprehensive measurements of the parameters of water bodies. The measuring complex is located in
the Altai Territory on Lake Krasilovskoye, on the shore of which the educational and scientific
station of Altai State University operates. Operating autonomously, the complex records every
15 minutes four meteorological parameters of
the atmosphere at heights of 2 and 4 meters, incident and reflected solar radiation, the levels of lake
water and groundwater, water temperature from the surface to the bottom (depth 7 meters), and
soil temperature from the surface to a depth of 3 meters.</p>
        <p>The measuring complex consists of three autonomous units prepared for installation
on a raft in the water area of the lake, on the bottom near the water’s edge, and permanently on
the shore. APIK is equipped with a GSM modem for data transmission and a logging module
(storage of measured data for subsequent download). To integrate the GIS with
the measuring complex, a RESTful API was developed, based on the Django REST
framework (DRF). Data are transmitted to the API in JSON format and, after validation using
Django form functionality, are written to the GIS database. The results of expeditionary work
can also be added to the GIS through the API or through a web interface with forms for adding and importing data.</p>
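        <p>The validation step can be sketched in plain Python as shown below. This is not the Django form code used in the GIS: the field names, the timestamp format and the function name are hypothetical, and the standard library stands in for Django so that the example runs on its own.</p>
        <preformat><![CDATA[
```python
from datetime import datetime
from typing import Dict, List, Tuple

# Assumed record layout: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"station": str, "timestamp": str, "parameter": str, "value": (int, float)}


def validate_measurement(record: Dict) -> Tuple[bool, List[str]]:
    """Return (is_valid, errors) for one decoded JSON measurement record."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"{name}: missing")
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            errors.append(f"{name}: wrong type")
    if isinstance(record.get("timestamp"), str):
        try:
            datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            errors.append("timestamp: expected YYYY-MM-DDTHH:MM:SS")
    return (not errors, errors)


good = {"station": "raft", "timestamp": "2021-06-09T12:15:00",
        "parameter": "water_temperature", "value": 17.4}
bad = {"station": "raft", "timestamp": "09.06.2021", "value": "17.4"}

print(validate_measurement(good))
print(validate_measurement(bad))
```
]]></preformat>
        <p>Only records that pass such a check would be written to the database; rejected records return the list of errors to the sender.</p>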
        <p>In developing the RESTful GIS API and the data transfer format, the recommendations for
building REST APIs developed by the OpenAPI Initiative were used. The OpenAPI
specification is accompanied by a set of guidelines for developing REST APIs. Following them provides a number of
interoperability benefits but requires additional design attention to comply with the
specification. OpenAPI recommends starting with a contract rather than an implementation: when
developing an API, the contract (interface) is developed first, and only
then is the program code that implements it written.</p>
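        <p>To make the contract-first idea concrete, the sketch below writes a small OpenAPI 3.0 description as a Python dictionary before any handler exists and then reads the required request fields from it. The path, the field names and the helper function are hypothetical and do not reproduce the actual contract of the GIS.</p>
        <preformat><![CDATA[
```python
# Hypothetical contract fragment written before any handler code exists.
contract = {
    "openapi": "3.0.3",
    "info": {"title": "Ground measurements API (illustrative)", "version": "0.1.0"},
    "paths": {
        "/measurements": {
            "post": {
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "required": ["station", "timestamp", "parameter", "value"],
                                "properties": {
                                    "station": {"type": "string"},
                                    "timestamp": {"type": "string", "format": "date-time"},
                                    "parameter": {"type": "string"},
                                    "value": {"type": "number"},
                                },
                            }
                        }
                    },
                },
                "responses": {"201": {"description": "Measurement stored"}},
            }
        }
    },
}


def required_fields(spec: dict, path: str, method: str) -> list:
    """Read the required request fields for one operation from the contract."""
    body = spec["paths"][path][method]["requestBody"]
    return body["content"]["application/json"]["schema"]["required"]


print(required_fields(contract, "/measurements", "post"))
```
]]></preformat>
        <p>With the contract fixed first, both the measuring complex firmware and the server code can be developed against the same description.</p>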
        <p>As noted earlier, a separate microservice was developed to work with measuring complexes;
it has the following functionality: receiving data in JSON format or in the form of a CSV file,
validating and writing to a database.</p>
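        <p>Accepting the same measurements either as JSON or as a CSV file can be sketched as a single parsing function that returns a common list of records. The content-type strings, the column names and the function name are assumptions for illustration.</p>
        <preformat><![CDATA[
```python
import csv
import io
import json
from typing import Dict, List


def parse_payload(body: str, content_type: str) -> List[Dict]:
    """Decode a JSON document or a CSV file with a header row into a list of records."""
    if content_type == "application/json":
        data = json.loads(body)
        return data if isinstance(data, list) else [data]
    if content_type == "text/csv":
        rows = csv.DictReader(io.StringIO(body))
        # CSV cells are strings, so the numeric column is converted explicitly.
        return [dict(row, value=float(row["value"])) for row in rows]
    raise ValueError(f"unsupported content type: {content_type}")


json_body = '[{"station": "shore", "parameter": "soil_temperature", "value": 9.5}]'
csv_body = "station,parameter,value\nshore,soil_temperature,9.5\n"

from_json = parse_payload(json_body, "application/json")
from_csv = parse_payload(csv_body, "text/csv")
print(from_json == from_csv)
```
]]></preformat>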
      </sec>
      <sec id="sec-2-4">
        <title>2.4. GIS work with satellite data</title>
        <p>
          Determination of the parameters of water bodies based on satellite imagery data is of particular
interest, since satellite data simultaneously cover a vast territory and reflect the current forms
and areas of water bodies. Due to this, satellite imagery materials are becoming more and more
popular [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>Earth remote sensing data and geoinformation technologies make it possible to solve many important
tasks, including:
1. Inventory of reservoirs and other water bodies.
2. Regular monitoring of the condition of dams and other water protection and hydraulic
structures. Assessment of the ecological state of water bodies, including the identification
of areas of water bodies contaminated as a result of emergency discharges and spills of
hazardous substances, identification of sources of pollution. Study of channel processes
and mapping of the bottom microrelief in shallow water.
3. Forecasting and operational monitoring of floods, modeling the processes of inundation
of the territory as a result of floods.
4. Determination of biological productivity of reservoirs, identification of aquatic biological
resources, solution of fish farming problems.
5. Determination of the surface area of water bodies.</p>
        <p>However, when working with satellite data, a number of
features must be taken into account:
1. Inland water bodies, as a rule, have a relatively small area; therefore, medium- and high-resolution
data are needed to determine their spatial characteristics effectively.
2. Satellites with measuring equipment operating in the optical range depend on weather
conditions and time of day.
3. Inland water bodies are much less studied using Earth remote sensing systems than
relatively large seas and oceans.
4. The efficiency of satellite data processing largely depends on the choice of optimal
processing algorithms, technological solutions, well-developed techniques and information
support.</p>
        <p>Based on the above features, the work uses data from the Sentinel-2 and Landsat-8 satellites,
which are available in the open satellite data archives of ESA (European Space Agency)
and USGS (United States Geological Survey). To obtain satellite data
automatically, a dedicated software module was developed. It connects over the network
to the interfaces of the satellite data archive servers. During a connection session, the module
sends a search request consisting of the coordinates
of the required area, the date of the survey and the type of satellite. In response,
the server returns a list of available data that satisfy the request. The data are then
downloaded as an archive to the GIS server and processed according to the scheme in
Figure 1.</p>
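        <p>The search step can be sketched without contacting any real archive. In the example below a local list of product records stands in for the server, and the request carries the three elements named above: area, date and satellite. The class names, the use of a date range instead of a single date, the scene-center test for the area and the sample products are all assumptions for illustration; the real ESA and USGS interfaces are not reproduced here.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class SearchRequest:
    bbox: Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
    start: date
    end: date
    satellite: str


@dataclass
class Product:
    name: str
    satellite: str
    acquired: date
    center: Tuple[float, float]  # (lon, lat) of the scene center


def search(catalog: List[Product], request: SearchRequest) -> List[Product]:
    """Return the products that match satellite, date range and area of interest."""
    min_lon, min_lat, max_lon, max_lat = request.bbox
    return [
        p for p in catalog
        if p.satellite == request.satellite
        and request.start <= p.acquired <= request.end
        and min_lon <= p.center[0] <= max_lon
        and min_lat <= p.center[1] <= max_lat
    ]


# Hypothetical catalog standing in for the archive server's response data.
catalog = [
    Product("S2_scene_1", "Sentinel-2", date(2021, 6, 5), (85.01, 53.00)),
    Product("S2_scene_2", "Sentinel-2", date(2021, 7, 20), (85.01, 53.00)),
    Product("L8_scene_1", "Landsat-8", date(2021, 6, 6), (85.01, 53.00)),
    Product("S2_scene_3", "Sentinel-2", date(2021, 6, 7), (90.00, 60.00)),
]
request = SearchRequest((84.9, 52.9, 85.1, 53.1), date(2021, 6, 1), date(2021, 6, 30), "Sentinel-2")

found = [p.name for p in search(catalog, request)]
print(found)
```
]]></preformat>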
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Summary and conclusions</title>
      <p>The presented geoinformation system supports the processing of diverse
information about the state of water bodies for solving fundamental and applied problems of
hydrology and hydrobiology. It can also be used in other tasks related to the collection
and processing of spatial data. When developing a GIS, it is necessary to take into account the
features of the data sources, their processing stages and their storage. Microservice architecture
makes it possible to organize a flexible system for collecting, processing and storing data on the
state of natural objects. The source code of the developed GIS is open and available at
https://github.com/alexdontsov/sibwater.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Wang</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Han</surname>
            <given-names>W.</given-names>
          </string-name>
          , Nian Z.
          <article-title>Design of satellite ground management system based on microservices</article-title>
          //
          <source>Proceedings of the 2020 3rd International Conference on Computer Science and Software Engineering (CSSE</source>
          <year>2020</year>
          ). N.Y.: Association for Computing Machinery,
          <year>2020</year>
          . P.
          <volume>119</volume>
          -
          <fpage>123</fpage>
          . DOI: 10.1145/3403746.3403915.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>[2] Microservice architecture</article-title>
          . Available at: https://microservices.io/patterns/microservices.html (
          <issue>accessed June 9</issue>
          ,
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Dontsov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutorikhin</surname>
            <given-names>I.A.</given-names>
          </string-name>
          <article-title>Specialized geoinformation system for automated monitoring of rivers and reservoirs</article-title>
          //
          <source>Computational Technologies</source>
          .
          <year>2017</year>
          . Vol.
          <volume>22</volume>
          . No. 5. P.
          <volume>39</volume>
          -
          <fpage>46</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Dontsov</surname>
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutorihin</surname>
            <given-names>I.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frolenkov</surname>
            <given-names>I.M.</given-names>
          </string-name>
          <article-title>Geographic information system for bloom monitoring inland water bodies</article-title>
          //
          <source>Limnology and Freshwater Biology</source>
          .
          <year>2020</year>
          . No. 4(SI:7VBC). P.
          <volume>914</volume>
          -
          <fpage>915</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Villamizar</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>Infrastructure cost comparison of running web applications in the cloud using AWS lambda and monolithic and microservice architectures</article-title>
          //
          <source>16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)</source>
          . IEEE,
          <year>2016</year>
          . P.
          <volume>179</volume>
          -
          <fpage>182</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Namiot</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sneps-Sneppe</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>On micro-services architecture</article-title>
          // International Journal of Open Information Technologies.
          <year>2014</year>
          . Vol.
          <volume>2</volume>
          . No. 9. P.
          <volume>24</volume>
          -
          <fpage>27</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Docker</surname>
          </string-name>
          . Available at: https://www.docker.com (
          <issue>accessed June 9</issue>
          ,
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Leaflet</surname>
          </string-name>
          . Available at: https://leafletjs.com (
          <issue>accessed June 9</issue>
          ,
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] MapServer. Available at: https://mapserver.org (
          <issue>accessed June 9</issue>
          ,
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Balalaie</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heydarnoori</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jamshidi</surname>
            <given-names>P</given-names>
          </string-name>
          . Microservices migration patterns //
          <source>Technical Report TR-SUTCEASE-2015-01</source>
          . Automated Software Engineering Group, Sharif University of Technology, Tehran, Iran,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Amaral</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polo</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrera</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinder</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>Performance evaluation of microservices architectures using containers</article-title>
          //
          <source>14th IEEE International Symposium on Network Computing and Applications (NCA)</source>
          . IEEE,
          <year>2015</year>
          . P.
          <volume>27</volume>
          -
          <fpage>34</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>PostGIS</surname>
          </string-name>
          . Available at: https://postgis.net (
          <issue>accessed June 9</issue>
          ,
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Dumitru</surname>
            <given-names>A.</given-names>
          </string-name>
          et al.
          <article-title>Approaches to monitoring and evaluation strategy development</article-title>
          //
          <source>Evaluating the Impact of Nature-Based Solutions. A Handbook for Practitioners</source>
          .
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Frazier</surname>
            <given-names>P.S.</given-names>
          </string-name>
          et al.
          <article-title>Water body detection and delineation with Landsat TM data</article-title>
          //
          <source>Photogrammetric Engineering and Remote Sensing</source>
          .
          <year>2000</year>
          . Vol.
          <volume>66</volume>
          . No. 12. P.
          <volume>1461</volume>
          -
          <fpage>1468</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>