=Paper= {{Paper |id=Vol-3006/33_short_paper |storemode=property |title=Development of a geographic information system for data collection and analysis based on microservice architecture |pdfUrl=https://ceur-ws.org/Vol-3006/33_short_paper.pdf |volume=Vol-3006 |authors=Alexander A. Dontsov,Igor A. Sutorikhin }} ==Development of a geographic information system for data collection and analysis based on microservice architecture== https://ceur-ws.org/Vol-3006/33_short_paper.pdf
Development of a geographic information system for
data collection and analysis based on microservice
architecture
Alexander A. Dontsov (1), Igor A. Sutorikhin (1, 2)
(1) Institute for Water and Environmental Problems SB RAS, Barnaul, Russia
(2) Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia


                                         Abstract
                                         The paper discusses the use of microservice architecture in the development of geographic information
                                         systems (GIS) for collecting, processing and analyzing data. As a rule, microservice architecture is used to
                                         build applications in information systems related to solving business problems, and is not widespread in
                                         the development of geographic information systems in the scientific field. However, its application is now
                                         becoming increasingly important. Decomposition of the software implementation and GIS infrastructure
                                         associated with computations and data processing into components in the form of microservices has
                                         a number of advantages, such as: increased fault tolerance, increased flexibility, reduced maintenance
                                         effort, simplified scaling, and others. We present the first results of applying the microservice approach
                                         to the development of a geoinformation system for collecting and processing hydrological and
                                         hydrobiological data on the state of water bodies, and describe its architecture, main components, and
                                         the features of its information infrastructure.

                                         Keywords
                                         GIS, microservices, satellite data, geoportal, cloud technologies, information systems, lake, reservoir,
                                         measuring complexes.




1. Introduction
Monitoring the parameters of various natural objects is an urgent task of nature management.
There is a growing need to provide data on the state of natural objects to a wide
range of organizations and individuals, from government agencies to public organizations.
To collect, process and provide such data, web GIS in the form of geoportals are now widely
used; they automate data processing and organize access to the results of calculations.
The use of microservice architecture in the development of such web GIS makes it possible
to reuse GIS components and to operate them independently [1]. In this approach, the
information system is implemented as a set of small services, each of which runs as a
separate process and communicates with the others through interaction mechanisms,
typically REST, gRPC, RabbitMQ or Apache Kafka [2].
   Our previous works [3, 4] present the development of a GIS for the collection and
processing of hydrological and hydrobiological data on the state of water bodies. A new
version is now being developed; its main difference is the decomposition of the
information system into microservices.

SDM-2021: All-Russian conference, August 24–27, 2021, Novosibirsk, Russia
alexdontsov@yandex.ru (A. A. Dontsov)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Alexander A. Dontsov et al. CEUR Workshop Proceedings 280–287

   Microservice architecture has several advantages:
   1. Services are simple and easy to maintain.
   2. Services are deployed independently of each other.
   3. Services scale independently of each other.
   4. There is an opportunity to conduct technological experiments and it is relatively easy to
      introduce new technologies.
   5. Relatively high fault tolerance compared to monolithic architecture.
  However, the microservice architecture also has disadvantages:
   1. Difficulty at the initial stage of development and infrastructure setup. Distributed
      systems are harder to develop, since each microservice must keep working when
      another component fails.
   2. Distributed systems incur additional costs for exchanging data between
      microservices: the communication protocols between components must be chosen
      so that the interaction is as efficient as possible [5, 6].
  Monitoring of water bodies, such as lakes and reservoirs, is a topical area of environmental
management. Inland water bodies play a very important role in natural and anthropogenic
processes.
  When developing a GIS designed for collecting and processing data on the state of water
bodies, it is necessary to take into account the fact that complete and comprehensive information
can be obtained by integrating different measurement methods, such as:
   1. Satellite monitoring.
   2. Data of ground measuring complexes.
   3. Data from expeditionary research and field observations.
These are three unrelated sources of information that require different approaches, algorithms
and technological solutions to work with them. In this case, it is relevant to use a microservice
architecture, which allows the implementation of system components in the form of independent
software modules.


2. Geographic information system implementation
2.1. Description of GIS infrastructure
The developed GIS can be divided into components for data collection, computational
processing and analysis modules, and data storage and delivery systems. Docker is used
to deploy and manage the application infrastructure [7]. This technology implies
application containerization, that is, applications (GIS components) run in independent software
containers. Their interaction at the infrastructure level is defined using Docker Compose,
a tool included with Docker and intended for tasks related to project deployment.
Docker Compose uses YAML files to store the configuration of a group of containers. Containerization is
a lightweight type of virtualization and resource isolation at the OS kernel level. It makes it
possible to run applications with the minimum required libraries in a standardized environment.
Because each container is an isolated environment, it can be viewed as a small
service under the control of a programmer. Any container can be customized and updated
without affecting other containers, while preserving isolation and standardization. Since
containers are configured through text files, these files can be versioned in Git. This approach
is generally called Infrastructure as Code (IaC) [7].
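
A Compose file for a container group of this kind might look like the following minimal sketch. The service names, build paths and the password placeholder are hypothetical and are not taken from the project repository; only the general layout (one container per GIS component, a shared database and a message broker) follows the text.

```yaml
# Hypothetical docker-compose.yml sketch; names and paths are illustrative only.
services:
  db:
    image: postgis/postgis            # PostgreSQL with the PostGIS extension
    environment:
      POSTGRES_PASSWORD: example      # placeholder, not a real credential
    volumes:
      - dbdata:/var/lib/postgresql/data
  broker:
    image: rabbitmq:3                 # message broker for the task queue
  satellite-loader:
    build: ./services/satellite_loader      # assumed path
    depends_on: [db, broker]
  thematic-processing:
    build: ./services/thematic_processing   # assumed path
    depends_on: [db, broker]
volumes:
  dbdata:
```

Because such a file is plain text, a change to one service (for example, a new build path) is an ordinary commit that can be reviewed and rolled back.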

2.2. General description of GIS
Figure 1 shows the technological flows of data transfer from sources to the user. Data is received
through software interfaces and then verified. Verification is technologically most difficult for satellite data,
since it is necessary to make sure that the archives have been completely downloaded to the
GIS server; the archives then need to be unpacked and saved. The results of measurements
by automated complexes and of field observations arrive as files, which are checked for integrity and format.
Satellite data then undergo atmospheric correction and thematic processing. Thematic processing means the use
of a set of algorithms chosen for the task at hand, for example, delineating the water surface.
The water surface is delineated using spectral water indices, which enhance
the contrast between water and other objects.
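
The paper does not name the indices it uses. As an illustration only, the sketch below assumes the widely used Normalized Difference Water Index, NDWI = (green - NIR) / (green + NIR), and a threshold of 0; the function names and the toy reflectance values are made up for the example.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel: (green - nir) / (green + nir)."""
    total = green + nir
    if total == 0:
        return 0.0  # no signal in either band; treat the pixel as "not water"
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Return a 2-D list of booleans, True where NDWI exceeds the threshold."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(green_row, nir_row)]
        for green_row, nir_row in zip(green_band, nir_band)
    ]


# Toy 2x2 reflectance grids (made-up values, not real satellite data).
green = [[0.10, 0.08], [0.06, 0.09]]
nir = [[0.02, 0.30], [0.01, 0.25]]
mask = water_mask(green, nir)
```

Water reflects little in the near infrared, so the index is positive over water and negative over vegetation and soil; in practice the threshold would be tuned for the sensor and the scene.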
   The results obtained in the form of vector polygons in GeoJSON or Shapefile format are
written to the database. After saving the results of calculations and importing data, they are
available to users in the form of maps, files, graphs and tables. The JavaScript library Leaflet [8]
is used to generate maps; map generation on the server side is implemented using the MapServer
software utility [9].
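
A result of this kind can be represented as a GeoJSON Feature. The sketch below uses only the Python standard library; the helper name, the coordinates and the property names are hypothetical and serve only to show the structure of a polygon record before it is written to the database.

```python
import json


def polygon_feature(ring, properties):
    """Build a GeoJSON Feature with a single-ring Polygon geometry.

    `ring` is a sequence of (longitude, latitude) pairs. The ring is closed
    automatically, because GeoJSON requires the first and last positions to match.
    """
    coords = [list(point) for point in ring]
    if len(coords) < 3:
        raise ValueError("a polygon ring needs at least three points")
    if coords[0] != coords[-1]:
        coords.append(list(coords[0]))
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        "properties": dict(properties),
    }


# Made-up outline and attributes, for illustration only.
feature = polygon_feature(
    [(84.00, 52.90), (84.02, 52.90), (84.02, 52.91), (84.00, 52.91)],
    {"source": "Sentinel-2", "date": "2021-06-09"},
)
geojson_text = json.dumps(feature)
```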
   As the diagram shows, work with each of the data sources can be organized as an
independent process. When developing the GIS architecture, this technological scheme was used
to divide the information system into microservices. Decomposing a system into microservices
is a complex task, and there are several techniques for separating services from a monolithic
system. In industrial information systems there is still no single approach to decomposition,
and in each case a technique must be chosen based on the peculiarities
of the subject area, the connectivity of subsystems and other parameters [10, 11]. Here,
the sources of information and the technological stages of data processing were identified first,
and the list of microservices that the GIS should contain was derived from them.

Figure 1: Technological flows of data transfer in GIS.

   The GIS consists of the following main components (microservices):
   1. Service for downloading satellite data from the open archives of ESA (European Space Agency)
      and USGS (United States Geological Survey).
   2. Service for working with data from ground measuring complexes.
   3. Service for working with data from expeditionary operations.
   4. Service for atmospheric correction of satellite data.
   5. Service for thematic processing of satellite data.
   6. Service for converting files (from raster to vector formats).
   7. Data cataloging service.
   All services use a common database, which stores, along with the processed information, various
system settings, for example, the schedule of computing tasks and information about
system users and their rights. PostgreSQL with the PostGIS extension for storing spatial objects is
used as the DBMS [12].
   In addition to the services, the GIS contains interfaces for interaction with other software systems
(Figure 2), for example, the desktop packages QGIS and GRASS, as well as a web interface and an
administration panel.
Figure 2: Block diagram of GIS.

Figure 3: API Gateway pattern.

   An important part of the GIS is the management of computing processes, which is carried
out through the administrator panel (web interface). The data
received from the user is processed and passed to the task manager, which forms tasks. The tasks
are then written to a queue (RabbitMQ is used for this), and the computational
process is started. When the work is finished, the results are written to the database.
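
The flow from task manager to queue to computation to database can be sketched as follows. This is not the authors' implementation: to keep the example runnable without a broker, the standard-library `queue.Queue` stands in for RabbitMQ and a dictionary stands in for the database, and the task fields and handler names are hypothetical.

```python
import queue
import threading


def run_tasks(tasks, handlers):
    """Put tasks on a queue, process them in a worker thread, return the results."""
    task_queue = queue.Queue()  # stand-in for the RabbitMQ queue
    results = {}                # stand-in for the GIS database

    def worker():
        while True:
            task = task_queue.get()
            if task is None:  # sentinel: no more tasks
                task_queue.task_done()
                break
            handler = handlers[task["type"]]
            results[task["id"]] = handler(task["payload"])
            task_queue.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for task in tasks:  # the "task manager" publishes the tasks
        task_queue.put(task)
    task_queue.put(None)
    task_queue.join()
    thread.join()
    return results


# Hypothetical computing steps; real handlers would call the processing services.
handlers = {
    "atmospheric_correction": lambda scene: f"corrected:{scene}",
    "thematic_processing": lambda scene: f"water_mask:{scene}",
}
tasks = [
    {"id": 1, "type": "atmospheric_correction", "payload": "scene_A"},
    {"id": 2, "type": "thematic_processing", "payload": "scene_A"},
]
results = run_tasks(tasks, handlers)
```

The point of the queue is decoupling: the administrator panel only publishes tasks and does not wait for the computation, while workers can be added or restarted independently.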
   When developing the system architecture, the API Gateway pattern was used. This pattern
is based on the use of a gateway that sits between the client application and microservices,
providing a single entry point for the client. The use of this pattern reduces the number of
calls, ensures client independence from the protocols used in services: REST, AMQP, gRPC, and
provides centralized management of end-to-end functionality. Figure 3 shows the structure of
the API Gateway pattern.
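
The routing part of a gateway can be illustrated with a small prefix table. The sketch below is an assumption-laden illustration rather than the project's gateway: the URL prefixes and service names are hypothetical, and a real gateway would also forward the request and handle authentication, logging and protocol translation.

```python
# Hypothetical routing table: public URL prefix -> internal service name.
ROUTES = {
    "/api/satellite": "satellite-loader",
    "/api/stations": "ground-stations",
    "/api/expeditions": "expeditions",
    "/api/catalog": "catalog",
}


def resolve(path, routes=ROUTES):
    """Return (service, remaining_path) for the longest matching prefix, or None."""
    matches = [p for p in routes if path == p or path.startswith(p + "/")]
    if not matches:
        return None
    prefix = max(matches, key=len)
    return routes[prefix], path[len(prefix):] or "/"
```

For example, `resolve("/api/stations/42/measurements")` yields the service name `ground-stations` and the internal path `/42/measurements`, so the client only ever needs to know the gateway address.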

2.3. Integration of GIS with ground measuring complexes
The use of local automated systems for monitoring the parameters of natural objects is a
promising area of research. Such measurements are carried out in order to monitor various
natural processes, as a rule, with the further transfer of the measurement results to a wide range
of interested parties.
   Monitoring of inland water bodies is part of the monitoring of the natural environment as a
whole. According to modern international approaches, monitoring of any component of the
natural environment (including water bodies) should include a set of standardized observations
and methods of processing, analyzing and transmitting the results of these observations to
consumers [13].
   The work uses the data of the measuring complex, which is designed to carry out systematic
complex measurements of the parameters of water bodies. The measuring complex is located in
the Altai Territory on Lake Krasilovskoye, on the shore of which the educational and scientific
station of the Altai State University operates. Operating autonomously, the measuring complex
records every 15 minutes 4 meteorological parameters of
the atmosphere at heights of 2 and 4 meters, incident and reflected solar radiation, the levels of lake
and ground waters, water temperature from the surface to the bottom (depth 7 meters), and
soil temperature from the surface to a depth of 3 meters.
   The measuring complex consists of three autonomous units prepared for installation
on a raft in the water area of the lake, on the bottom near the water's edge, and permanently on
the shore. APIK is equipped with a GSM modem for data transmission and a logging module
(storage of measured data for subsequent download). To integrate the GIS with
the measuring complex, a RESTful API was developed on the basis of the Django REST
framework (DRF). Data is sent to the API in JSON format and, after validation using
the Django form functionality, is written to the GIS database. The results of expeditionary work
can also be added to the GIS via the API or through the web interface, which has a form for
adding and importing data.
   In developing the RESTful GIS API and the data transfer format, the recommendations for
building REST APIs developed by the OpenAPI Initiative were used. The OpenAPI
specification comes with a set of guidelines for developing REST APIs. It provides a number of
interoperability benefits, but requires additional design effort to comply with the
specification. OpenAPI recommends starting with a contract, not an implementation: when
developing an API, the contract (interface) is designed first, and only
then is the program code that implements it written.
   As noted earlier, a separate microservice was developed to work with measuring complexes;
it has the following functionality: receiving data in JSON format or in the form of a CSV file,
validating and writing to a database.
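
The receive-and-validate step can be sketched with the standard library alone. The actual service relies on Django forms and DRF; the version below is a simplified stand-in, and the field names (`station`, `timestamp`, `parameter`, `value`) are an assumed schema, not the one used in the project.

```python
import csv
import io
import json
from datetime import datetime

REQUIRED_FIELDS = ("station", "timestamp", "parameter", "value")  # assumed schema


def validate_record(record):
    """Return a cleaned copy of one measurement, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {
        "station": str(record["station"]),
        "timestamp": datetime.fromisoformat(str(record["timestamp"])),
        "parameter": str(record["parameter"]),
        "value": float(record["value"]),
    }


def parse_json(text):
    """Parse a JSON array of measurement objects."""
    return [validate_record(item) for item in json.loads(text)]


def parse_csv(text):
    """Parse CSV text whose header row names the measurement fields."""
    return [validate_record(row) for row in csv.DictReader(io.StringIO(text))]


# Made-up payloads in the two accepted forms.
json_payload = (
    '[{"station": "raft", "timestamp": "2021-06-09T12:15:00", '
    '"parameter": "water_temp", "value": 18.4}]'
)
csv_payload = "station,timestamp,parameter,value\nshore,2021-06-09T12:15:00,air_temp,21.0\n"
records = parse_json(json_payload) + parse_csv(csv_payload)
```

Both input forms pass through the same `validate_record` function, so a record is normalized identically whether it arrived as JSON or as a CSV file; only validated records would be written to the database.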

2.4. GIS work with satellite data
Determination of the parameters of water bodies based on satellite imagery data is of particular
interest, since satellite data simultaneously cover a vast territory and reflect the current forms
and areas of water bodies. Due to this, satellite imagery materials are becoming more and more
popular [14].
   Earth remote sensing data and geoinformation technologies make it possible to solve many important
tasks, including:
   1. Inventory of reservoirs and other water bodies.
   2. Regular monitoring of the condition of dams and other water protection and hydraulic
      structures. Assessment of the ecological state of water bodies, including the identification
      of areas of water bodies contaminated as a result of emergency discharges and spills of
      hazardous substances, identification of sources of pollution. Study of channel processes
      and mapping of the bottom microrelief in shallow water.
   3. Forecasting and operational monitoring of floods, modeling the processes of inundation
      of the territory as a result of floods.
   4. Determination of biological productivity of reservoirs, identification of aquatic biological
      resources, solution of fish farming problems.
   5. Determination of the surface area of water bodies.
   However, when working with satellite data, it is necessary to take into account a number of
features, which are presented below:
   1. Inland water bodies, as a rule, have a relatively small area, therefore, medium and high
      resolution data are suitable for effectively determining their spatial characteristics.
   2. Dependence on weather conditions and time of day for satellite vehicles with measuring
      equipment operating in the optical range.
   3. Inland water bodies are much less studied using Earth remote sensing systems than
      relatively large seas and oceans.





   4. The efficiency of satellite data processing largely depends on the choice of optimal
      processing algorithms, technological solutions, well-developed techniques, information
      support.
   Based on the above features, the work uses data from the Sentinel-2 and Landsat-8 spacecraft,
which are available in the open satellite data archives of ESA (European Space Agency)
and USGS (United States Geological Survey). A dedicated software module was developed to obtain
satellite data automatically. It connects over the network to the interfaces of the satellite
data archive servers and sends a search request consisting of the coordinates
of the required area, the survey date and the satellite type. In response,
the server returns a list of available data that satisfy the request. The data
are then downloaded as an archive to the GIS server, and the files are processed according to
the scheme in Figure 1.
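
Two steps of this procedure, assembling the search request and checking that a downloaded archive is complete, can be sketched as follows. The parameter names in the request are hypothetical and do not reproduce the ESA or USGS interfaces, and the check assumes that the archive is a ZIP file.

```python
import io
import zipfile
from datetime import date


def build_search_request(bbox, start, end, satellite):
    """Assemble search parameters; the key names are hypothetical."""
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "bbox": f"{west},{south},{east},{north}",
        "start": start.isoformat(),
        "end": end.isoformat(),
        "satellite": satellite,
    }


def archive_is_complete(data):
    """Return True if `data` (bytes) is a ZIP archive whose members pass the CRC check."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False  # truncated downloads usually lack the end-of-archive record
    try:
        with zipfile.ZipFile(buffer) as archive:
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


# Made-up search area and dates.
request = build_search_request(
    (83.9, 52.8, 84.1, 53.0), date(2021, 6, 1), date(2021, 6, 9), "Sentinel-2"
)

# A tiny in-memory archive stands in for a downloaded product.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("band.txt", "placeholder")
complete = demo.getvalue()
```

Calling `archive_is_complete` on only the first half of `complete` returns `False`, which is the situation the verification step is meant to catch before unpacking.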


3. Summary and conclusions
The presented geoinformation system makes it possible to process diverse
information about the state of water bodies for solving fundamental and applied problems of
hydrology and hydrobiology. It can also be used in other tasks related to the collection
and processing of spatial data. When developing a GIS, it is necessary to take into account the
features of the data sources, their processing stages and storage. Microservice architecture
makes it possible to organize a flexible system for collecting, processing and storing data on the
state of natural objects. The source code of the developed GIS is open and available at
https://github.com/alexdontsov/sibwater.


References
 [1] Wang Y., Han W., Nian Z. Design of satellite ground management system based on mi-
     croservices // Proceedings of the 2020 3rd International Conference on Computer Science
     and Software Engineering (CSSE 2020). N.Y.: Association for Computing Machinery, 2020.
     P. 119–123. DOI:10.1145/3403746.3403915.
 [2] Microservice architecture. Available at: https://microservices.io/patterns/microservices.
     html (accessed June 9, 2021).
 [3] Dontsov A.A., Sutorikhin I.A. Specialized geoinformation system for automated monitoring
     of rivers and reservoirs // Computational Technologies. 2017. Vol. 22. No. 5. P. 39–46.
 [4] Dontsov A.A., Sutorihin I.A., Frolenkov I.M. Geographic information system for bloom
     monitoring inland water bodies // Limnology and Freshwater Biology. 2020. No. 4(SI:7VBC).
     P. 914–915.
 [5] Villamizar M. Infrastructure cost comparison of running web applications in the cloud
     using AWS lambda and monolithic and microservice architectures // 16th IEEE/ACM
     International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, 2016.
     P. 179–182.






 [6] Namiot D., Sneps-Sneppe M. On micro-services architecture // International Journal of
     Open Information Technologies. 2014. Vol. 2. No. 9. P. 24–27.
 [7] Docker. Available at: https://www.docker.com (accessed June 9, 2021).
 [8] Leaflet. Available at: https://leafletjs.com (accessed June 9, 2021).
 [9] MapServer. Available at: https://mapserver.org (accessed June 9, 2021).
[10] Balalaie A., Heydarnoori A., Jamshidi P. Microservices migration patterns // Technical
     Report TR-SUTCEASE-2015-01. Automated Software Engineering Group, Sharif University
     of Technology, Tehran, Iran, 2015.
[11] Amaral M., Polo J., Carrera D., Steinder M. Performance evaluation of microservices archi-
     tectures using containers // 14th IEEE International Symposium on Network Computing
     and Applications (NCA). IEEE, 2015. P. 27–34.
[12] PostGIS. Available at: https://postgis.net (accessed June 9, 2021).
[13] Dumitru A. et al. Approaches to monitoring and evaluation strategy development //
     Evaluating the Impact of Nature-Based Solutions. A Handbook for Practitioners. 2021.
[14] Frazier P.S. et al. Water body detection and delineation with Landsat TM data // Pho-
     togrammetric Engineering and Remote Sensing. 2000. Vol. 66. No. 12. P. 1461–1468.



