=Paper=
{{Paper
|id=Vol-2790/paper16
|storemode=property
|title=
Big Data Environmental Monitoring System in Recreational Areas
|pdfUrl=https://ceur-ws.org/Vol-2790/paper16.pdf
|volume=Vol-2790
|authors=Alexander Volkov,Andrey Kopyrin,Natalya Kondratyeva,Sagit Valeev
|dblpUrl=https://dblp.org/rec/conf/rcdl/VolkovKKV20
}}
==
Big Data Environmental Monitoring System in Recreational Areas
==
Alexander Volkov, Andrey Kopyrin, Natalya Kondratyeva and Sagit Valeev

Sochi State University, Sochi, 354008, Russia

vss2000@mail.ru

Abstract. The paper discusses the architecture of an environmental monitoring system for a recreational area. The system is assumed to use big data technologies to process heterogeneous information. Information in various formats is obtained from remote sensing of the Earth, aerial photography, imaging from unmanned aerial vehicles, and internal monitoring of the state of objects in the recreational zone. Experts determine the polluted area, the volume of emissions and, consequently, the resulting concentrations from the available data. The paper shows why the dynamics of the ecological state must be considered, using data generated by weather services and remote sensing data in various spectral ranges. The monitoring system also uses statistical data and the results of modeling the ecological state of the recreational zone. Based on big data analytics, the system is intended to forecast thermal emissions and the amount of harmful substances carried in the atmosphere by air masses.

Keywords: Environmental Monitoring, Big Data, Multi-level Information Collection System, Databases, Data Aggregation.

1 Introduction

The main purpose of a recreational zone is to restore the health of the population and to preserve the labor capital of the state. Assessing the pollution of a recreational area, with the scale of the pollution taken into account, makes it possible to control the ecological situation in recreational areas within the region. The assessment of the state of complex systems relies on a hierarchical collection of models with different levels of detail [1].
A monitoring system for the ecological state of a recreational zone uses models with different levels of detail, built from the analysis of available information.

Thermal emissions and emissions of harmful substances carried by air masses significantly affect the ecological state of coastal recreational zones. They affect both the level of recreational services provided and the ecological state of the resort area. Predicting the ecological state and reducing harmful emissions of various kinds is an urgent scientific and practical task [2, 3].

Determining the area of the zones with an increased concentration of harmful substances makes it possible to assess the level of pollution qualitatively, based on expert estimates and big data analytics. The area of zones with an elevated level of harmful substances differs significantly over time. This has several causes, among them the climatic conditions of the region and the trajectories of the air masses that transport harmful substances over long distances [4-7]. It is therefore necessary to take into account the dynamics of the ecological state, using data generated by weather services and remote sensing data in various spectral ranges.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

To assess the permissible concentration of harmful emissions, stationary air quality measuring systems are used together with monitoring based on aerial photography and satellite images. Experts determine the polluted area, the volumetric flow of emissions and, consequently, the resulting concentrations from the available data.
Taken together, thermal emissions into the environment and emissions of harmful substances from megacities significantly worsen the ecological state of a resort area, even one located far from the emission sources. To analyze the situation and produce an effective forecast, the paper considers the problem of integrating systems that collect and process large amounts of data.

Fig. 1 shows a map of harmful emissions from megacities reaching the resort areas of Spain in winter, with zones of increased concentration of harmful substances, Z1 and Z2, marked on the coast. Solid arrows show the movement vectors of air masses in this period. Determining the area of these zones allows a qualitative assessment of the level of pollution based on expert estimates.

Fig. 1. Map of the distribution of harmful emissions in recreational areas of the coast of Spain in winter.

As follows from Fig. 2, the area of zones with an increased level of harmful substances is much larger in the high season than in the low season. This has several causes, among them the climatic conditions of the region and the increase in demand for services in certain periods of the year. It is therefore necessary to take into account the dynamics of the ecological state and the possible emissions of harmful substances associated with heating and cooling the air in buildings, considering the zonal climatic conditions of the recreation area.

Energy efficiency can be improved, and harmful emissions from heating and cooling thereby reduced, with modern technologies and methods. Among them are the reconstruction of the existing building stock and of the district heating network, and the introduction of new-generation cooling systems [8-13].

Fig. 2. Map of the distribution of harmful emissions in recreational areas of the coast of Spain in the summer.
A characteristic feature of seaside recreational zones is that the recreational infrastructure includes many small private hotels with limited opportunities to apply innovative energy-saving technologies.

Another feature of recreational areas is their transport infrastructure, which is overloaded during peak seasons; this leads to traffic jams and increased emissions of harmful substances. The concentration of emissions depends significantly on the wind rose, the time of day and the time of year. In the summer season, additional active measures are therefore required to control harmful emissions from vehicles.

Thus, creating an information system that monitors the level of pollution of a recreational zone using multilevel data analysis is an urgent task. A feature of the proposed solution is that data from all available information sources are collected, stored and integrated for a step-by-step analysis of pollution sources. The solution also accounts for the dynamics of the pollution level, including the movement of air masses that transport harmful substances from remote industrial regions [14-17].

The design principles of an information system for environmental monitoring of a recreational zone were discussed in [18]. That system uses data from remote sensing of the Earth, aerial photography, surveys from unmanned aerial vehicles, internal monitoring of the state of objects in the recreational zone, and statistical data.

Below, this idea is developed into a three-level hierarchical information system for collecting heterogeneous information about the ecological state of the recreational zone, together with algorithms for filtering noise using neural network technologies.
2 The Task of Analyzing the Amount of Harmful Emissions

The harm done to a recreational area can be determined by analyzing the dynamics of air pollution in it. The analysis takes into account seasonal temperature fluctuations, the amount of harmful substances emitted by vehicles, and the seasonally varying direction of the air masses that carry harmful substances from large cities and industrial enterprises. Data are assumed to be collected both at the regional level and at the level of individual buildings. This makes it possible to determine the level of pollution and its main causes at various levels of the hierarchy, which together provides a way to control the situation [19-20].

Processing infrared aerial photographs makes it possible to localize sources of heat loss, including those in heating networks. Small aircraft make it possible to collect statistical data that account for the time of day, with higher accuracy and at lower cost.

The area of pollution can be found as the sum of the areas of the individual pollution sources. Numerical integration is not required in this case: the search reduces to analyzing the images and determining the total contaminated area directly from them.

The total level of pollution, taking into account emissions of heat and harmful substances, can be estimated using the following expressions:

$$
\begin{aligned}
S_H &= \iint dx\,dy = S_{H1} + \dots + S_{Hm}, \quad S_H \le S_{HS}, \quad m \in H,\\
S_C &= \iint dx\,dy = S_{C1} + \dots + S_{Cn}, \quad S_C \le S_{CS}, \quad n \in C,\\
S_O &= \iint dx\,dy = S_{O1} + \dots + S_{Ok}, \quad S_O \le S_{OS}, \quad k \in O,\\
R &= k S_H + l S_C + m S_O,
\end{aligned}
\tag{1}
$$

where $R$ is the overall level of pollution; $S_H$ is the emissions of harmful substances associated with heating; $S_C$ is the emissions of harmful substances associated with cooling; $S_O$ is the emissions associated with traffic flows; $k$, $l$, $m$ in the expression for $R$ are weighting factors; the indices denote H for heating, C for cooling and O for other losses.

The threshold values of permissible contamination in (1), $S_{HS}$, $S_{CS}$ and $S_{OS}$, are determined from the regulatory documents of various departments and organizations. The choice of the weighting factors $k$, $l$, $m$ depends on many factors determined by expert assessments, various standards, etc.

The volume of emissions is characterized by the area and the concentration of harmful substances. Data collected by stationary measurement systems are assumed to yield an average value that accounts for the season.

Fig. 3 presents a map of emissions of harmful substances in the high season. To assess the permissible concentration of harmful emissions, stationary air quality measuring systems are used.

Fig. 3. Emission distribution map.

Fig. 4. High season emissions.

Experts determine the pollution area Z, the volumetric flow of emissions and, therefore, the resulting concentrations from the available data (Fig. 4). Taken together, thermal emissions into the environment and emissions of harmful substances from transport systems significantly worsen the ecological state of the resort area. To analyze the situation and produce an effective forecast, systems for collecting and processing large amounts of data must be actively used.

To solve this problem, it is proposed to use a multi-level system for collecting information on the environmental characteristics of the recreational area [4, 5].
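The image-based area estimate and expression (1) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the binary masks, the ground area per pixel, the weights and the thresholds are made-up values, and the function names (`mask_area`, `total_area`, `overall_level`, `exceeded`) are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence


def mask_area(mask: Sequence[Sequence[int]], pixel_area: float) -> float:
    """Area of one pollution zone: flagged pixels times ground area per pixel.

    No numerical integration is used; the area is read off the image itself.
    """
    flagged = sum(1 for row in mask for value in row if value)
    return flagged * pixel_area


def total_area(zone_areas: Sequence[float]) -> float:
    """S = S_1 + ... + S_n, the sum over individual pollution sources."""
    return sum(zone_areas)


def overall_level(s_h: float, s_c: float, s_o: float,
                  k: float, l: float, m: float) -> float:
    """R = k*S_H + l*S_C + m*S_O with expert-chosen weights k, l, m."""
    return k * s_h + l * s_c + m * s_o


def exceeded(totals: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Names of the components whose total is above its permitted threshold."""
    return [name for name, value in totals.items() if value > thresholds[name]]


# Illustrative values only (assumptions, not data from the paper).
PIXEL_AREA = 100.0  # square metres per pixel, assumed
s_h = total_area([mask_area([[1, 1], [0, 1]], PIXEL_AREA),
                  mask_area([[1, 0]], PIXEL_AREA)])
s_c = total_area([150.0])
s_o = total_area([250.0, 50.0])

r = overall_level(s_h, s_c, s_o, k=0.5, l=0.25, m=0.25)
over = exceeded({"S_H": s_h, "S_C": s_c, "S_O": s_o},
                {"S_H": 350.0, "S_C": 200.0, "S_O": 300.0})
print(r, over)
```

With these assumed inputs only the heating component is above its threshold. In practice the weights and thresholds would come from expert assessments and regulatory documents, as the text states.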
The data collection and processing system integrates data arrays obtained from satellite images, from distributed monitoring of the thermal state of infrastructure facilities, and from measurements of the concentration of harmful substances in the resort area at different times of the day and year.

Taking the wind direction into account improves the quality of the forecast, because the forecast then reflects changes in the pollution level caused by air masses carrying harmful substances.

3 The Architecture of the Environmental Monitoring System of the Recreational Area

The data aggregation system includes subsystems that process data obtained from image analysis and from weather observations. The problem under consideration is thus primarily one of organizing the collection of information on the ecological state of resort facilities. It is complicated by the need to collect, process and store heterogeneous information, and by the need to forecast the ecological state of the region.

Big data technologies are a promising way to process large volumes of heterogeneous information. This is due to the successful practical development of new data transfer technologies, the implementation of data management systems based on NoSQL technologies, and the successful implementation of distributed data processing based on cloud computing.

Within big data technologies, algorithms that determine energy losses and pollution levels from the results of image analysis and from statistical results of environmental modeling are promising.
In this regard, a system for collecting data on energy consumption can make the planning of thermal energy generation for heating and cooling more efficient. It can also support optimal plans for loading the transport arteries of the region, taking into account the wind rose and climatic conditions.

When data are collected by aircraft flying at different heights, the level of detail of the images changes, so the received data must be synchronized in time and coordinates. The formats in which graphical information and data from various sources are presented can also differ significantly.

Fig. 5 presents the generalized architecture of a three-level data collection system. The system monitors the energy consumed for heating and cooling the facilities of a recreation area in various climatic conditions and collects data on air quality.

A feature of the organization of computation and data storage in this system is the need to process data, presented in various formats, on the thermal state of the resort zone, the thermal state of resort facilities, and the internal thermal state of objects. The difficulty lies in reliably localizing objects and linking all available information about losses to these objects.

At the city level L1 in Fig. 5 (where Oi, i = 1…m, are areas of different energy losses), zones with an unfavorable ecological condition can be determined. The analysis uses data on excess electricity consumption, heat losses in heating networks, and the current concentration of harmful substances, which is determined by the state of the city's transport routes.

Fig. 5. Generalized scheme of a three-level system for collecting data from monitoring results.

Data obtained from the analysis of power consumption, temperature measurements, wind directions and measurements of the concentration of harmful substances, referenced to a city map, are stored in the DB_L1 database.

To characterize the ecological state of the region, statistical information is collected from urban environmental services, together with information obtained from aerial photography and from images in various spectral ranges taken by meteorological satellites. The analysis results at this level make it possible to determine the level of pollution in various areas of the recreational zone (Rj, j = 1…k). Data at this level are aggregated in the DB_L2 database.

Pollution sources such as industrial enterprises and the transport systems of megalopolises currently affect the ecological state of recreational areas to a large extent. The cause is the transport of harmful substances by air masses over long distances (in Fig. 5, PC is a pollution cloud and Wd is a wind direction). Information on the direction of air masses is presented at the L3 level; data on the speed and direction of air masses are stored in the DB_L3 database. The data sources are the L1 and L2 weather stations. The results of satellite imagery analysis can also be used.

Data, reconciled for the heterogeneity of their formats and the time of their receipt, are placed in the DB_I database. These data are used to build the environmental forecasting system PS. When a forecasting request Q arrives, the forecasting system issues a forecast P using the integrated data on the current state of levels L1, L2 and L3.

Several states of the ecological system should be distinguished: normal, harmful to health, and critical.
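The flow from the level databases to DB_I and from a request Q to a reported state can be sketched in Python as follows. This is a minimal illustration, not the authors' forecasting system PS: the record fields, the `integrate`, `classify` and `answer_query` names, the threshold values, and the rule of reporting the state of the most recent reading are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    level: str        # source level: "L1", "L2" or "L3"
    zone: str         # zone or object the reading is linked to
    timestamp: float  # acquisition time in seconds (assumed common clock)
    quantity: str     # e.g. "concentration" or "wind_direction_deg"
    value: float


def integrate(*sources: Iterable[Record]) -> List[Record]:
    """Merge records from the level databases into one time-ordered list,
    standing in for the integrated database DB_I."""
    merged = [record for source in sources for record in source]
    return sorted(merged, key=lambda record: record.timestamp)


def classify(concentration: float, harmful_at: float, critical_at: float) -> str:
    """Map a concentration to one of the three states named in the text."""
    if concentration >= critical_at:
        return "critical"
    if concentration >= harmful_at:
        return "harmful to health"
    return "normal"


def answer_query(db_i: List[Record], zone: str,
                 harmful_at: float, critical_at: float) -> Optional[str]:
    """Placeholder for request Q: report the state of the latest reading.

    A real forecast would also use the wind data; that is omitted here.
    """
    readings = [r for r in db_i
                if r.zone == zone and r.quantity == "concentration"]
    if not readings:
        return None
    return classify(readings[-1].value, harmful_at, critical_at)


# Illustrative records only (assumed values).
db_l1 = [Record("L1", "Z1", 100.0, "concentration", 12.0),
         Record("L1", "Z1", 300.0, "concentration", 48.0)]
db_l2 = [Record("L2", "Z2", 200.0, "concentration", 95.0)]
db_l3 = [Record("L3", "Z1", 150.0, "wind_direction_deg", 270.0)]

db_i = integrate(db_l1, db_l2, db_l3)
state_z1 = answer_query(db_i, "Z1", harmful_at=40.0, critical_at=80.0)
state_z2 = answer_query(db_i, "Z2", harmful_at=40.0, critical_at=80.0)
print(state_z1, state_z2)
```

Keeping the source level on every record preserves the link between an integrated reading and the level it came from, which the text identifies as a core difficulty of integration.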
Depending on the results of the analysis, the consumers of this information may be residents of the recreational zone or the relevant state and municipal services.

As noted earlier, data integration means using all available data on the ecological state of the objects in the recreational zone and of the zone itself, and linking these data to the objects under consideration and to the zone. Images obtained with various data collection tools differ in resolution and level of detail. They were acquired at different times and can show different spectral components of the object state. Obtaining a general picture requires semantic processing and various image analysis algorithms. From the data obtained, it is proposed to construct a training set for neural networks, which are then used to produce a forecast.

Determining the area of the objects under study with software for analyzing satellite images involves a large share of manual processing with the graphic tools built into that software. When the level of pollution is determined from high-resolution images, the total area of the pollution zones has to be assessed. The quality of images depends on many factors: illumination, cloudiness, etc.

The data aggregation system includes subsystems for image processing, for determining the thermal state of objects, and for localizing objects. Aggregated information is placed in the DB_I database.

When a video series is processed, the problem of image recognition based on frame extraction arises. Fig. 6 shows an image analysis algorithm that determines the quality of satellite images depending on the cloud cover. The proposed algorithm consists of frame extraction, a pre-processing stage, a training stage and a testing stage. In Fig. 6, a TSN segment is a segment of the temporal segment network (TSN), a novel framework for video-based action recognition. It builds on the idea of long-range temporal structure modeling and includes sparse temporal sampling and video-level supervision. Data augmentation is a common technique for improving results and avoiding overfitting during model training. A convolutional neural network (CNN) is used to solve the image analysis problem [18, 19].

Fig. 6. Algorithm of image classification (Cloudy/No Cloudy).

Thus, the problem under consideration is primarily one of organizing the collection of information on the thermal state of resort facilities and on air quality. It is complicated by the need to store heterogeneous information, to localize resort facilities, and to determine heat losses in different climatic conditions, taking into account the specifics of energy use for various needs.

4 Conclusion

The paper considers the problem of organizing the collection of information on the ecological state of a resort area. The solution is complicated by the need to store heterogeneous information, to localize resort facilities, and to determine heat losses in various climatic conditions, taking into account the specifics of energy use. A further difficulty is analyzing the state of air quality as the quality of the environment changes over time.

The paper discusses a multilevel data collection system that uses data from remote sensing of the Earth, aerial photography and surveys from unmanned aerial vehicles, together with data from internal monitoring of both the thermal state of objects in the recreation area and air quality.
The proposed solutions may be useful for assessing the harm caused to a recreational area by industrial emissions and by insufficient attention to energy-saving technologies for heating and cooling. The results of the analysis can serve as a factual basis when solutions are sought at various levels of decision-making, including the regional level.

References

1. Thalheim, B.: Models for Communication, Understanding, Search, and Analysis. In: Proceedings of the XXI International Conference on Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2019), pp. 3-18 (2019)
2. Overall heat transfer loss from buildings - transmission, ventilation and infiltration, https://www.engineeringtoolbox.com/heat-loss-buildings-d_113.html
3. Theodore, L., Behan, K.: Introduction to Optimization for Chemical and Environmental Engineers, 1st ed. CRC Press (2018)
4. Top six places for energy losses in commercial buildings. https://www.ecdonline.com.au/content/test-measurement/article/top-six-places-for-energy-losses-in-commercial-buildings-259097222
5. Building Envelope: How to Avoid Energy Loss. https://www.facilitiesnet.com/energyefficiency/article/Building-Envelope-How-to-Avoid-Energy-Loss--9428
6. Aerial Infrared Inspection. http://nationwidedrones.co.uk/drone-infrared-inspection
7. Making Your Facilities A Safer Place. http://preciseir.com/
8. Leskovec, J., Rajaraman, A., Ullman, J.D.: Mining of massive datasets. Cambridge University Press (2014)
9. Bohlouli, M., Schulz, F., Angelis, L., Pahor, D., Brandic, I., Atlan, D., Tate, R.: Towards an integrated platform for Big Data analysis. In: Integration of Practice-Oriented Knowledge Technology: Trends and Prospectives. Springer, Berlin, pp. 47–56 (2013)
10. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. In: Proc. of the 9th Int. Conf. on Database Theory (ICDT 2003), pp. 207–224 (2003)
11. Jacobs, B.: Categorical Logic and Type Theory. In: Studies in Logic and the Foundation of Mathematics. 141. Elsevier, Amsterdam (1999)
12. Jones, M., Schildhauer, M., Reichman, O., Bowers, S.: The new bioinformatics: integrating ecological data from the gene to the biosphere. In: Annu. Rev. Ecol. Evol. Syst. 37(1), pp. 519–544 (2006)
13. Labrinidis, A., Jagadish, H.V.: Challenges and opportunities with Big Data. In: Proc. VLDB 5(12), pp. 2032–2033 (2012)
14. Lenzerini, M.: Data integration: a theoretical perspective. In: Proc. of the 21st ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2002), pp. 233–246 (2002)
15. Lenzerini, M., Majkić, Z.: First release of the system prototype for query management. Semantic webs and agents in integrated economies, D3.3, IST-2001-34825 (2003)
16. Lenzerini, M., Majkić, Z.: General framework for query reformulation. Semantic webs and agents in integrated economies, D3.1, IST-2001-34825, February (2003)
17. Sazontev, V.: Methods for Big Data Integration in Distributed Computation Environments. In: Proceedings of XX International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL’2018), pp. 239-244 (2018)
18. Volkov, A.N., Kopyrin, A.S., Kondratyeva, N.V., Valeev, S.S.: Ecological Monitoring Information System in Recreational Zones, Scientific Journal Engineering System and Constructions, 1(38), pp. 20-24 (2020) (in Russian)
19. Kondratyeva, N.V., Valeev, S.S.: Simulation of the life cycle of a complex technical object within the concept of Big Data. In: CEUR Proceedings of 3rd Russian Conference Mathematical Modeling and Information Technologies, pp. 216-223 (2016)
20. Volkov, A., Kopyrin, A., Kondratyeva, N., Valeev, S.: Multilevel Data Acquisition System of Energy Losses in Recreation Areas. In: 2019 Twelfth International Conference "Management of large-scale system development" (MLSD’2019), Moscow, Russia, pp. 1-4 (2019)