Applied Tasks in a Virtual Research Environment Based on the Web GIS Platform ‘Climate+’

© Fazliev A.Z. © Privezentsev A.I.
Institute of Atmospheric Optics SB RAS, Tomsk, Russia
faz@iao.ru remake@iao.ru

© Gordov E.P. © Okladnikov I.G. © Ryazanova A.A. © Titov A.G.
Institute of Monitoring of Climatic and Ecological Systems SB RAS, Tomsk, Russia
gordov@scert.ru okladnikov@scert.ru raa@scert.ru titov@scert.ru

© Bart A.A. © Starchenko A.V.
Tomsk State University, Tomsk, Russia
bart@math.tsu.ru starch@math.tsu.ru

Abstract. Two types of applied tasks used in the thematic virtual research environment (VRE) based on the “Climate+” platform are considered. Tasks of both types use significant amounts of climatic or meteorological data. Tasks of the first type, whose solutions quantitatively describe the climate of a chosen territory, are computed on-line and mapped using GIS technologies. Tasks of the second type are used for decision making. Along with the computational component, they include tools for expert selection of the initial conditions, tools for determining the semantic homogeneity of the physical quantities used in the calculations, and software for forming the A-box of the knowledge base of a decision support system (DSS). Several first-type tasks are presented, together with a second-type task on the change in the depth of the active soil layer in the northern regions of Western and Eastern Siberia (the interfluve of the Ob and Yenisei rivers) over a period of 60 years. The solution of this task and the structure of a typical ontology individual used for decision making are presented. The role of the ontology description of solutions of applied tasks in the VRE based on the “Climate+” platform is discussed.

Keywords: Virtual Research Environment, climate and meteorological applied tasks, platform “Climate+”.
Proceedings of the XX International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL’2018), Moscow, Russia, October 9-12, 2018

1 Introduction

The creation of virtual research environments (VREs) for dealing with large data arrays is becoming popular in data intensive domains [1,2]. Climate and meteorology is one such subject domain [3], within which a thematic VRE was created on the basis of the “Climate+” platform [4]. The platform uses a client-server technology, where the information resources of the server component are represented by three layers (data, metadata, and ontologies); the applications that use these resources are divided into three groups (computational software, geoportal, and ontology application). The client component comprises a GIS client and applications that use databases, which are currently under development.

The necessity of using ontologies in geophysical sciences and their role have been demonstrated in the papers [5-9].

In this paper we describe new steps in the development of the ‘Climate+’ platform, namely applications of different types and an ontology that refer to the three layers (Data, Metadata (Information) and Ontology (Knowledge) layers) of data processing services of the VRE under development. Then we describe the Ontology layer and the applied problem. Special attention is paid to semantic heterogeneity and to the formalization of thematic domains related to applied tasks, in particular, to the solution of a reduction problem.

There are two types of tasks to be solved using the thematic VRE. Tasks of the first type are simple calculations of a physical value characterizing climatic processes. Examples of such tasks solved within the previously developed “Climate” platform can be found in paper [10]. Some additional examples are described in the third section of this report. Tasks of the second type, which initiated the development of the “Climate+” platform, are applied tasks whose results can be used to make practical decisions in domains that crucially depend on climatic conditions. Such tasks require thematic numerical modeling of rather complicated processes occurring under the influence of climatic conditions. This modeling involves data obtained by solving first-type tasks. An example of a second-type task arises in planning the development of road infrastructure in Northern regions subject to climate change. One of the problems of global warming is an increase in the active soil layer depth due to the increase in temperature in the northern latitudes and the melting of ice in the soil. To take these processes into account, a thematic decision support system (DSS) is needed. The construction of the DSS is connected with the construction of a knowledge base on the possible evolution of the active layer depth and the restrictions imposed on the road infrastructure in permafrost areas. The A-box of this knowledge base should contain facts about the change in the active soil layer depth for tens of years ahead.

The report discusses the ways of forming the DSS knowledge base using OWL ontologies that describe solutions of applied tasks and their properties.

2 General architecture

The web GIS platform ‘Climate+’ developed at the IMCES SB RAS is aimed at the processing and analysis of geospatial gridded datasets in Earth system science and at online visualization of results [4]. Its architecture is shown in Figure 1. It represents a typical client-server structure, where in general the server is a set of geographically distributed standalone nodes providing a common interface (API), and the clients are client applications (basically, a Web-GIS client). The server part of the architecture includes a high-performance computing system with attached data storage. It is presented by two tiers:
• resources tier, including data and metadata; and
• server applications (middleware) tier.

Figure 1 Platform Climate+ general architecture outline

The client part of the architecture is based on a modern graphical web browser. It is presented by a single ‘Client applications’ tier.

The data layer contains netCDF datasets and PostGIS databases, while the metadata layer is represented by the Metadata database (MDDB) describing geospatial datasets and their processing routines.

The computational backend contains data processing and visualization software components based on GNU Data Language (GDL, http://gnudatalanguage.sourceforge.net/) and Python. Geospatial datasets are processed by a specialized set of validated software modules running within the framework of the computational backend.

The visualization component of the backend generates files in the following formats: GeoTIFF, ESRI Shapefile, Encapsulated PostScript, CSV, XML, netCDF, float GeoTIFF. The final results are represented by raster and vector cartographical layers accompanied by the corresponding binary netCDF data.

The geoportal provides cartographical web services such as WMS, WFS, and WPS, as well as the server-side part of the Web-GIS client applications, which comply with the general INSPIRE (INfrastructure for SPatial InfoRmation in Europe, https://inspire.ec.europa.eu) requirements for geospatial data visualization.

The results of solving first-type applied tasks (computational problems describing changes of states of spatio-temporal objects) are added to the data and metadata layers and may be used later. To solve the problem of semantic heterogeneity in the ‘Climate+’ platform, an ontology layer characterizing the properties of the data collections (Reanalysis, Observations, Modelling Data) has been created. This ontology is used to select input data for the Applied Tasks applications.
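As an illustration of how a client might address the cartographical services mentioned above, the following minimal sketch builds an OGC WMS 1.3.0 GetMap request URL. The endpoint `https://example.org/climate/wms` and the layer name `july_precip_return_level_100y` are hypothetical placeholders, not names taken from the platform; only the standard WMS query parameters are assumed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getmap_url(endpoint, layer, south, west, north, east,
                     width=800, height=400):
    """Build an OGC WMS 1.3.0 GetMap URL for a latitude/longitude box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "EPSG:4326",
        # WMS 1.3.0 with EPSG:4326 expects latitude-first axis order.
        "BBOX": f"{south},{west},{north},{east}",
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name, used for illustration only.
url = build_getmap_url(
    "https://example.org/climate/wms",
    "july_precip_return_level_100y",
    south=52.5, west=75.0, north=60.0, east=95.0,
)
query = parse_qs(urlsplit(url).query)
print(query["REQUEST"][0], query["BBOX"][0])
```

The bounding box in the usage example is the Southern Siberia region used for the extreme precipitation maps (52.5-60° N, 75-95° E).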
3 Developed software modules

To describe the statistics of extreme climate events, dedicated analytical software tools were integrated into the web GIS platform ‘Climate+’. The R language (https://www.r-project.org/, [11]) packages "extRemes" [12, 13], "quantreg" [14] and "copula" [15, 16] were used as a basis [17]. At present, the system calculates basic statistical characteristics and indicators of the temporal structure of meteorological series, describing patterns of changes in time and space. The functionality of the system includes the calculation of trends, an assessment of their statistical significance, and the degree of correlation of meteorological quantities. The climatic change indices recommended by the IPCC are also calculated: extreme values of daily temperature and daily rainfall and their probabilistic characteristics. In addition, the system calculates characteristics of nonstationary distributions of various extreme values. The implemented set of computational procedures makes it possible to get a complete picture of the changes occurring in climatic characteristics for the region of study. The modular organization of the system allows its functionality to be expanded by adding new software components developed by both developers and users of the system.

3.1 Time-dependent statistics of extremes

A statistical description of extreme precipitation and temperature can be achieved using the concepts of extreme value statistics (EVS).

The software implementation of EVS in the R language (package "extRemes") allows statistical modelling of maximum values based on a non-stationary generalized extreme value distribution. Quantities required for risk assessment can be calculated using this distribution function, in particular the probability that the observed variable exceeds a certain level. These levels are frequently expressed as return levels $r_T$ for a certain return period $T$: $r_T$ is defined as the level that is exceeded on average once every $T$ years, i.e., with probability $1/T$ in any given year.

This functionality was used to calculate 100-year return levels of July maximum precipitation based on ECMWF ERA Interim [18] (Fig. 2a) and APHRODITE JMA [19] data (Fig. 2b) for the Southern Siberia region (52.5-60° N, 75-95° E).

Figure 2 100-year return levels of July maximum precipitation for the Southern Siberia region: a - ECMWF ERA Interim data, 0.75x0.75 horizontal grid, 1979-2007; b - APHRODITE JMA data, 0.25x0.25 horizontal grid, 1979-2007

The results demonstrate similar behavior of the calculated characteristic in both datasets, but the APHRODITE JMA results show more detail and higher values in some regions.

3.2 Quantile regression

The analysis of trends in meteorological observations is one of the most common activities in climate change studies. Quantile regression provides a well-defined statistical framework for estimating the rate of change not only in the mean, as in ordinary regression, but in all parts of the data distribution. Given a random variable $Y$ with continuous cumulative distribution function $F_Y(y)$, the quantile function $Q_Y(\tau)$ is defined as $Q_Y(\tau) = F_Y^{-1}(\tau)$. The quantile is the value $Q_Y(\tau)$ such that $P[Y \le Q_Y(\tau)] = \tau$, $0 \le \tau \le 1$. Considering then the conditional distribution of $Y$ given $X = x$, the conditional quantile function $Q_{Y|X}(\tau; x)$ satisfies $P[Y \le Q_{Y|X}(\tau; x) \mid X = x] = \tau$. Whereas ordinary regression is based on the conditional mean function $E[Y \mid X = x]$ and minimization of the respective residuals, quantile regression is based on the conditional quantile function and minimization of the sum of asymmetrically weighted absolute residuals

$$\sum_{i=1}^{n} \rho_\tau\bigl(y_i - Q_{Y|X}(\tau; x_i)\bigr),$$

where $n$ is the number of observations and $\rho_\tau(u)$ is the tilted absolute value function, equal to $\tau |u|$ for $u \ge 0$ and to $(1 - \tau)|u|$ for $u < 0$.

Quantile regression is implemented in the R language by the software package "quantreg" [14]. Quantile values of interest are set between 0 and 1.
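The minimization principle behind quantile regression can be illustrated with a minimal sketch in plain Python. It is not the "quantreg" implementation: it fits only a constant (no covariate) and searches the sample values directly, which is sufficient because the total tilted absolute loss is piecewise linear and convex, so a minimum is attained at a sample point. The function names and the sample values are illustrative assumptions.

```python
def check_loss(u, tau):
    """Tilted absolute value: tau*|u| for u >= 0, (1 - tau)*|u| for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q, ys, tau):
    """Sum of asymmetrically weighted absolute residuals for a constant fit q."""
    return sum(check_loss(y - q, tau) for y in ys)


def constant_quantile(ys, tau):
    """Constant q minimising the total loss, searched over the sample values."""
    return min(sorted(ys), key=lambda q: total_loss(q, ys, tau))


# Illustrative sample (assumed values, not data from the paper).
sample = [2.0, -1.0, 5.0, 3.0, 0.0, 4.0, 1.0]
for tau in (0.1, 0.5, 0.9):
    print(tau, constant_quantile(sample, tau))
```

For tau = 0.5 the weights are symmetric and the minimiser is the sample median; smaller or larger tau shifts the minimiser toward the lower or upper tail, which is what allows trends to be estimated separately for different parts of the distribution.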
Figure 3 Maximum January temperature trends based on ECMWF ERA 40 data (2.5x2.5 horizontal grid, 1961-2002): a - at quantile 0.05, b - at quantile 0.5, c - at quantile 0.95

Trends of maximum January temperature for the Southern Siberia region (50-65° N, 60-120° E), based on ECMWF ERA 40 data [20], are shown in Fig. 3. The results show that almost everywhere the maximum January temperature at quantile 0.05 (a) changed, both downward and upward, to a greater extent than at quantiles 0.5 (b) and 0.95 (c). The temperature at quantile 0.95 changed to a lesser extent.

4 Applied task “Permafrost evolution in the Northern Part of the Ob–Yenisei Interfluve”

The general formulation of the task is considered in a number of publications [21]. The subject of the study is the region between 60 and 75°N and 73 and 93°E, which belongs to the European–Western Siberian permafrost sector (see Fig. 4). It is covered by a 1.5°×2° grid.

Figure 4 Permafrost coverage of the Russian Federation territory. [Map provided by www.arcticportal.org]

The time evolution of the temperature of the frozen soil layer down to a depth of 20 meters, caused by heat conduction, is considered for the years 1970-2030. The consideration is based on a simplified model of permafrost thaw, which was tested using measurement data from the station at Cape Marre-Sale [22]. Several assumptions and simplifications have been adopted in the model formulation. In view of the significant difference between the horizontal and vertical scales of the region under study, where the heat conduction is non-stationary, the process is studied in a 1D formulation. It is also assumed that the permafrost layer is a homogeneous medium with effective thermophysical properties, which are considered constant and invariable with depth. Thermal effects due to changes in the permafrost phase state are ignored. The monthly average temperatures of the upper soil layer, 1 cm thick, calculated within a climate model describe the atmospheric forcing of the thermal regime of the soil and are used as boundary conditions at the "atmosphere–frozen layer" interface. At the lower boundary of the region under study, no heat flow is assumed. This approach allows qualitative assessments of the thermal regime of permafrost in the Arctic regions depending on climate changes without excessive detail. The condition that the soil temperature exceeds the ice point is used to determine the thawing boundary.

To simulate the time variation of the vertical temperature profiles in the permafrost layer, the following heat conduction equation is used:

$$\rho C \frac{\partial T}{\partial t} = \frac{\partial}{\partial z}\left(\lambda \frac{\partial T}{\partial z}\right), \quad 0 < z < H,\ t > 0.$$

Here $T$ is the temperature; $\rho$, $C$, $\lambda$ are the density, specific heat, and thermal conductivity of the surface layer; $H$ is the depth of the region under study. The initial and boundary conditions are stated as follows:

$$t = 0:\ T = \varphi(z),\ 0 \le z \le H;$$
$$z = 0:\ T = \chi(t),\ t > 0;$$
$$z = H:\ \frac{\partial T}{\partial z} = 0,\ t > 0.$$

Here the functions $\varphi(z)$ and $\chi(t)$ are calculation results of the soil model of the Institute of Numerical Mathematics RAS. Our task statement is simpler than the statement suggested earlier in the work [23]. However, it requires fewer input parameters related to the simulation subject to be specified (e.g., density, specific heat, thermal conductivity of the frozen soil, and soil humidity), does not require specification of boundary conditions for the soil temperature at depth, and is less demanding of computational resources. In other words, the model suggested in this report is more appropriate for climate scales.

The output data of the model are monthly mean vertical profiles of the soil temperature from the surface to a depth of 20 m on an inhomogeneous vertical grid (0.01, 0.02, 0.04, 0.08, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95, 1.05, 1.15, 1.25, 1.35, 1.45, 1.55, 2.0, 3.0, 5.0, 10.0, and 20.0 m) in each cell of the horizontal grid.

Several Circumpolar Active Layer Monitoring (CALM, [https://www2.gwu.edu/~calm/data/north.html]) stations, which measure soil temperature and frozen depth, are situated in Western Siberia (Table 1). This resource provides annual data on the depth of frost penetration in the soil (see) by the end of the thaw season.

Table 1 Sites of thaw-depth measurements in Western Siberia

| Code | Name | LAT | LONG |
|---|---|---|---|
| R1 | NWS | 65° 20' N | 72° 55' E |
| R3 | MSYP | 69° 43' N | 66° 45' E |
| R4 | PGP | 70° 07' N | 75° 35' E |
| R5 | VDYP | 70° 17' N | 68° 54' E |
| R5 A | VDYP | 70°16'31.8" N | 68°53'29.9" E |
| R5 B | VDYP | 70°17'43.8" N | 68°53'00.5" E |
| R5 C | VDYP | 70°18'05" N | 68°50'28.7" E |
| R5 D | VDYP | 70°16'27" N | 68°53'26.8" E |
| R50a | UGF GP5 | 66.31537 N | 76.90772 E |
| R50b | UGF GP15 | 67.477910 N | 76.6952900 E |
| R53 | HPU | 66.723483 N | 66.080488 E |

Designations used in Table 1: NWS is used for Nadym, West Siberia; MSYP, for Marre-Sale, Yamal Peninsula; PGP, for Parisento, Gydan Peninsula; VDYP, for Vaskiny Dachy, Yamal Peninsula; UGF, for Urengoy Gas Field; and HPU, for Harp, Polar Urals.

6 Ontology description of solution

The results of the numerical solution of the task are numerical arrays of temperatures at different depths (24 levels) for 720 months (60 years) for each of 100 horizontal cells (10 in latitude and 10 in longitude). Based on the values of the numerical arrays, parameters p1 and p2 are calculated, which are the values of the properties of the generated applied ontology individuals. The properties of these individuals are listed in Table 2.

Table 2 Properties of individuals that characterize the solution of the applied task

| Domain | Property | Range | |
|---|---|---|---|
| t3:OutputData | t3:hasTime | string | d01 |
| t3:OutputData | t3:hasLatitude | float | d02 |
| t3:OutputData | t3:hasLongtitude | float | d03 |
| t3:OutputData | t3:hasDepth | float | d04 |
| t3:OutputData | t3:has_Site_Average_of_the_Annual_End-of-Season_Thaw_Depth | float | d05 |
| t3:OutputData | t3:hasAverage_Monthly_Thaw_Soil_Temperature | float | d06 |

The selected characteristics, which are the values of the properties of the generated applied ontology individuals, are calculated from the numeric arrays. The property t3:hasAnnual_End-of-Season_Thaw_Depth takes a value only when the coordinates of a measurement station fall within the 1.5°×2° cell centered at the point that corresponds to the values of the properties t3:hasLatitude and t3:hasLongtitude. No individuals are generated for cells with negative soil temperatures.

7 A-Box of DSS knowledge base

Two types of individuals are constructed from the 1728000 calculated soil temperature values. Examples of the structure of such individuals are given in Figs. 5 and 6. Using these temperature values, their annual means are calculated, and the number of the level at which the soil temperature changes sign is found. The value of the maximum thaw depth allows one to compare the numerical simulation results with the measurement data. The mean temperature and depth values can be used to analyze the permafrost structure (transition from continuous to discontinuous propagation) and the character of the phenomenon (periodicity or trend).

Figure 5 Scheme of the individual describing the calculated value of the annual end-of-season thaw depth in the fixed point

Figure 6 Scheme of the individual describing the calculated value of the soil temperature at fixed depth (5 meters) in the fixed point

In the names of the individuals, FTS is the prefix showing that the individual belongs to the solution of the Freeze-up and Thawing of Soil task; the subsequent name components give the geographical coordinates of the center of a computing cell, the year, and the month; the ending Depth/Soil_Temperature designates which physical parameter is described.

8 Conclusion

We have described two types of applications that use the 80 Tb IMCES SB RAS collections of climatic and meteorological data. In applications of the first type, statistical characteristics of climatic quantities are calculated on-line and the results are presented to the user as maps of their fields. Applications of the second type are aimed at short- or long-term prognosis of the evolution of physical values important for decision making. As an example of such an application, the problem of the long-term evolution of the active soil layer in Northern regions is considered. On this basis we created a database that is used to form a knowledge base about the evolution of physical values determining the behavior of natural objects and of the planned transport infrastructure in the Northern part of West Siberia.

To expand the functionality of the web GIS platform ‘Climate+’, we plan to develop a new software module utilizing the package "copula". It will make it possible to calculate probability distributions of multivariate random variables and to determine the structure of dependence between different climatic variables.

Acknowledgments. The authors thank the Russian Science Foundation for the support of this work under the grant No16-19-10257.

References

[1] European Virtual Environment for Research - Earth Science Themes: a solution (EVER-EST), Horizon 2020, grant agreement no 674907, http://ever-est.eu/
[2] Fulvio Marelli, Helen Glaves, Mirko Albani (2017) EVER-EST: a virtual research environment for the Earth Sciences; Geophysical Research Abstracts Vol. 19, EGU2017-17847, EGU General Assembly 2017
[3] L. Kalinichenko, A. Fazliev, E. Gordov, N. Kiselyova, D. Kovaleva, O. Malkov, I. Okladnikov, N. Podkolodny, N. Ponomareva, A. Pozanenko, S. Stupnikov, A. Volnova, New Data Access Challenges for Data Intensive Research in Russia, CEUR Workshop Proceedings, v. 1536, 2015, P.215-237, 17-th International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2015; Obninsk; Russian Federation; 13 - 16 October 2015; Code 118237.
[4] A.A. Bart, A.Z. Fazliev, E.P. Gordov, I.G. Okladnikov, A.I. Privezentsev, A.G. Titov, Virtual Research Environment for Regional Climatic Processes Analysis: Ontological Approach to Spatial Data Systematization, Data Science Journal, 2018 (in press)
[5] Athanasis, N, Kalabokidis, K, Vaitis, M and Soulakellis, N 2009 Towards a semantics-based approach in the development of geographic portals. Computers & Geosciences, 35(2): 301-308. DOI: https://doi.org/10.1016/j.cageo.2008.01.014
[6] Bogdanovic, M, Stanimirovic, A and Stoimenov, L 2015 Methodology for geospatial data source discovery in ontology-driven geo-information integration architectures. Web Semantics: Science, Services and Agents on the World Wide Web, 32: 1-15. DOI: https://doi.org/10.1016/j.websem.2015.01.002
[7] Brodaric, B, Fox, P and McGuinness, D L 2009 Geoscience knowledge representation in cyberinfrastructure. Computers & Geosciences, 35(4): 697-699. DOI: https://doi.org/10.1016/j.cageo.2009.01.001
[8] Husain, M F, Al-Khateeb, T, Alam, M and Khan, L 2011 Ontology based policy interoperability in geo-spatial domain. Computer Standards & Interfaces, 33(3): 214-219. DOI: https://doi.org/10.1016/j.csi.2010.03.011
[9] Lutz, M, Sprado, J, Klien, E, Schubert, C and Christ, I 2009 Overcoming semantic heterogeneity in spatial data infrastructures. Computers & Geosciences, 35(4): 739-752. DOI: https://doi.org/10.1016/j.cageo.2007.09.017
[10] Riazanova A A, Voropay N N, Okladnikov I G and Gordov E P. Development of computational module of regional aridity for web-GIS "Climate" // IOP Conf. Series: Earth and Environmental Science. 2016. V. 48. 012032. doi:10.1088/1755-1315/48/1/012032
[11] Gilleland E 2011 Using R to Analyze Extremes National Center for Atmospheric Research (Boulder, Colorado, U.S.A)
[12] Gilleland E 2016 Package “extRemes” The Comprehensive R Archive Network (CRAN) https://cran.r-project.org/web/packages/extRemes/extRemes.pdf
[13] Gilleland E and Katz R W 2016 extRemes 2.0: An Extreme Value Analysis Package in R Journal of Statistical Software 72 8
[14] Koenker R., Portnoy S., Tian P., Zeileis A., Grosjean P. and Ripley B. D., 2017 Package “quantreg” The Comprehensive R Archive Network (CRAN) https://cran.r-project.org/web/packages/quantreg/quantreg.pdf
[15] Hofert M, Kojadinovic I., Maechler M., Yan J., 2017 Package “Copula” The Comprehensive R Archive Network (CRAN) ftp://cran.r-project.org/pub/R/web/packages/copula/copula.pdf
[16] Yan J 2007 Enjoy the Joy of Copulas: With a Package copula Journal of Statistical Software 21 4
[17] A A Ryazanova, I G Okladnikov and E P Gordov 2017 Integration of modern statistical tools for the analysis of climate extremes into the web-GIS "CLIMATE" IOP Conference Series: Earth and Environmental Science 96 doi:10.1088/1755-1315/96/1/012014
[18] Dee D P et al. 2011 The ERA-Interim reanalysis: configuration and performance of the data assimilation system Quarterly Journal of the Royal Meteorological Society 137 Issue 656 Part A pp 553–597
[19] APHRODITE JMA, http://www.chikyu.ac.jp/precip/data/APHRO_V1003R1_readme.txt
[20] Kallberg P, Simmons A, Uppala S, Fuentes M 2007 ERA–40 Project Report Series. The ERA–40 Archive Report of European Centre for Medium Range Weather Forecasts, England
[21] E.E. Machul'skaya, Vasily N. Lykossov, Simulation of the thermodynamic response of permafrost to seasonal and interannual variations in atmospheric parameters, Izvestiya Atmospheric and Oceanic Physics 38(1):15-26, January 2002
[22] Sudakov I.A., Bobylev L.P., Beresnev S.A. Modeling permafrost thermal regime under ongoing climate change. Vestnik of SpBGU. Earth Sciences. (1), 81-88.
[23] Alipova K.A., Bart A.A., Fazliev A.Z., Gordov E.P., Okladnikov I.G., Privezentsev A.I., Titov A. G., "Systematization of climate data in the virtual research environment on the basis of ontology approach", Proc. SPIE v.10466, 23-rd International Symposium on Atmospheric and Ocean Optics: Atmospheric Physics, 1046675 (2017); https://doi.org/10.1117/12.2289761
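As a supplementary illustration of the one-dimensional heat conduction model of Section 4, the following minimal sketch integrates the equation on the 24-level vertical grid with a backward-Euler finite-difference scheme, a prescribed surface temperature, and zero heat flux at the bottom, and then reads off the thaw depth as the deepest level of the unbroken zone with positive temperature. The numerical scheme, the thermophysical constants, the uniform initial profile, and the sinusoidal surface forcing are all illustrative assumptions; the paper does not specify them, and its forcing comes from a climate model rather than from a formula.

```python
import math

# Vertical grid from Section 4 (m): 24 levels from 0.01 m to 20 m.
DEPTHS = [0.01, 0.02, 0.04, 0.08, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65,
          0.75, 0.85, 0.95, 1.05, 1.15, 1.25, 1.35, 1.45, 1.55, 2.0,
          3.0, 5.0, 10.0, 20.0]

# Assumed constants for illustration only (not values from the paper).
DENSITY = 1600.0         # kg/m^3
SPECIFIC_HEAT = 1800.0   # J/(kg K)
CONDUCTIVITY = 1.8       # W/(m K)
DIFFUSIVITY = CONDUCTIVITY / (DENSITY * SPECIFIC_HEAT)  # m^2/s


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_step(temps, surface_temp, dt, z=DEPTHS, kappa=DIFFUSIVITY):
    """Advance the temperature profile by dt seconds (backward Euler)."""
    n = len(z)
    lower, diag, upper = [0.0] * n, [1.0] * n, [0.0] * n
    rhs = list(temps)
    rhs[0] = surface_temp                      # T = chi(t) at the top node
    for i in range(1, n - 1):
        hm, hp = z[i] - z[i - 1], z[i + 1] - z[i]
        a = 2.0 * kappa * dt / (hm * (hm + hp))
        b = 2.0 * kappa * dt / (hp * (hm + hp))
        lower[i], diag[i], upper[i] = -a, 1.0 + a + b, -b
    hm = z[-1] - z[-2]                         # zero heat flux at z = H
    a = 2.0 * kappa * dt / (hm * hm)
    lower[-1], diag[-1] = -a, 1.0 + a
    return solve_tridiagonal(lower, diag, upper, rhs)


def thaw_depth(temps, z=DEPTHS):
    """Deepest level of the unbroken thawed zone (T > 0) below the surface."""
    depth = 0.0
    for level, temp in zip(z, temps):
        if temp <= 0.0:
            break
        depth = level
    return depth


def simulate_year(initial_temp=-5.0, mean=-6.0, amplitude=14.0):
    """One year of 30-day months with an assumed sinusoidal surface forcing."""
    temps = [initial_temp] * len(DEPTHS)
    deepest = 0.0
    for month in range(12):
        chi = mean + amplitude * math.sin(2.0 * math.pi * (month - 3) / 12.0)
        for _ in range(30):                    # daily sub-steps
            temps = implicit_step(temps, chi, 86400.0)
        deepest = max(deepest, thaw_depth(temps))
    return temps, deepest


final_profile, max_thaw = simulate_year()
print(f"bottom temperature: {final_profile[-1]:.2f} C, "
      f"maximum thaw depth: {max_thaw:.2f} m")
```

An implicit scheme is chosen here because the grid spacing near the surface is 1 cm, where an explicit scheme would need time steps of only a few seconds to remain stable.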