=Paper=
{{Paper
|id=Vol-2022/paper43
|storemode=property
|title=
Ontological Description of Meteorological and Climate Data Collections
|pdfUrl=https://ceur-ws.org/Vol-2022/paper43.pdf
|volume=Vol-2022
|authors=Andrey A. Bart,Vladislava V. Churuksaeva,Alexander Z. Fazliev,Alexey I. Privezentsev,Evgeny P. Gordov,Igor G. Okladnikov,Alexander G. Titov
|dblpUrl=https://dblp.org/rec/conf/rcdl/BartCFPGOT17
}}
==
Ontological Description of Meteorological and Climate Data Collections
==
© A.A. Bart © V.V. Churuksaeva
Tomsk State University,
Tomsk, Russia
bart@math.tsu.ru
© A.Z. Fazliev © A.I. Privezentsev © E.P. Gordov © I.G. Okladnikov © A.G.Titov
Institute of Atmospheric Optics SB RAS,
Tomsk, Russia
faz@iao.ru remake@iao.ru gordov@scert.ru oig@scert.ru titov@scert.ru
Abstract. The first version of a primitive OWL-ontology of the collections of climate and meteorological data of the Institute of Monitoring of Climatic and Ecological Systems SB RAS is presented. The ontology is a component of expert and decision-making support systems intended for the quick search of climate and meteorological data suitable for solving a certain class of applied problems.
Keywords: ontology description of object domains, systematization of domain data, climate and meteorological data.
1 Introduction
Today every large meteorological center uses its own meteorological models to calculate climate and meteorological parameters; these models can differ both in the level of detail and in the set of calculated physical parameters. During the reanalysis of a meteorological situation, key meteorological parameters corresponding to measurements at weather stations are usually taken into account.

The results of climatic numerical simulation, weather forecast, or reanalysis of meteorological fields are collections of meteorological parameters that characterize the state of the atmosphere. They are represented by data arrays in common formats, e.g., GRIB [7], netCDF [12], HDF5 [8], etc.

At the Institute of Monitoring of Climatic and Ecological Systems SB RAS (IMCES SB RAS), a data processing environment [3] has been developed for representing collections of meteorological data; the environment is supplied with sets of metadata that characterize the physical parameters in these collections. Practice showed that using only local applications in this environment is restrictive. The inclusion of external applications resulted in the creation of a new system, the virtual information platform “Climate+” [17], where data are represented in the netCDF format.

When using climate data from different collections of numerous data producers, the problem arises of ambiguous identification of physical parameters from these collections. The meaning of physical parameters in the collections agrees with the physical parameters recommended by the World Meteorological Organization (WMO). They are described in the taxonomy of the WMO Codes Registry ontology [19], as well as in the taxonomy of the ontology of the GRIB Discipline Collection [16] intended for use in the Climate Information Platform for Copernicus (CLIPC).

The ontology description of data collections in the form of a primitive (simplified) formal OWL-ontology is intended for the selection of data collections within an expert system, which can be used in solving an applied task of an object domain.

The ontology approach selected for the solution of the stated problem consists in the following. An ontology description is constructed for an applied problem. In addition to the physical statement, the description should include the mathematical statement of the task, i.e., a mathematical model with equations. Variables, which conform to the WMO classification, and constraints are described in the form of an OWL-ontology. On the one hand, the set of parameters includes common meteorological parameters, such as sea level pressure, surface pressure, air temperature and humidity, wind speed and direction, and so on. This allows comparison of the computed values with weather station measurements. On the other hand, both meteorological and climate models supplemented by an applied task compose a component of a more complex model, where the results of prognostic calculations of climate/meteorological parameters are used for the solution of applied problems in different fields of human activity. This, in turn, enriches collections of climate and meteorological data with values of new physical parameters.

Proceedings of the XIX International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL’2017), Moscow, Russia, October 10-13, 2017
2 Virtual data processing environment
The approaches used in creating the prototype of a subject virtual data processing environment (VDPS) for the analysis, estimation, and forecast of the impacts of global climate changes on the natural environment and climate of a region were mainly developed during the design of the “Climate” web GDS [4, 5]. This subject GDS has been designed with the use of up-to-date information and communication technologies, is based on the concepts of spatial data infrastructure (SDI) [2, 10], and provides a software infrastructure for the complex use of geophysical data and information support of integrated multidisciplinary scientific research in modern quantitative meteorology. We have selected it as a subject component of VDPS for Earth sciences.

A web geoportal [1, 9] is a single access point to subject spatial data, processing procedures, and results. The portal allows a user to search for geoinformation resources in metadata catalogues, to form samples of spatial data according to their characteristics (access functionality), and to manage tools and applications for data processing and mapping.

The GDS Web Client [6, 13] is the main tool of the user’s desktop. It ensures the fulfillment of OGC requirements for web services: spatial data visualization (Web Map Service, WMS), data representation in vector (Web Feature Service, WFS) and bitmap (Web Coverage Service, WCS) formats, and their geospatial processing. It provides access to collections of climate data and to tools for their analysis and for visualization of the results via a typical graphical web browser. The Web Client satisfies the general requirements of INSPIRE standards and allows selection of a data set, processing type, and geographic region for the analysis of processes, as well as representation of the processing results of spatial data sets in the form of WMS/WFS map layers in bitmap (PNG, JPG, GeoTIFF), vector (KML, GML, Shape), and binary (NetCDF) formats.

Today, the VDPS prototype combines data collections (reanalyses, climate simulation results, and weather station measurements) within the unified geoportal, supports the statistical analysis of archive and required data, and provides access to the WRF and «Planet Simulator» models. In particular, a user can run a VDPS-integrated model, preprocess the results, process and analyze them numerically, and obtain the results in graphical representation. The prototype provides specialists who participate in multidisciplinary research with prompt tools for the integral study of climate and ecological systems on the global and regional scales. With these tools, a user who does not know programming is able to process and graphically represent multidimensional observation and simulation data in a unified interface using a web browser.

3 VDPS prototype capabilities
Support of the following data sets is built into the prototype: NCEP/NCAR reanalysis, ed. II, JMA/CRIEPI JRA-25 reanalysis, ECMWF ERA-40 reanalysis, ECMWF ERA Interim, MRI/JMA APHRODITE’s Water Resources Project data, DWD Global Precipitation Climatology Centre data, GMAO Modern Era-Retrospective analysis for Research and Applications (MERRA), reanalysis of the joint project «Monitoring atmospheric composition and climate (MACC)», NOAA-CIRES Twentieth Century Global Reanalysis, ver. II, NCEP Climate Forecast System Reanalysis (CFSR), and simulation results obtained with global and regional climate and meteorological models. Observation data from weather stations on the territory of the former USSR for the 20th century, included in the PostGIS database, are also accessible.

Data processing
1. Statistical characteristics of meteorological parameters: sample mean, variance, excess (kurtosis), median, minimum and maximum, and asymmetry (skewness).
2. Derived climate parameters: vegetation period duration, sum of effective temperatures, Selyaninov hydrothermal coefficient.
3. Periodic variations: standard deviation, norms, aberrations, amplitudes of diurnal and annual variations.
4. Non-periodic variations: duration and repeatability of atmospheric phenomena with meteorological parameters below or above the limits specified at different time points.

Then a user can either analyze the results or continue adding new layers on the map. To study the results, the user can select a geographical region, scale the map, get values from all layers at a point, and additionally process earlier results (e.g., compare data from different layers). In addition to the direct analysis of geophysical data, a user can carry out joint research with other users, share the results, and use their own data collections in the processing. In general, this hardware-software complex provides distributed access to, processing of, and visualization of large collections of geospatial data with the use of cloud technologies.

The data processing environment “Climate” developed at IMCES SB RAS limits users to local software applications. A current task is to extend the environment with external user applications. For this, the corresponding problems should be specified in general form. Below we describe one of the possible classes of problems, connected with decision-making.

4 General definition of the problem
The “Climate+” virtual information platform includes collections of meteorological and climate data. It is intended for data representation with the use of GIS technologies. Its further development is oriented to giving researchers the possibility of using selected data sets or their parts as input data. Most collections include data related to some (not all) spatiotemporal objects of the Earth; different collections often include different sets of physical parameters. To search for required spatiotemporal objects and their meteorological and climate characteristics, it was necessary to create a corresponding expert system on the basis of a knowledge base on the spatial objects of the data collections and their parameters.
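The search task stated in Section 4 can be illustrated with a minimal Python sketch. It is not the platform's expert system: the `CollectionInfo` record, the `select_collections` function, and the catalog values are hypothetical and only show the idea of keeping the collections that contain every parameter an applied task needs and that cover the requested period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionInfo:
    """Hypothetical summary of one data collection (not the platform's schema)."""
    name: str
    parameters: frozenset  # names of physical parameters present in the collection
    years: range           # covered years, end exclusive


def select_collections(collections, required_parameters, first_year, last_year):
    """Return names of collections that hold every required parameter
    and cover both ends of the requested period."""
    required = frozenset(required_parameters)
    selected = []
    for c in collections:
        has_parameters = required <= c.parameters
        covers_period = first_year in c.years and last_year in c.years
        if has_parameters and covers_period:
            selected.append(c.name)
    return selected


# Illustrative values only; they do not describe the real collections.
catalog = [
    CollectionInfo("collection_A",
                   frozenset({"air_temperature", "surface_pressure"}),
                   range(1979, 2017)),
    CollectionInfo("collection_B",
                   frozenset({"air_temperature"}),
                   range(1958, 2002)),
]

suitable = select_collections(
    catalog, ["air_temperature", "surface_pressure"], 1980, 2000)
```

In the ontology-based approach of the paper, the same selection would be expressed as a query over individuals of the knowledge base rather than over an in-memory list.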
Figure 1 Simplified block-diagram of “Climate +” platform modification
Figure 1 shows a simplified block diagram that is the basis of the “Climate+” platform modification. There are three groups of subsystems: meteorological and climate data collections; a subsystem for work with knowledge bases (an expert system for selecting input data for applied tasks and a decision-making support system); and applied tasks with their input and output data. The data representation services are omitted.

In this work, we discuss the creation of a knowledge base for the expert system. The main problem that has been solved is the substantiation of the reduction problem solution [20] or, in other words, the construction of typical individuals of an OWL-ontology that characterize properties of spatiotemporal objects from the collections. The development of the conceptual part of the ontology (T- and R-box) is connected in our solution with the classification of meteorological and climate parameters and is briefly described below.

5 Taxonomy of meteorological parameters
The OWL DL language [14] is used for the ontology description of the domain that generalizes, in particular, related spatiotemporal objects. These objects can be an air layer over a bounded territory, the upper soil layer on this territory, or, in more specific cases, forests, fields, or long roads. There are physical and chemical processes connected with the objects; they are described by numerical models and used in calculations. Input values of the physical parameters are required for the calculations. The processes under study can relate to different temporal and spatial scales and be described at different levels of detail. Note that coupling several mathematical models requires knowledge of their sets of input and output parameters and their spatiotemporal characteristics.

The taxonomy of physical parameters allows forming sets of properties of spatiotemporal objects of a domain for the solution of specific applied tasks. This taxonomy is used in the OWL-ontology for T-box construction.

When developing a decision-making support system on the basis of both meteorological and climate data, the parameters should be matched. Therefore, the WMO classification in the version of [11] is included in the ontology. This matching allows describing applied tasks of the domain in common terms.

There are climatic and meteorological resources [16, 19] that use the WMO classification of names of meteorological parameters for the GRIB data storage format [7]. First of all, this is the WMO Codes Registry, created for aviation with the aim of supporting data exchange in the AvXML format; it is based on the RDF and SKOS recommendations.

In the primitive OWL-ontology of climate information resources described below, we created classes and individuals that correspond to the names of meteorological parameters according to [11], e.g., the Meteorological_Products class and its subclasses. Individuals that unambiguously characterize physical parameters by their name [11] have been created in each of the subclasses Thermodynamic_Stability_category, Atmospheric_Chemical_Constituents_category, Electrodynamics_category, Mass_category, Long-wave_radiation_category, Temperature_category, Short-wave_radiation_category, Aerosols_category, Moisture_category, Radiology_Imagery_category, Momentum_category, Trace_Gases_category, Cloud_category, and Physical_Atmospheric_category.

For the INMCM4 collection, which corresponds to the output data of the INMCM4 climate model of general atmospheric and ocean circulation [18], classes and subclasses were created corresponding to the model variables. These classes agree with the corresponding WMO classes.

6 Primitive ontology of “Climate+” platform data
The ontology of climate information resources, developed and formalized in OWL DL, describes the current state of the collections of data arrays of the data processing environment as one of the main Russian information resources on climate data. Numerical data are represented by data arrays that are stored in netCDF files. The data arrays are grouped in data sets. All data arrays in a set should: (a) be obtained on the same temporal or spatial grid; (b) cover the same time interval; (c) be obtained under the same simulation or observation conditions (if possible); (d) be represented by a set of netCDF files that include the same physical parameters. The data sets are grouped in data collections. A data collection is an ensemble of data sets obtained by an organization within a project, but represented on different spatial or temporal grids or for different model scenarios. In particular, a collection can consist of a single data set.

The basic classes in the OWL-ontology are: Collection, Spatiotemporal_object, Organization, Data_set, Data_array, Scenario, Spatial_resolution, Physical_quantity, Physical_quantity_values, Unit, Longitudes_array, Time_step, Latitudes_array, Height_levels_array, and Times_array. The spatiotemporal system is a four-dimensional object determined by arrays of numerical values of longitudes (Longitudes_array), latitudes (Latitudes_array), height levels (Height_levels_array), and time labels (Times_array). These arrays are subclasses of the class of the list of values of a physical parameter (Physical_quantity_values) and, therefore, are numerical arrays of one physical parameter (Physical_quantity) in certain measurement units (Unit). They can be described by the number of members of the array of a physical parameter (has_number_of_values), its minimal value (has_minimum_value) and maximal value (has_maximum_value), or by the numerical values of the parameter (has_value).

A data array (Data_array) is an ordered list of numerical values of a physical parameter (Physical_quantity), as a property of the spatiotemporal system (has_spatiotemporal_object), at each 4D point (longitude, latitude, height level, and time) of the spatiotemporal system (Spatiotemporal_object). In the OWL-ontology, a data array (Data_array) is a subclass of the class Physical_quantity_values and, hence, is a numerical array of values of one physical parameter (Physical_quantity) in certain measurement units (Unit); it is described by the number of members (has_number_of_values) and by the minimal (has_minimum_value) and maximal (has_maximum_value) values of the physical parameter. A data array (Data_array) belongs (has_data_array) to a data set (Data_set), which differs from other data sets by the model scenario (Scenario), spatial resolution (Spatial_resolution), and time step (Time_step), and belongs to one collection (Collection). A data collection (Collection) consists of (has_data_set) data sets (Data_set) and belongs (has_organization) to one organization (Organization). The OWL properties of the climate data ontology are presented in Tables 1 and 2.
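The description of a data array above can be mirrored in a short Python sketch. This is an illustration under assumptions, not the authors' software: the Python class names `PhysicalQuantityValues` and `DataArray` and the sample values are hypothetical, while the property names follow the datatype properties named in the text (has_number_of_values, has_minimum_value, has_maximum_value).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PhysicalQuantityValues:
    """Stand-in for the Physical_quantity_values class: values of one
    physical parameter in one measurement unit."""
    physical_quantity: str
    unit: str
    values: Sequence[float]

    @property
    def has_number_of_values(self) -> int:
        return len(self.values)

    @property
    def has_minimum_value(self) -> float:
        return min(self.values)

    @property
    def has_maximum_value(self) -> float:
        return max(self.values)


@dataclass(frozen=True)
class DataArray(PhysicalQuantityValues):
    """Stand-in for Data_array, a subclass of Physical_quantity_values that
    is additionally tied to a spatiotemporal object (identified by name)."""
    spatiotemporal_object: str


# Hypothetical sample: three temperature values on a grid called "grid_1".
array = DataArray("air_temperature", "K", (271.5, 280.0, 265.25), "grid_1")
summary = (array.has_number_of_values,
           array.has_minimum_value,
           array.has_maximum_value)
```

Because `DataArray` inherits from `PhysicalQuantityValues`, the summary properties are available for it without being redefined, which matches the subclass relation stated in the text.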
Table 1 Object properties of the ontology of climate information resources
Domain                    Object Property            Range                  id
Collection                has_organization           Organization           o01
Collection                has_data_set               Data_set               o02
Data_set                  has_scenario               Scenario               o03
Data_set                  has_spatial_resolution     Spatial_resolution     o04
Data_set                  has_time_step              Time_step              o05
Data_set                  has_data_array             Data_array             o06
Physical_quantity_values  has_physical_quantity      Physical_quantity      o07
Physical_quantity_values  has_unit                   Unit                   o08
Data_array                has_spatiotemporal_object  Spatiotemporal_object  o09
Spatiotemporal_object     has_longitudes_array       Longitudes_array       o10
Spatiotemporal_object     has_latitudes_array        Latitudes_array        o11
Spatiotemporal_object     has_height_levels_array    Height_levels_array    o12
Spatiotemporal_object     has_times_array            Times_array            o13
Object properties are defined in the first three columns of Table 1, and their unique identifiers are given in the fourth column; the domain (first column) and the range (third column) are specified for each property in the second column. The datatype properties of the data arrays are defined in the same way in the first three columns of Table 2, with unique identifiers in the fourth column; the domain (first column) and the range (third column) are specified for each property in the second column.
Table 2 Data type properties in the ontology of climate information resources
Domain                    Datatype Property     Range  id
Physical_quantity_values  has_number_of_values  int    d01
Physical_quantity_values  has_minimum_value     float  d02
Physical_quantity_values  has_maximum_value     float  d03
Physical_quantity_values  has_value             float  d04
Times_array               has_time_start        str    d05
Times_array               has_time_end          str    d06
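As an illustration of how the domains and ranges in Tables 1 and 2 relate individuals, the following Python sketch checks candidate statements against a subset of the listed properties. It rests on assumptions: the function names, the individual names, and the closed-world reading of domain and range as validation rules are all hypothetical (an OWL reasoner uses domain and range for inference, not for rejection).

```python
# Subset of Table 1: property -> (domain class, range class)
OBJECT_PROPERTIES = {
    "has_organization": ("Collection", "Organization"),
    "has_data_set": ("Collection", "Data_set"),
    "has_data_array": ("Data_set", "Data_array"),
    "has_unit": ("Physical_quantity_values", "Unit"),
}

# Subset of Table 2: property -> (domain class, Python type of the literal)
DATATYPE_PROPERTIES = {
    "has_number_of_values": ("Physical_quantity_values", int),
    "has_minimum_value": ("Physical_quantity_values", float),
    "has_maximum_value": ("Physical_quantity_values", float),
}

# Subclass relation stated in the text of Section 6.
SUBCLASS_OF = {"Data_array": "Physical_quantity_values"}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def check_triple(types, subject, prop, obj):
    """Closed-world check of one statement against domain and range.

    `types` maps individual names to their class names."""
    if prop in OBJECT_PROPERTIES:
        domain, rng = OBJECT_PROPERTIES[prop]
        return (is_a(types.get(subject), domain)
                and is_a(types.get(obj), rng))
    if prop in DATATYPE_PROPERTIES:
        domain, literal_type = DATATYPE_PROPERTIES[prop]
        return is_a(types.get(subject), domain) and isinstance(obj, literal_type)
    return False  # unknown property


# Hypothetical individuals, named for illustration only.
types = {"coll_1": "Collection", "set_1": "Data_set",
         "array_1": "Data_array", "org_1": "Organization"}

results = [
    check_triple(types, "coll_1", "has_data_set", "set_1"),
    check_triple(types, "set_1", "has_data_array", "array_1"),
    check_triple(types, "array_1", "has_number_of_values", 793),
    check_triple(types, "set_1", "has_organization", "org_1"),
]
```

The third statement passes only because Data_array is a subclass of Physical_quantity_values; the fourth fails because has_organization is defined for Collection, not Data_set.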
Figure 2 Simplified representation of individual describing ERAInt data collections
Figure 2 exemplifies a simplified individual of the OWL-ontology of climate information resources, used in the description of the ERAInt data collection, within the formal description of RDF resources [15]. Individuals of the OWL-ontology are shown in ovals; literal values are given in rectangles; the arrows show properties, with unique identifiers taken from Tables 1 and 2 given in small rectangles. Three arrows mean that the property cardinality may exceed one. Three overlapping ovals mean that the number of individuals of the OWL-ontology may exceed one. The individual “Data_collection” is connected by the property “has_data_set” with the individuals “Data_set”, each of which is connected by the property “has_data_array” with the individuals “Data_array”.

The domain analysis of the climate numerical data arrays of the “Climate+” platform, stored as NetCDF files, allows the description of a primitive ontology of climate data of this platform in the OWL DL language. The primitive ontology is a simple and easily extended systematization of information resources required for further work on the development of the decision-making support system.

To construct the climate data ontology of the “Climate+” platform, software has been developed for the formation of the fact-based block (A-box). An A-box has been formed for the climate data ontology using this software. Facts have been retrieved from the analysis of 80 TB of climate data from the “Climate+” platform over 13 numerical data collections, which include 36 data sets and 793 data arrays. Together, the climate data collections include descriptions of 170 spatiotemporal systems and 156 physical parameters that characterize properties of these systems.

7 Conclusions
The prototype of the subject virtual data processing environment has been developed to provide researchers, specialists, and decision makers with access to different geographically distributed and georeferenced resources and climate data processing services via a typical web browser. It includes a geoportal and systems for distributed storage, processing, and provision of spatial data and the results of their processing. In particular, it allows the simultaneous analysis of several subject sets of climate data with the use of up-to-date statistical methods and, thus, reveals the impacts of climate changes on ecological processes and human activity. After the work on the prototype is finished, different interactive web tools are to be developed for the in-depth analysis of climatic variables and their derivatives provided by the subject geoportal.

The developed software is used for processing spatial datasets, including observation and reanalysis data, for the spatiotemporal analysis of recent and probable climate changes, with a special focus on extreme climate phenomena in northern latitudes.

The primitive OWL-ontology of the climate and meteorological collections of IMCES SB RAS has been constructed; it can be used for the search and selection of data for classes of applied problems in coupled decision support systems. The matching of physical parameters of applied tasks with IMCES SB RAS collections is carried out in WMO-accepted terms.

8 Acknowledgment
The authors thank the Russian Science Foundation for the support of this work (development of web services and solution of reduction problems) under grant No. 16-19-10257. We also thank the Russian Foundation for Basic Research (grant 16-07-01028) for the support of the work (conceptualization of domains) partially described in Sections 4, 5, and 6 of the article.

References
[1] Becirspahic, L., Karabegovic, A.: Web Portals for Visualizing and Searching Spatial Data. Inform. Comm. Techn., Electr. and Microelectr. (MIPRO), 2015, 38th International Convention on, Opatija, pp. 305-311 (2015). doi: 10.1109/MIPRO.2015.7160284
[2] van der Wel, F. J. M.: Spatial Data Infrastructure for Meteorological and Climatic Data. Meteorol. Appls., 12 (1), pp. 7-8 (2005)
[3] Gordov, E. P., Okladnikov, I. G., Titov, A. G.: Application of Web Mapping Technologies for Development of Information-Computational Systems for Georeferenced Data Analysis, Vestnik NGU, Ser. Information Technologies, 9 (4), pp. 94-102 (2011) (in Russian)
[4] Gordov, E. P., Lykosov, V. N., Krupchatnikov, V. N., Okladnikov, I. G., Titov, A. G., Shulgina, T. M.: Computational-information Technologies for Monitoring and Modeling of Climate Change and its Consequences. Novosibirsk: Nauka, 199 p. (2013) (in Russian)
[5] Gordov, E. P., Okladnikov, I. G., Titov, A. G.: Information and Computing Web-system for Interactive Analysis of Georeferenced Climatic Data Sets, Vestnik NGU, Ser. Information Technologies, 14 (1), pp. 13-22 (2016) (in Russian)
[6] Gordov, E., Shiklomanov, A., Okladnikov, I., Prusevich, A., Titov, A.: Development of Distributed Research Center for analysis of regional climatic and environmental changes, IOP Conf. Series: Earth and Environmental Science, 48, 012033 (2016)
[7] Guide to the WMO Table Driven Code Form Used for the Representation and Exchange of Regularly Spaced Data In Binary Form: FM 92 GRIB Edition 2. World Meteorological Organization Extranet. 2003. URL: http://www.wmo.int/pages/prog/www/WMOCodes/Guides/GRIB/GRIB2_062006.pdf
[8] HDF Group - HDF5: https://support.hdfgroup.org/HDF5/
[9] Koshkarev, A. V.: Geoportal as a Tool to Control Geospatial Data and Services, Geospatial Data, 2, pp. 6-14 (2008) (in Russian)
[10] Koshkarev, A. V., Ryakhovskii, A. V., Serebryakov, V. A.: Infrastructure of Distributed Environment of Spatial Data Storage, Search and Processing, Open Education, 5, pp. 61-73 (2010) (in Russian)
[11] NCEP/NCO Production Management Branch. NCEP WMO GRIB2 Documentation. National Weather Service Organization NCEP Central Operations. 2005. http://www.nco.ncep.noaa.gov/pmb/docs/grib2/grib2_doc.shtml
[12] Network Common Data Form (NetCDF). https://www.unidata.ucar.edu/software/netcdf/
[13] Okladnikov, I. G., Gordov, E. P., Titov, A. G.: Development of Climate Data Storage and Processing Model. IOP Conf. Series: Earth and Environmental Science, 48, 012030 (2016)
[14] OWL 2 Web Ontology Language. RDF-Based Semantics (Second Edition), Eds: M. Schneider, F. J. Carroll, I. Herman, P. F. Patel-Schneider. W3C Recommendation 11 December 2012, http://www.w3.org/TR/2012/REC-owl2-rdf-based-semantics-20121211/
[15] Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C Recommendation 10 February 2004, Eds: Graham Klyne, Jeremy J. Carroll, http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/
[16] The GRIB Discipline Collection: [site] (2004). URL: http://vocab-test.ceda.ac.uk/collection/grib/Discipline
[17] Titov, A. G., Gordov, E. P., Okladnikov, I. G.: Hardware-Software Platform «CLIMATE» as a Basis for Local Spatial Data Infrastructure Geoportal, Vestnik NGU, Ser. Information Technologies, 10 (4), pp. 104-111 (2012) (in Russian)
[18] Volodin, E. M., Dianskii, N. A., Gusev, A. V.: Simulating Present-day Climate with the INMCM 4.0 Coupled Model of the Atmospheric and Oceanic General Circulations, Izvestiya, Atmospheric and Oceanic Physics, 46 (4), pp. 414-431 (2010)
[19] WMO Codes Registry (2013). URL: http://codes.wmo.int/grib2
[20] Zinov'ev, A. A.: Foundations of the Logical Theory of Scientific Knowledge (Complex Logic), D. Reidel Publishing Company, 264 p.