Warehouse Development of Ontology for Providing
                 Semantic Interoperability1

    Dmitry Korneev [0000-0001-7260-4768], Alexander Boichenko [0000-0003-3113-9446] and Vasily
                                  Kazakov [0000-0001-8939-2087]
                  1
                      Plekhanov Russian University of Economics, Moscow, Russia
                                      Korneev.DG@rea.ru


          Abstract. The current stage of social development is characterized by the
          ever-increasing penetration of information technologies (IT) into various
          spheres of human activity. Experts in various fields associate the development
          of human civilization at the present stage with the digital economy and the
          fourth industrial revolution, which involves the mass introduction of
          cyber-physical systems in production ("Industry 4.0") as well as in serving
          human needs, including life, work and leisure [1]. The intellectual component
          of information systems (IS), including those designed to control devices and
          mechanisms, is growing rapidly. A significant part of this component is the
          ability of information systems to interact at the semantic level, so there is
          a growing need for more intelligent interaction among IS. An intelligent IS is
          expected to understand not only the syntax but also the semantics (meaning) of
          a query addressed to it. For example, with the development of Internet of
          Things (IoT) technologies, there is a need to solve the problem of
          "communication" between devices ("things") united in information networks. To
          ensure semantic interoperability, an approach was proposed in [2, 3] that
          consists in implementing, in each of the interacting IoT systems, an integrated
          knowledge base reflecting the integrated ontologies of the subject areas of the
          interacting systems. This article discusses the possibility of using language
          and software tools to describe and create a repository of ontologies used to
          ensure the semantic interoperability of IS.


          Keywords: semantic interoperability, software intensive systems, ontological
          engineering


1         Introduction

Currently, the use of the concept of open systems in software development is one of
the main trends in the field of information technology. In the documents of the IEEE
(Institute of Electrical and Electronics Engineers), open systems are defined as
systems that implement "a comprehensive and consistent set of basic international
standards of information technology and profiles of functional standards that
specify interfaces, services and supporting data formats to ensure the interaction and
mobility of applications, data and personnel."

1
    The article was prepared with the support of the Russian Foundation for Basic Research
    (grant № 18-07-01053).

Proceedings of the XXII International Conference “Enterprise Engineering and Knowledge
Management” April 25-26, 2019, Moscow, Russia

   To be able to communicate with each other, open systems (see, for example, the
series of POSIX standards developed by IEEE) must have the interoperability property.
Interoperability in ISO/IEC 24765-Systems and Software Engineering-Vocabulary [4]
is defined as "the ability of two or more systems or elements to exchange information
and to use the information obtained from the exchange".
   Standards and research on interoperability distinguish various levels of interaction
between systems. In what follows we rely on the European Interoperability Framework
(EIF v2.0) [5], which stratifies interoperability functions into the following logical
levels:

1. Normative – involves the interaction of systems in a single regulatory and legislative
   environment;
2. Organizational – refers to the organizational aspects of the functioning of
   information systems and involves common business processes and regulations of their
   functioning;
3. Semantic – the ability of systems to equally understand the meaning of the infor-
   mation they exchange;
4. Syntactic – the ability to exchange data, the ability of systems to integrate;
5. Technical – the organization of the relationship between the systems.

   At the first two levels of the EIF stack, the initial requirements for the design of
information systems are set, and organizational measures are taken to unify the relevant
regulatory documents and business processes. To support the fourth and fifth levels of
the EIF stack, information systems must include appropriate software tools at the
design and development stage. These two levels are well understood, and their practical
implementation currently causes no serious difficulties.
   A new stage in the development of interoperability solutions is associated with the
emergence of Software Intensive Systems (SIS), systems whose functionality is mostly or
completely determined by software [6]. The development of SIS and IoT technologies is
impossible without solving the problems of interaction between systems, and it is no
longer enough to ensure interoperability only at the technical and syntactic levels.
Due to the increasing intelligence of SIS and IoT systems, interaction must also be
ensured at the semantic level: the information system (device) must understand the
meaning of incoming requests and generate answers that, in turn, will be correctly
interpreted by the system (device) that issued the request.
    In [2, 3] it was noted that IoT systems can be considered as SIS, and thus the
research results and standards related to SIS can be applied to IoT systems. To ensure
semantic interoperability, the authors of this article proposed an approach based on the
OSE/RM reference model (Open System Environment/Reference Model; see, for example,
IEEE P1003.22 Draft Guide for POSIX Open Systems Environment—A Security Framework): each
of the interacting SIS (IoT systems) implements an integrated knowledge base reflecting
the integrated ontologies of the subject areas of the interacting systems [3].
   In the course of the study [7], a number of specific features of this type of ontology
were determined. Based on an analysis of ontology structures and construction methods,
it was concluded that an ontology used to provide semantic interoperability of IS should
include concepts that describe both the static state of objects of the subject area and
the dynamic changes in their states.
The main types of concepts used were identified, for each of which the composition of
the required attributes (properties) characterizing it and possible types of relationships
with other concepts were determined.


2      Analysis of Ontology Description Languages

Below we consider ontology description languages in order to choose the most suitable
one for creating the ontologies that are to be used to provide semantic interoperability
of IS.
   There are several "languages" for writing semantic models, the main ones being
RDF, RDFS and OWL.
   RDF, the metadata description language that forms the basis of the Semantic Web,
belongs to the family of languages for describing relational data models; its specific
feature is that resources and properties are identified by global identifiers (URIs).
RDF describes the subject area in terms of resources, resource properties, and property
values. RDF data can be regarded as a set of statements (triples), each consisting of a
"subject", a "predicate" (property) and an "object", and can be represented as a
directed graph formed by these statements. RDF is a universal metadata description
language and has to be adapted for specific specialized tasks. This is done by extending
RDF with vocabularies, one of which is RDFS, which is used to describe ontologies.
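   The triple model described above can be illustrated with a minimal sketch in plain
Python. It is not tied to any RDF library; the namespace `http://example.org/iot#` and
the resource names (`sensor1`, `locatedIn`, `hasReading`) are hypothetical and serve
only as an illustration.

```python
from collections import defaultdict

# Hypothetical namespace and resource names, chosen only for illustration.
EX = "http://example.org/iot#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "sensor1", RDF_TYPE, EX + "TemperatureSensor"),
    (EX + "sensor1", EX + "locatedIn", EX + "room42"),
    (EX + "sensor1", EX + "hasReading", "21.5"),
}


def as_graph(statements):
    """Group triples into a directed graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for s, p, o in statements:
        graph[s].append((p, o))
    return graph


def objects(statements, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return {o for s, p, o in statements if s == subject and p == predicate}


graph = as_graph(triples)
location = objects(triples, EX + "sensor1", EX + "locatedIn")
```

   Here every subject and predicate is a URI, while an object is either a URI or a
literal value, which is what allows a set of statements to be read as a labeled directed
graph.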
   RDFS introduces concepts such as classes, subclasses, properties, and subproperties,
and allows restrictions to be placed on them. RDFS makes it possible to define resource
classes and properties as elements of a vocabulary and to specify which properties can
be used with which classes. RDFS expresses these vocabularies by means of RDF, providing
a set of predefined resources and properties with a designated meaning, which can be
used to describe new RDF vocabularies.
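   As a rough illustration of the meaning that RDFS assigns to its predefined
properties, the sketch below applies three well-known entailment patterns
(transitivity of rdfs:subClassOf, propagation of rdf:type to superclasses, and
rdfs:domain) to a small set of triples. The `ex:` names are hypothetical, and the
function is a simplified sketch rather than a complete RDFS reasoner.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"

# Hypothetical vocabulary and data, for illustration only.
triples = {
    ("ex:TemperatureSensor", SUBCLASS, "ex:Sensor"),
    ("ex:Sensor", SUBCLASS, "ex:Device"),
    ("ex:hasReading", DOMAIN, "ex:Sensor"),
    ("ex:sensor1", TYPE, "ex:TemperatureSensor"),
    ("ex:probe7", "ex:hasReading", "21.5"),
}


def rdfs_closure(statements):
    """Apply three RDFS entailment patterns until no new triple appears."""
    inferred = set(statements)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == SUBCLASS:
                # Transitivity of rdfs:subClassOf.
                new |= {(s, SUBCLASS, o2) for s2, p2, o2 in inferred
                        if p2 == SUBCLASS and s2 == o}
                # Instances of a subclass are also instances of the superclass.
                new |= {(x, TYPE, o) for x, p2, c in inferred
                        if p2 == TYPE and c == s}
            elif p == DOMAIN:
                # Any subject that uses property s belongs to the domain class o.
                new |= {(x, TYPE, o) for x, p2, _ in inferred if p2 == s}
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
```

   After the closure is computed, `ex:sensor1` is also typed as `ex:Device`, and
`ex:probe7` is typed as `ex:Sensor` solely because it uses `ex:hasReading`.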
   OWL extends the RDFS vocabulary (basically, the definition of classes and
subclasses) by introducing relations for comparing classes and individuals (sameAs,
differentFrom, equivalentClass, etc.), characteristics of properties (inverseOf,
TransitiveProperty, SymmetricProperty, etc.), restrictions on properties (e.g., the
classes they apply to, or their cardinality), class intersection, etc. In terms of
expressive power, there are three "dialects" of OWL:

 • OWL Full has the maximum expressive power but does not guarantee the computability
   of logical conclusions in an ontology created with it. For example, in OWL Full a
   class can act both as a class and as an instance, which can lead to contradictions
   in the description of the ontology and make logical inference based on the existing
   rules impossible.
 • OWL DL (Description Logic) guarantees computational completeness (all logical
   conclusions are computable) and decidability (computations finish in finite time).
   OWL DL contains all OWL Full constructs, but their use is restricted.
 • OWL Lite has the least expressive power but can be used as an intermediate step in
   the transition from simple taxonomies to ontologies.
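   The property characteristics that OWL adds to RDFS (TransitiveProperty,
SymmetricProperty, inverseOf) can be illustrated by a small forward-chaining sketch.
It is written in plain Python under simplifying assumptions: the facts and property
names (`partOf`, `adjacentTo`, `controls`, `controlledBy`) are hypothetical, and the
function covers only these three characteristics, not OWL semantics as a whole.

```python
def infer(facts, transitive=(), symmetric=(), inverse_of=None):
    """Forward-chain over (subject, property, object) facts to a fixed point."""
    inverse_of = inverse_of or {}
    # Treat inverseOf as holding in both directions.
    inverses = dict(inverse_of)
    inverses.update({v: k for k, v in inverse_of.items()})
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                new |= {(s, p, o2) for s2, p2, o2 in known
                        if p2 == p and s2 == o}
        if new <= known:
            return known
        known |= new


# Hypothetical facts about an IoT installation.
facts = {
    ("room42", "partOf", "floor4"),
    ("floor4", "partOf", "buildingA"),
    ("sensor1", "adjacentTo", "sensor2"),
    ("gateway", "controls", "sensor1"),
}
result = infer(
    facts,
    transitive={"partOf"},
    symmetric={"adjacentTo"},
    inverse_of={"controls": "controlledBy"},
)
```

   The loop stops when an iteration produces no new statements, which is always reached
here because the set of possible triples over a finite vocabulary is finite.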

   Along with the advantages of OWL, which is the de facto standard for describing
ontological models, a number of its limitations can be noted. One of them is the
inability to perform calculations based on fuzzy logic. Another is that descriptions
become complex and ambiguous when the model needs to reflect an exception that refines
a general statement. For example, the statement "birds fly" requires refinement, as not
all species of birds fly.


3      Analysis of Tools for Creating Ontology Repositories

The simplest form of ontology storage is an OWL file. When such a file is read into
memory, a model (a set of statements) is created, and further work is done on it. The
obvious drawback of this approach is that memory consumption and the loading time of
OWL files grow significantly as the amount of metadata contained in the ontology
increases. This drawback, together with the need for special language tools to extract
the metadata stored in ontologies, makes it necessary to build ontology repositories
on top of a DBMS.
   In terms of structure, graph databases are the best fit for storing ontologies. In
this case, the graph vertices can be used to store ontology concepts, and the graph
edges can be used to store relationships between concepts. Vertices and edges can
contain arbitrary sets of attributes. An edge always has a start node, an end node, a
type, and a direction. Graph database management systems support methods for creating
(Create), reading (Read), modifying (Update), and deleting (Delete) data (CRUD) based
on the graph data model. Data is manipulated with SPARQL-like languages whose syntax is
close to that of SQL, the query language of relational databases (for example, the
Cypher query language in the Neo4j graph database). Graph DBMS came into active use with
the development of social networks and are now widely used, for example, to detect
fraudulent and suspicious transactions in payment systems. In such tasks, it is
important to quickly find the vertices associated with a source vertex (for example,
a person's friends in a social network, or the accounts to which funds were transferred
from a certain account).
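   The graph data model described above can be sketched in a few lines of plain Python.
This is only an in-memory illustration, not the storage model of Neo4j or any other
graph DBMS; the class names `Edge` and `PropertyGraph` and the account identifiers are
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A directed edge with a start node, an end node, a type and attributes."""
    start: str
    end: str
    etype: str
    attrs: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.vertices = {}  # vertex id -> attribute dictionary
        self.edges = []

    def add_vertex(self, vid, **attrs):
        self.vertices[vid] = attrs

    def add_edge(self, start, end, etype, **attrs):
        self.edges.append(Edge(start, end, etype, attrs))

    def successors(self, vid, etype):
        """Vertices directly connected to vid by outgoing edges of one type."""
        return [e.end for e in self.edges if e.start == vid and e.etype == etype]

    def reachable(self, source, etype):
        """All vertices reachable from source along edges of the given type."""
        seen, stack = set(), [source]
        while stack:
            for nxt in self.successors(stack.pop(), etype):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


g = PropertyGraph()
for acc in ("acc-A", "acc-B", "acc-C", "acc-D"):
    g.add_vertex(acc, kind="Account")
g.add_edge("acc-A", "acc-B", "TRANSFER", amount=100)
g.add_edge("acc-B", "acc-C", "TRANSFER", amount=95)
g.add_edge("acc-D", "acc-A", "TRANSFER", amount=10)

direct = g.successors("acc-A", "TRANSFER")
downstream = g.reachable("acc-A", "TRANSFER")
```

   The `reachable` method corresponds to the payment-system task mentioned above:
it collects every account that received funds from the source account either directly
or through intermediaries. A real graph DBMS would store adjacency per vertex instead
of scanning the whole edge list.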
   The use of graph DBMS received an additional impetus from the optimization of search
queries (fast graph traversal algorithms) and from the creation of indexes. Indexes
help optimize the search for specific nodes. Although most queries to graph databases
fetch the necessary information by traversing relationships between vertices, there are
situations where specific nodes have to be selected directly rather than discovered
during traversal. For example, to identify the nodes that serve as starting points for
a traversal, one or more specific nodes must be found by a specified combination of
attribute values (for example, all social network members under 20). To make such node
lookups more efficient, graph DBMS provide tools for creating indexes on combinations
of labels and properties.
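   The idea of an index on a combination of a label and a property can be sketched as
follows. The example is a simplified in-memory illustration and does not reproduce the
index structures of any particular graph DBMS; the node data and the names
`build_index` and `members_under` are hypothetical.

```python
import bisect
from collections import defaultdict

# Hypothetical nodes of a social network graph.
nodes = {
    1: {"label": "Member", "age": 19, "name": "Anna"},
    2: {"label": "Member", "age": 34, "name": "Boris"},
    3: {"label": "Member", "age": 17, "name": "Vera"},
    4: {"label": "Group", "name": "Chess club"},
}


def build_index(all_nodes, label, prop):
    """Map each value of prop to the ids of nodes carrying the given label."""
    index = defaultdict(set)
    for node_id, attrs in all_nodes.items():
        if attrs.get("label") == label and prop in attrs:
            index[attrs[prop]].add(node_id)
    return index


age_index = build_index(nodes, "Member", "age")
sorted_ages = sorted(age_index)  # ordered keys allow range lookups


def members_under(limit):
    """Ids of Member nodes whose age is strictly below limit."""
    cut = bisect.bisect_left(sorted_ages, limit)
    return {nid for age in sorted_ages[:cut] for nid in age_index[age]}


start_nodes = members_under(20)
```

   The lookup touches only the index entries below the limit instead of scanning every
node, and the resulting ids can then serve as starting points for a traversal.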
   Let us consider the possibilities of relational DBMS as a mechanism for creating and
storing ontologies. Traditionally, a disadvantage of relational DBMS when working with
graph data models was the lack of tools for implementing hierarchical queries. For
example, finding the vertices connected to a given vertex through any other vertices
required numerous JOIN operations, which significantly slowed down query execution. To
solve this problem, SQL:1999, unlike SQL-92, makes it possible to define a temporary
(recursive) view with the WITH clause, from which the data is then selected with a
simple SELECT command. A number of DBMS have their own means of implementing such
queries. For example, in ORACLE hierarchical queries are implemented very efficiently
with the CONNECT BY clause, and in Microsoft SQL Server 2008 the hierarchyid data type
and its built-in methods can be used to work with hierarchy levels.
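   A hierarchical query of the kind described above can be sketched with a recursive
WITH clause. The example uses SQLite through the Python standard library purely as a
convenient stand-in for a relational DBMS; the tables `concept` and `relation` and
their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relation (parent_id INTEGER, child_id INTEGER);
    INSERT INTO concept VALUES (1, 'Device'), (2, 'Sensor'),
                               (3, 'TemperatureSensor'), (4, 'Actuator');
    INSERT INTO relation VALUES (1, 2), (2, 3), (1, 4);
""")

# The recursive view collects every vertex reachable from the start vertex;
# UNION (rather than UNION ALL) drops duplicates, so cycles cannot loop forever.
rows = conn.execute("""
    WITH RECURSIVE descendants(id) AS (
        SELECT child_id FROM relation WHERE parent_id = ?
        UNION
        SELECT r.child_id
        FROM relation r JOIN descendants d ON r.parent_id = d.id
    )
    SELECT c.name
    FROM concept c JOIN descendants d ON c.id = d.id
    ORDER BY c.name
""", (1,)).fetchall()

names = [name for (name,) in rows]
conn.close()
```

   A single statement replaces the chain of self-joins that would otherwise be needed,
with one JOIN per level of the hierarchy and the number of levels known in advance.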
   Summarizing the above, relational DBMS are more efficient than graph DBMS at
searching data by conditions imposed on attribute values (the task of selecting graph
nodes directly, without taking relationships into account), whereas graph DBMS more
efficiently execute queries that take into account the relationships between vertices.
   The article [8] describes an approach to semantic data search that combines the
advantages of graph and relational DBMS. In this approach, a set of interrelated
concepts reflecting the semantics of the subject area is stored as a graph, while the
data itself is stored in relational tables that contain a significant number of
records. First, the query selects graph vertices according to the specified conditions
imposed on the relationships between concepts. The selected vertices contain the
attribute values by which records are then searched in the relational tables. The search
over the graph model is performed by the Neo4j DBMS, and the search for records in the
relational tables by the MySQL DBMS.
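   The two-step search described above can be outlined as follows. This is a sketch
under stated assumptions and not the implementation from [8]: a Python dictionary
stands in for the graph stored in Neo4j, SQLite stands in for MySQL, and the concept
names, the `code` attribute and the `device` table are hypothetical.

```python
import sqlite3

# Semantic layer: each concept vertex holds an attribute value ("code") and
# outgoing "includes" relationships to other concepts.
concepts = {
    "ClimateControl": {"code": None, "includes": ["Heating", "Ventilation"]},
    "Heating": {"code": "HEAT", "includes": []},
    "Ventilation": {"code": "VENT", "includes": []},
    "Lighting": {"code": "LIGHT", "includes": []},
}


def select_codes(start):
    """Step 1: traverse 'includes' relationships and collect vertex codes."""
    codes, stack, seen = [], [start], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        vertex = concepts[name]
        if vertex["code"] is not None:
            codes.append(vertex["code"])
        stack.extend(vertex["includes"])
    return sorted(codes)


# Data layer: records are kept in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO device VALUES (?, ?)",
    [("boiler-1", "HEAT"), ("fan-3", "VENT"), ("lamp-9", "LIGHT")],
)


def find_devices(concept):
    """Step 2: search relational records by the values found in the graph."""
    codes = select_codes(concept)
    if not codes:
        return []
    placeholders = ", ".join("?" for _ in codes)
    rows = conn.execute(
        f"SELECT name FROM device WHERE category IN ({placeholders}) "
        "ORDER BY name",
        codes,
    ).fetchall()
    return [name for (name,) in rows]


devices = find_devices("ClimateControl")
```

   The relationship-heavy part of the query is resolved on the graph, and only the
resulting attribute values are passed to the relational engine, which is the division
of labor suggested by the comparison of the two kinds of DBMS above.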
   Analyzing the possibility of using graph and relational DBMS to work with
ontologies, it should be noted that they have no specialized mechanisms supporting
axioms and rules for inferring statements from the relationships between concepts. To
implement these mechanisms, software packages have to be written in procedural
languages, which is inefficient for large amounts of stored data.
   The ORACLE database, which today can be classed as object-relational, requires
separate consideration here. ORACLE 11g implements mechanisms united under the name
Semantic Technologies. This version provides the ability to export and import OWL
structures and supports the OWLPrime ontology description language, which is a subset of
the above-mentioned OWL DL and includes the following features:
 • Creating the ontology structure (class, subclass, property, subproperty, domain,
   range, type);
 • Specifying characteristics of properties (transitive, symmetric, functional, inverse
   functional, inverse);
 • Comparing classes (equivalence and disjointness);
 • Comparing properties (equivalence);
 • Comparing entities (same, different);
 • Setting property restrictions (hasValue, someValuesFrom, allValuesFrom).

    To support OWLPrime, more than 50 rules are implemented that are used in the
logical inference process. A rule consists of a condition ("if"), a filter (an additional
condition), and an output ("then"). ORACLE 11g added the ability to define custom rules
in the OWLIF language (IF-THEN constructs). Specific restrictions can be set on the rules
that a user can create. For example, a user can be allowed to create only inference
rules within the subClassOf hierarchy, and the number of inference steps can be limited.
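    The structure of such a rule (condition, filter, output) and the limit on the number
of inference steps can be illustrated with a generic sketch. It is written in plain
Python and does not reproduce the ORACLE rule syntax or API; the names `Rule`, `run`
and `subclass_pairs` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable  # "if": produces candidate bindings from known facts
    filter: Callable     # additional test applied to each binding
    output: Callable     # "then": builds a new fact from a binding


def run(facts, rules, max_steps):
    """Apply the rules repeatedly, but for no more than max_steps passes."""
    known = set(facts)
    for _ in range(max_steps):
        new = {rule.output(b)
               for rule in rules
               for b in rule.condition(known)
               if rule.filter(b)} - known
        if not new:
            break
        known |= new
    return known


def subclass_pairs(known):
    """Bindings (a, b, d) such that a subClassOf b and b subClassOf d."""
    sub = [(s, o) for s, p, o in known if p == "subClassOf"]
    return [(a, b, d) for a, b in sub for c, d in sub if b == c]


subclass_chain = Rule(
    name="subClassOf transitivity",
    condition=subclass_pairs,
    filter=lambda b: b[0] != b[2],  # skip trivial self-relations
    output=lambda b: (b[0], "subClassOf", b[2]),
)

facts = {
    ("A", "subClassOf", "B"),
    ("B", "subClassOf", "C"),
    ("C", "subClassOf", "D"),
}
one_step = run(facts, [subclass_chain], max_steps=1)
full = run(facts, [subclass_chain], max_steps=10)
```

    With a single step the rule derives only the statements that follow from the
original facts, so the relation between A and D appears only when more steps are
allowed.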
    Requests to extract information from ontologies in ORACLE 11g are made using
SPARQL. The SEM_RULEBASES construct is used to attach user-created inference rules to
SPARQL queries.


4      Summary

The paper shows the possibility of using graph and relational DBMS to create ontology
repositories.
   Based on the above, it can be concluded that currently it is most appropriate to use
the ORACLE 11g DBMS to create ontology repositories used to ensure interoperability of
IS. However, it should be noted that the pre-built inference rules mentioned above will
be less effective for ontologies of this type, because they are oriented mainly toward
ontologies whose vertices are connected by class-subclass relationships. An ontology
that can be used to provide interoperability, as shown in [7], also contains vertices of
other types, e.g., "Association" and "Action". Processing such vertices requires writing
the custom inference rules mentioned earlier.
   As a positive trend, it should be noted that both relational and graph DBMS now
have mechanisms for working with ontological structures. To date, as indicated above,
such mechanisms are most developed in the ORACLE 11g database, where they are named
Semantic Technologies. It can be predicted that these mechanisms will continue to
develop, and at some point it may become more convenient to create ontology
repositories by means of other DBMS (for example, Neo4j) or to use graph and relational
DBMS together [8].


References

1.   Schwab K. The Fourth Industrial Revolution: what it means, how to respond, World
     Economic Forum, (WEF-2016)
     https://www.weforum.org/agenda/2016/01/the-fourth-industrial-revolution-what-it-means-and-how-to-respond/,
     last accessed 2019/03/30.
2.   Boychenko A., Korneev D., Kozlova O. Approach to solving the problem of ensuring
     interoperability of services. CONFERENCE Enterprise Engineering and Knowledge
     Management (EEKM-2013), pp. 198-206 (2013).
3.   Boychenko A., Korneev D. Description of interoperability technology using the OSE/RM
     model. CONFERENCE Enterprise Engineering and Knowledge Management (EEKM-2014),
     pp. 212-218 (2014).
4.   ISO/IEC 24765-Systems and Software Engineering-Vocabulary,
     http://www.cse.msu.edu/~cse435/Handouts/Standards/IEEE24765.pdf,
     last accessed 2019/03/30.
5.   EIF - European Interoperability Framework,
     http://ec.europa.eu/idabc/en/document/2319/5938.html, last accessed 2019/03/30.
6.   ISO/IEC 42010 Systems and software engineering - Recommended practice for
     architectural description of software-intensive system,
     https://www.iso.org/standard/45991.html, last accessed 2019/03/30.
7.   Boychenko A., Korneev D., Kazakov V. Development of ontology structure for semantic
     interoperability of information systems. CONFERENCE Enterprise Engineering and
     Knowledge Management (EEKM-2018), pp. 163-172 (2018).
8.   Boychenko A., Korneev D., Kazakov V. Data search based on semantic model.
     CONFERENCE Enterprise Engineering and Knowledge Management (EEKM-2015),
     pp. 123-132 (2015).