<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On an Approach to Data Integration: Concept, Formal Foundations and Data Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
<string-name>Manuk G. Manukyan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Proceedings of the XIX International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL'2017)</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Yerevan State University</institution>
          ,
          <addr-line>Yerevan</addr-line>
          ,
          <country country="AM">Armenia</country>
        </aff>
      </contrib-group>
      <fpage>206</fpage>
      <lpage>213</lpage>
      <abstract>
<p>Within the framework of an extensible canonical data model, a formalization of the data integration concept is proposed. We provide virtual and materialized integration of data, as well as the possibility to support data cubes with hierarchical dimensions. The considered approach to formalizing the data integration concept is based on so-called content dictionaries: by means of these dictionaries we formally define the basic concepts of database theory, metadata about these concepts, and the data integration concept. A computationally complete language is used to extract data from several sources, to create materialized views, and to organize queries on multidimensional data efficiently. In memory of Garush Manukyan, my father. This work was supported by the RA MES State Committee of Science, in the frames of the research project N 15T-18350.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The emergence of a new paradigm in science and various applications of information technology (IT) is related to issues of big data handling [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The concept of big data is relatively new and reflects the growing role of data in all areas of human activity, from research to innovative developments in business. Such data is difficult to process and analyze using conventional database technologies. In this connection, the creation of new IT is expected, in which data becomes dominant for new approaches to the conceptualization, organization, and implementation of systems that solve problems previously considered extremely hard or, in some cases, impossible to solve.
      </p>
      <sec id="sec-1-1">
        <p>The unprecedented scale of development in the big data area and the U.S. and European programs related to big data underscore the importance of this trend in IT.</p>
        <p>
          In the above discussed context, the problems of data integration are highly relevant. Within our approach to data integration, an extensible canonical model has been developed [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. We have published a number of papers devoted to the investigation of virtual and materialized data integration problems, for instance [
          <xref ref-type="bibr" rid="ref15 ref17">15, 17</xref>
          ]. Our approach to data integration is based on the works of the SYNTHESIS group (IPI RAS) [
          <xref ref-type="bibr" rid="ref10 ref11 ref12 ref2 ref22 ref23 ref24 ref25 ref9">2, 9–12, 22–25</xref>
          ], who are pioneers in the area of justifiable data model mappings for heterogeneous database integration. To support materialized integration of data during the creation of a data warehouse, a new dynamic index structure for multidimensional data was proposed [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which is based on the grid file [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] concept. We consider the grid file concept one of the adequate formalisms for effective management of big data. Efficient algorithms for storing and accessing the grid directory are proposed in order to minimize memory usage and the complexity of lookup operations, and complexity estimates for these algorithms are presented. In fact, the grid file concept makes it possible to organize queries on multidimensional data efficiently [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] and can be used for efficient storage of data cubes in data warehouses [
          <xref ref-type="bibr" rid="ref13 ref19">13, 19</xref>
          ].
        </p>
      </sec>
      <sec id="sec-1-2">
        <p>A prototype to support the considered dynamic indexation scheme has been created, and its performance was compared with one of the most in-demand NoSQL databases [<xref ref-type="bibr" rid="ref17">17</xref>].</p>
        <p>
          In this paper a formalization of the data integration concept is proposed using the content dictionary mechanism (similar to ontologies) of OPENMath [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The subjects of the formalization are the basic concepts of database theory, metadata about these concepts, and the data integration concept. The result of the formalization is a set of content dictionaries, constructed as XML DTDs on the basis of OPENMath and used to model database concepts. With this approach, the schema of an integrated database is an instance of the content dictionary of the data integration concept. The considered approach provides virtual and materialized integration of data as well as the possibility to support data cubes with hierarchical dimensions.
        </p>
      </sec>
      <sec id="sec-1-3">
        <p>Using OPENMath as the kernel of the canonical data model allows us to use a rich apparatus of computational mathematics for data analysis and management.</p>
        <p>The paper is organized as follows: the concept and formal foundations of the considered approach to data integration are presented briefly in Section 2; the canonical data model and issues of supporting the data integration concept are considered in Section 3; the conclusion is provided in Section 4.</p>
        <p>2 Brief Discussion on Data Integration Approach</p>
        <p>Our concept of data integration is based on the idea of integrating arbitrary data models. Based on this assumption, our concept of data integration assumes:
• applying an extensible canonical model;
• constructing justifiable data model mappings for heterogeneous database integration;
• using content dictionaries.</p>
        <p>
          Choosing the extensible canonical model as the integration model allows integrating arbitrary data sources. As we allow integration of arbitrary data sources, a necessity arises to check the correctness of mappings between data models. This is achieved by formalizing data model concepts by means of AMN machines [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and using B-technology to prove the correctness of these mappings.
        </p>
      </sec>
      <sec id="sec-1-4">
        <p>The content dictionaries are central to our concept of data integration, and semantic information of different types can be defined based on these dictionaries. The concept of content dictionaries allows us to extend the canonical model easily by introducing new concepts into these dictionaries. In other words, canonical model extension reduces to adding new concepts, and metadata about these concepts, to content dictionaries. Our concept of data integration is oriented toward virtual and materialized integration of data, as well as toward supporting data cubes with hierarchical dimensions. It is important that in all cases we use the same data model. The considered data model is an advanced XML data model, which is more flexible than relational or object-oriented data models. Among XML data models, a distinctive feature of our model is that we use a computationally complete language for data definition. An important feature of our concept is the support of data warehouses on the basis of a new dynamic indexing scheme for multidimensional data. The new index structure developed by us makes it possible to organize OLAP queries on multidimensional data efficiently and can be used for efficient storage of data cubes in data warehouses. Finally, the modern trends in the development of database systems lead to the use of different branches of mathematics for data analysis. Within our concept of data integration, this leads to the use of the corresponding content dictionaries of OPENMath.</p>
        <sec id="sec-1-4-1">
          <title>2.1 Formal Foundations</title>
        </sec>
      </sec>
      <sec id="sec-1-5">
        <p>The above discussed concept of data integration is based on the following formalisms:
• canonical data model;
• OPENMath objects;
• multidimensional indexes;
• domain element calculus.</p>
      </sec>
      <sec id="sec-1-6">
        <p>Below we will consider these formalisms in detail.</p>
      </sec>
      <sec id="sec-1-7">
        <p>As we noted, our approach to data integration is based on the works of the SYNTHESIS group.</p>
        <p>Figure 1 DDL mapping diagram</p>
        <p>According to the research of the SYNTHESIS group, each data model is defined by the syntax and semantics of two languages: the data definition language (DDL) and the data manipulation language (DML). This group suggested the following principles for the synthesis of the canonical model:</p>
        <p>• Principle of axiomatic extension of data models</p>
        <p>The canonical data model must be extensible. The kernel of the canonical model is fixed; kernel extension is defined axiomatically. The extension of the canonical data model is formed during the consideration of each new data model by adding new axioms to its DDL, if necessary, to define logical data dependencies of the source model in terms of the target model. The result of the extension should be equivalent to the source data model.</p>
        <p>• Principle of commutative mappings of data models</p>
      </sec>
      <sec id="sec-1-8">
        <p>The main principle of mapping of an arbitrary resource data model into the target one (the canonical model) can be achieved under the condition that the diagram of DDL (schemas) mapping and the diagram of DML (operators) mapping are commutative.</p>
      </sec>
      <sec id="sec-1-9">
        <p>Figure 1 relates schemas and databases through semantic functions; the mapping between DB_CM and DB_SM is bijective.</p>
      </sec>
      <sec id="sec-1-10">
        <p>In Figure 1 we used the following notations:
• SCH_CM: set of schemas of the canonical data model;
• SCH_SM: set of schemas of the source data model;
• DB_CM: database of the canonical data model;
• DB_SM: database of the source model.</p>
      </sec>
      <sec id="sec-1-14">
        <p>Figure 2 DML mapping diagram (operator set OP_CM and procedure set P_SM; semantic functions relate them to the databases DB_CM and DB_SM)</p>
      </sec>
      <sec id="sec-1-15">
        <p>In Figure 2 we used the following notations:
• OP_CM: set of operators of the canonical data model;
• P_SM: set of procedures in the DML of the source model.</p>
        <p>• Principle of synthesis of a unified canonical data model</p>
      </sec>
      <sec id="sec-1-17">
        <p>The canonical data model is synthesized as a union of extensions.</p>
        <p>Content dictionaries are used to assign formal and informal semantics to all symbols used in the OPENMath objects. A content dictionary is a collection of related symbols encoded in XML format. In other words, each content dictionary defines symbols representing a concept from a specific subject domain.</p>
        <p>Figure 3 Canonical data model</p>
      </sec>
      <sec id="sec-1-18">
        <p>2.2 Mathematical Objects Representation</p>
        <p>OpenMath is a standard for the representation of mathematical objects, allowing them to be exchanged between computer programs, stored in databases, or published on the Web. The considered formalism is oriented toward representing semantic information and is not intended to be used directly for presentation. Any mathematical concept or fact is an example of a mathematical object. OpenMath objects are representations of mathematical objects that admit an XML interpretation.</p>
      </sec>
      <sec id="sec-1-19">
        <p>
          Formally, an OpenMath object is a labeled tree whose leaves are the basic OpenMath objects. The compound objects are defined in terms of binding and application of λ-calculus [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The type system is built on the basis of basic types and certain recursive rules, whereby compound types are built from simpler types. To build compound types the following type constructors are used:
        </p>
      </sec>
      <sec id="sec-1-20">
        <p>• Attribution. If v is a basic object variable and t is a typed object, then attribution (v, type t) is a typed object. It denotes a variable with type t.</p>
      </sec>
      <sec id="sec-1-21">
        <p>• Abstraction. If v is a basic object variable and t, A are typed objects, then binding (λ, attribution (v, type t), A) is a typed object.</p>
      </sec>
      <sec id="sec-1-22">
        <p>• Application. If F and A are typed objects, then application (F, A) is a typed object.</p>
        <p>OPENMath is implemented as an XML application. Its syntax is defined by the syntactical rules of XML, and its grammar is partially defined by its own DTD. Only the syntactical validity of the representation of OPENMath objects can be provided at the DTD level. To check semantics, in addition to the general rules inherited by XML applications, the considered application defines new syntactical rules. This is achieved by means of the introduction of content dictionaries.</p>
        <p>[Figure: an OpenMath-style typed object for a book element: book is attributed a type built by applying sequence and OneOrMore, with title and author attributed type string.]</p>
        <p>2.3 Dynamic Indexing Scheme for Multidimensional Data</p>
        <p>
          To support the materialized integration of data during the creation of a data warehouse and to apply very complex OLAP queries over it, a new dynamic index structure for multidimensional data was developed (see more details in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]). The considered index structure is based on the grid file concept. The grid file can be represented as if the space of points is partitioned into an imaginary grid. The grid lines parallel to the axes of each dimension divide the space into stripes. The number of grid lines in different dimensions may vary, and there may be different spacings between adjacent grid lines, even between lines in the same dimension. Intersections of these stripes form cells, which hold references to data buckets containing the records belonging to the corresponding space partitions.
        </p>
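<p>The imaginary-grid partitioning can be sketched as follows (a minimal illustration assuming per-dimension lists of grid-line coordinates; the names are ours, not the paper's):</p>

```python
from bisect import bisect_right

# Grid lines per dimension; spacings may differ, even within one dimension.
grid_lines = {
    "X": [10, 20, 40],   # 4 stripes along X
    "Y": [5, 50],        # 3 stripes along Y
}

def cell_of(point):
    """Map a point {dim: value} to the index of the stripe it falls into, per dimension."""
    return tuple(bisect_right(grid_lines[d], point[d]) for d in sorted(grid_lines))

cell = cell_of({"X": 25, "Y": 3})   # stripe 2 on X, stripe 0 on Y
```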
        <p>The weaknesses of the grid file concept are inefficient memory usage by groups of cells referring to the same data buckets and the possibility of a large number of overflow blocks for each data bucket. In our approach, we made an attempt to eliminate these defects of the grid file. Firstly, we introduced the concept of the chunk: a set of cells whose corresponding records are stored in the same data bucket (represented by a single memory cell with one pointer to the corresponding data bucket). The chunking technique is used to solve the problem of empty cells in the grid file.</p>
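<p>The chunk idea, grouping all cells that point to the same data bucket behind a single pointer, can be sketched as follows (illustrative only):</p>

```python
# Cells that reference the same data bucket are merged into one chunk,
# stored once with a single pointer, instead of one pointer per cell.
cell_to_bucket = {
    (0, 0): "b1", (0, 1): "b1",   # two cells, same bucket -> one chunk
    (1, 0): "b2",
}

chunks = {}                        # bucket id -> the chunk's set of cells
for cell, bucket in cell_to_bucket.items():
    chunks.setdefault(bucket, set()).add(cell)

assert len(chunks) == 2            # 3 cells, but only 2 directory entries
```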
        <p>[Figure: a grid file over dimensions X, Y, Z with grid partitions u1–u3, v1–v3, w1–w2 and the corresponding data buckets.]</p>
      </sec>
      <sec id="sec-1-23">
        <p>Secondly, we consider each stripe as a linear hash table, which allows the number of buckets to grow more slowly (for each stripe, the average number of overflow blocks of the chunks crossed by that stripe is less than one). By using this technique we essentially restrict the number of disk operations.</p>
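<p>A toy version of the stripe-as-linear-hash-table idea might look like this (a simplified sketch of classical linear hashing, not the paper's implementation):</p>

```python
class LinearHashStripe:
    """Toy linear hash table: buckets are added one at a time, so growth is gradual."""
    def __init__(self):
        self.level, self.split = 0, 0          # 2**level buckets at the current level
        self.buckets = [[]]

    def _index(self, key):
        i = hash(key) % (2 ** self.level)
        if i < self.split:                     # already-split region uses the next level
            i = hash(key) % (2 ** (self.level + 1))
        return i

    def insert(self, key, capacity=2):
        self.buckets[self._index(key)].append(key)
        if len(self.buckets[self._index(key)]) > capacity:
            self._split()                      # add exactly one bucket

    def _split(self):
        self.buckets.append([])
        old, self.buckets[self.split] = self.buckets[self.split], []
        for k in old:                          # redistribute at the finer level
            self.buckets[hash(k) % (2 ** (self.level + 1))].append(k)
        self.split += 1
        if self.split == 2 ** self.level:      # whole level split: start the next one
            self.level, self.split = self.level + 1, 0
```

Because only one bucket is split per overflow, the directory grows incrementally rather than doubling, which is the property the text relies on.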
        <p>[Figure: a chunk formed by imaginary divisions of stripes.]</p>
        <p>
          Compared to the MDH and MEH techniques, the directory size in our approach is smaller by corresponding factors [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. We have implemented a data warehouse prototype based on the proposed dynamic indexation scheme and compared its performance with MongoDB [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] (see [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]).
        </p>
        <sec id="sec-1-23-1">
          <title>2.4 Element Calculus</title>
          <p>
            In the frame of our approach to data integration, we consider an advanced XML data model as the integration model. In fact, the data model defines the query language [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]. Based on this, to express declarative queries, a new query language (domain element calculus) [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] was developed. A query to an XML database is a formula in the element calculus language. To specify formulas, a variant of the multisorted first-order predicate logic language is used. Notice that element calculus is developed in the style of object calculus [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ]. In addition, there is a possibility to express queries by means of λ-expressions.
          </p>
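<p>A calculus-style query and its λ-expression form can be illustrated with plain Python predicates (purely illustrative; the actual element calculus operates on XML elements, not dicts):</p>

```python
# Domain elements: dicts standing in for XML elements.
autos = [
    {"SerialNo": 1, "Model": "Gobi", "Color": "red"},
    {"SerialNo": 2, "Model": "Gobi", "Color": "blue"},
]

# Calculus-style query: {a | a in autos and phi(a)}
phi = lambda a: a["Model"] == "Gobi" and a["Color"] == "red"
answer = [a for a in autos if phi(a)]

# The same query expressed directly as a lambda-expression over the domain;
# both variants can be combined, since a query is just a function value.
query = lambda pred: [a for a in autos if pred(a)]
assert query(phi) == answer
```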
        </sec>
      </sec>
      <sec id="sec-1-24">
        <p>Generally, we can combine the considered variants of queries.</p>
        <p>3 Extensible Canonical Data Model</p>
        <p>
          The canonical model kernel is an advanced XML
data model: a minor extension of the OPENMath to
support the concept of databases. The main difference
between our XML data model and analogous XML data
models (in particular, XML Schema) is that the concept
of data types in our case is interpreted conventionally
(set of values, set of operations). More details about the
type system of the XML Schema can be found in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. A
data model concept formalized on the kernel level is
referred to as kernel concept.
        </p>
        <sec id="sec-1-24-1">
          <title>3.1 Kernel Concepts</title>
        </sec>
      </sec>
      <sec id="sec-1-25">
        <p>In the frame of the canonical data model we distinguish basic and compound concepts. Formally, a kernel concept is a labeled tree whose leaves are basic kernel concepts. Examples of basic kernel concepts are constants, variables, and symbols (for instance, reserved words).</p>
      </sec>
      <sec id="sec-1-26">
        <p>The compound concepts are defined in terms of binding and application of λ-calculus. The type system is built analogously to that of OPENMath.</p>
        <sec id="sec-1-26-1">
          <title>3.2 Extension Principle</title>
          <p>As we noted above, the canonical data model must be extensible. The extension of the canonical model is formed during the consideration of each new data model by adding new concepts to its DDL, if necessary, to define logical data dependencies of the source model in terms of the target model. Thus, the canonical model extension assumes defining new symbols. The extension result must be equivalent to the source data model.</p>
        </sec>
      </sec>
      <sec id="sec-1-27">
        <p>To apply a symbol at the canonical model level, the following rule has been proposed:</p>
      </sec>
      <sec id="sec-1-28">
        <p>Concept symbol ContextDefinition.</p>
      </sec>
      <sec id="sec-1-29">
        <p>For example, to support the concept of key of the relational data model, we have extended the canonical model with the symbol key. Let us consider a relational schema example:</p>
        <p>S = {S#, Sname, Status, City}.</p>
      </sec>
      <sec id="sec-1-30">
        <p>The equivalent definition of this schema by means of the extended kernel is considered below:
attribution (S, type TypeContext, constraint ConstraintContext)
TypeContext application (sequence, ApplicationContext)
ApplicationContext attribution (S#, type int),
attribution (Sname, type string),
attribution (Status, type int),
attribution (City, type string))
ConstraintContext attribution (name, key S#).</p>
      </sec>
      <sec id="sec-1-35">
        <p>It is essential that we use a computationally complete language to define the context [<xref ref-type="bibr" rid="ref14">14</xref>]. As a result of this approach, the usage of new symbols in the DDL does not lead to any changes in the DDL parser.</p>
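<p>To illustrate why a new symbol such as key requires no parser changes, the context can be held as plain data and interpreted separately (a sketch with hypothetical helper names, not the paper's actual DDL machinery):</p>

```python
# Contexts are plain data; symbols such as "key" are just tags interpreted later.
schema_S = (
    "attribution", "S",
    {"type": ("application", "sequence",
              [("attribution", "S#", {"type": "int"}),
               ("attribution", "Sname", {"type": "string"}),
               ("attribution", "Status", {"type": "int"}),
               ("attribution", "City", {"type": "string"})]),
     "constraint": ("attribution", "name", {"key": "S#"})},
)

def key_attributes(defn):
    """Interpret the 'key' symbol without touching any parser."""
    constraint = defn[2]["constraint"]
    return [v for k, v in constraint[2].items() if k == "key"]
```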
        <sec id="sec-1-35-1">
          <title>3.3 Semantic Level</title>
          <p>The canonical model is an XML application. Only the syntactical validity of the representation of canonical model concepts can be provided at the DTD level. To check semantics, the considered application defines new syntactical rules, which we define in content dictionaries.</p>
        </sec>
        <sec id="sec-1-35-2">
          <title>3.4 Content Dictionaries</title>
        </sec>
      </sec>
      <sec id="sec-1-36">
        <p>The content dictionary is the main formalism for defining semantic information about concepts of the canonical data model. In other words, content dictionaries are used to assign formal and informal semantics to all concepts of the canonical data model. A content dictionary is a collection of related symbols, encoded in XML format, and fixes the “meaning” of concepts independently of the application. Three kinds of content dictionaries are considered:
• content dictionaries to define basic concepts (symbols);
• content dictionaries to define signatures of basic concepts (mathematical symbols) to check the semantic validity of their representation;
• a content dictionary to define the data integration concept.</p>
      </sec>
      <sec id="sec-1-37">
        <p>Supporting the above considered content dictionaries assumes developing the corresponding DTDs; instances of such DTDs are XML documents. An instance of a DTD of a content dictionary of basic concepts is used to assign formal and informal semantics to those concepts. Finally, an instance of a DTD of a content dictionary of a signature of basic concepts contains metainformation about these concepts, and an instance of a DTD of a content dictionary of the data integration concept is metadata for integrating databases.</p>
      </sec>
      <sec id="sec-1-39">
        <sec id="sec-1-39-1">
          <title>3.5 Data Integration Concept</title>
          <p>In the frame of our approach to data integration, we consider virtual as well as materialized data integration issues within a canonical model. Therefore, we should formalize the concepts of this subject area, such as mediator, data warehouse, and data cube. We model these concepts by means of the following XML elements: dbsch, med, whse, and cube.</p>
        </sec>
      </sec>
      <sec id="sec-1-40">
        <p>Mediator. The content of the element dbsch is based on the kernel attribution concept and has an attribute name. By means of this concept we can model database schemas. The value of the attribute name is the DB's name. The content of the element med is based on the elements msch, wrapper, and constraint, and has an attribute name. The value of this attribute is the mediator's name.</p>
      </sec>
      <sec id="sec-1-42">
        <p>
          The element msch is interpreted analogously to the element dbsch; note only that this element is used when modelling the schemas of a mediator. The content of the elements wrapper and constraint is based on the kernel application concept. By means of the wrapper element, mappings from source models into the canonical model are defined. The integrity constraints at the mediator level are the values of the constraint elements. It is important that we use a computationally complete language for defining the mappings and integrity constraints. Below, an example of a mediator for an automobile company database is adduced [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], which is an instance of a content dictionary of the data integration concept. It is assumed that the mediator with schema AutosMed = {SerialNo, Model, Color} integrates two relational sources: Cars = {SerialNo, Model, Color}, and Autos = {Serial, Model}, Colors = {Serial, Color}.
&lt;cd name = ‘dic’&gt;
&lt;dbsch name = ‘Source1’&gt;
&lt;omattr&gt;
schema definition of Cars
&lt;/omattr&gt;
&lt;/dbsch&gt;
&lt;dbsch name = ‘Source2’&gt;
&lt;omattr&gt;
schema definition of Autos
&lt;/omattr&gt;
&lt;omattr&gt;
schema definition of Colors
&lt;/omattr&gt;
&lt;/dbsch&gt;
&lt;med name = ‘Example’&gt;
&lt;msch&gt;
&lt;omattr&gt;
AutosMed: schema for mediator is defined
&lt;/omattr&gt;
&lt;/msch&gt;
&lt;wrapper&gt;
&lt;oma&gt;
&lt;oms name = ‘convert_to_xml’ cd = ‘xml’/&gt;
&lt;oma&gt;
&lt;oms name = ‘union’ cd = ‘db’/&gt;
&lt;omv name = ‘Cars’/&gt;
&lt;oma&gt;
&lt;oms name = ‘join’ cd = ‘db’/&gt;
&lt;omv name = ‘Autos’/&gt;
&lt;omv name = ‘Colors’/&gt;
&lt;/oma&gt;
&lt;/oma&gt;
&lt;/oma&gt;
&lt;/wrapper&gt;
&lt;/med&gt;
&lt;/cd&gt;
        </p>
      </sec>
      <sec id="sec-1-43">
        <p>It is essential that we use a computationally complete language to model the mediator's work.</p>
        <p>Data warehouse. As we noted above, the considered approach to support data warehousing is based on the grid file concept and is interpreted by means of the element whse. This element is defined as a kernel application concept, is based on the elements wsch, extractor, and grid, and has an attribute name. The value of this attribute is the name of the data warehouse. The element wsch is interpreted in the same way as the element msch for the mediator. The element extractor is defined as a kernel application concept and is used to extract data from source databases. The element grid is defined as a kernel application concept and is based on the elements dim and chunk, by which the grid file concept is modelled. To model the concept of a stripe of a grid file, we introduced an empty element stripe, which is described by means of five attributes: ref_to_chunk, min_val, max_val, rec_cnt, and chunk_cnt. The values of the attribute ref_to_chunk are pointers to the chunks crossed by each stripe. By means of the min_val (lower boundary) and max_val (upper boundary) attributes we define the "widths" of the stripes. The values of the attributes rec_cnt and chunk_cnt are the total number of records in a stripe and the number of chunks crossed by it, correspondingly. To model the chunk concept we introduced an element chunk, which is based on the empty element avg and is described by means of four attributes: id of type ID, qty, ref_to_db, and ref_to_chunk. The values of the attributes ref_to_db and ref_to_chunk are pointers to data blocks and to other chunks, correspondingly. The value of the attribute qty is the number of distinct points of the considered chunk for a fixed dimension. The element avg is described by means of two attributes: value and dim. The values of the value attribute are used during reorganization of the grid file and contain the average coordinates of the points corresponding to the records of the considered chunk, for each dimension. The value of the attribute dim is the name of the corresponding dimension. To model the concept of a dimension of a grid file we introduced an element dim, which is based on the empty element stripe and has a single attribute name, i.e. the dimension name.</p>
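<p>The stripe, chunk, and avg elements and their attributes can be sketched as records; the attribute names follow the text, while the classes themselves are only illustrative.</p>

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Avg:
    value: float            # average coordinate of the chunk's points in one dimension
    dim: str                # dimension name

@dataclass
class Chunk:
    id: str
    qty: int                # number of distinct points for a fixed dimension
    ref_to_db: List[str]    # pointers to data blocks
    ref_to_chunk: List[str] # pointers to other chunks
    avg: List[Avg] = field(default_factory=list)

@dataclass
class Stripe:
    ref_to_chunk: List[str] # chunks crossed by this stripe
    min_val: float          # lower boundary (stripe "width" starts here)
    max_val: float          # upper boundary
    rec_cnt: int            # total records in the stripe
    chunk_cnt: int          # number of chunks crossed by the stripe

s = Stripe(ref_to_chunk=["c1", "c2"], min_val=0.0, max_val=10.0, rec_cnt=120, chunk_cnt=2)
```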
        <p>
          Data cube. Materialized integration of data assumes the creation of data warehouses. Our approach to creating data warehouses is mainly oriented toward supporting data cubes. Using data warehousing technologies in OLAP applications is very important [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Firstly, the data warehouse is a necessary tool to organize and centralize corporate information in order to support OLAP queries (source data are often distributed over heterogeneous sources). Secondly, significant is the fact that OLAP queries, which are very complex in nature and involve large amounts of data, require too much time to perform in a traditional transaction processing environment. To model the data cube concept we introduced an element cube, which is interpreted by means of the following elements: felement, delement, fcube, rollup, mview, and granularity. In typical OLAP applications, a collection of data called the fact_table, which represents events or objects of interest, is used [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Usually, the fact_table contains several attributes representing dimensions, and one or more dependent attributes that represent properties of the point as a whole. To model the fact_table concept we introduced an element felement, which is based on the kernel attribution concept. To model the concept of a dimension we introduced an element delement. This element is based on the empty element element, which is described by means of the attribute name. The value of the attribute name is the dimension name. The creation of the data cube requires generation of the power set (the set of all subsets) of the aggregation attributes. To implement the formal data cube concept, the CUBE operator is considered in the literature [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. In addition to the CUBE operator, in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] the ROLLUP operator is introduced as a special variety of the CUBE operator, which produces the additional aggregated information only if it aggregates over a tail of the sequence of grouping attributes. To support these operators we introduced the cube and rollup symbols, correspondingly. In this context, it is assumed that all independent attributes are grouping attributes. For some dimensions there are many degrees of granularity that could be chosen for grouping on that dimension. When the number of choices for grouping along each dimension grows, it becomes inefficient to store the results of aggregation based on all the subsets of groupings. Thus, it becomes reasonable to introduce materialized views.
        </p>
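<p>The groupings produced by the CUBE (power set) and ROLLUP (tails of the grouping sequence) operators can be illustrated directly; a sketch using Python's itertools, with hypothetical dimension names.</p>

```python
from itertools import combinations

dims = ["Model", "Color", "Dealer"]

# CUBE: group by every subset of the grouping attributes (the power set)
cube_groupings = [set(c) for r in range(len(dims) + 1)
                  for c in combinations(dims, r)]

# ROLLUP: aggregate only over tails of the grouping-attribute sequence
rollup_groupings = [dims[:i] for i in range(len(dims), -1, -1)]

assert len(cube_groupings) == 2 ** len(dims)          # 8 subsets for 3 dimensions
assert rollup_groupings == [["Model", "Color", "Dealer"],
                            ["Model", "Color"], ["Model"], []]
```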
        <p>Figure 7 Examples of lattices of partitions for time intervals (All, Years, Quarters, Months, Weeks, Days) and automobile dealers (All, State, City, Dealer)</p>
        <p>
          Materialized views. A materialized view is the result
of some query, stored in the database, which does not
contain all aggregated values. To model the materialized
view concept we introduce an element mview, which is
interpreted by means of an element view; the latter is
based on the kernel attribution concept. When
implementing a query over a hierarchical dimension, the
problem of choosing an effective materialized view
arises. For example, if we have aggregated values for the
granularities Months and Quarters, then for aggregation
at the granularity Years it is more effective to query the
materialized view with granularity Quarters. As in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], we also consider
the lattice (a partially ordered set) as a relevant
construction for formalizing the hierarchical dimension. The
lattice nodes correspond to the units of the partitions of
a dimension. In general, the set of partitions of a
dimension is a partially ordered set. We say that partition P1
precedes partition P2, written P1 ≤ P2, if and only if there
is a path from node P1 to node P2. Based on the lattices
for each dimension we can define a lattice of all the
possible materialized views of a data cube, which are
created by grouping according to some partition in each
dimension. Let V1 and V2 be views; then V1 ≤ V2 if and
only if for each dimension of V1 with partition P1 and
the analogous dimension of V2 with partition P2, P1 ≤ P2
holds. Finally, let V be a view and Q be a query. We can
implement the query over the considered view if and
only if V ≤ Q. To model the concept of hierarchical
dimension we introduced an element granularity, which
is based on an empty element partition; the latter is
described by means of the attribute name, whose value
is the name of the granularity. Below, an example of a
data cube for an automobile company database is given [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]; it is an instance of the
content dictionary of the data integration concept. We
consider Sales = {SerialNo, Dealer, Date, Price} as the
data cube schema. The considered data cube is
implemented on the basis of materialized views and has
three dimensions, Auto, Dealer and Date, and one
dependent attribute, Value. The set of partitions of the
dimension Date forms a partially ordered set. We use two
granularity elements to represent this set.
&lt;cd name = 'dic'&gt;
...
&lt;cube name = 'example'&gt;
&lt;felement&gt;
&lt;omattr&gt;
schema definition of Sales
&lt;/omattr&gt;
&lt;/felement&gt;
&lt;delement&gt;
&lt;element name = 'Auto'/&gt;
&lt;element name = 'Dealer'/&gt;
&lt;element name = 'Date'/&gt;
&lt;/delement&gt;
&lt;mview&gt;
&lt;view name = 'View1'&gt;
&lt;omattr&gt;
definition of materialized view Sales1
&lt;/omattr&gt;
&lt;/view&gt;
&lt;view name = 'View2'&gt;
&lt;omattr&gt;
definition of materialized view Sales2
&lt;/omattr&gt;
&lt;/view&gt;
&lt;/mview&gt;
&lt;granularity name = 'Date'&gt;
&lt;partition name = 'days'/&gt;
&lt;partition name = 'months'/&gt;
&lt;partition name = 'quarters'/&gt;
&lt;partition name = 'years'/&gt;
&lt;/granularity&gt;
&lt;granularity name = 'Date'&gt;
&lt;partition name = 'days'/&gt;
&lt;partition name = 'weeks'/&gt;
&lt;/granularity&gt;
&lt;/cube&gt;
&lt;/cd&gt;
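The precedence relation on partitions, and the resulting rule for
answering a query from a materialized view, can be sketched in Python
for a single dimension (an illustrative sketch: the edge relation
encodes the two Date lattices above, and answerable is a hypothetical
helper, not part of the formalization):

```python
# Edges point from finer to coarser partitions of the Date dimension,
# following the two granularity elements of the example.
edges = {'days': ['months', 'weeks'],
         'months': ['quarters'],
         'quarters': ['years'],
         'weeks': [],
         'years': []}

def precedes(p1, p2):
    # P1 precedes P2 iff there is a path from node P1 to node P2.
    if p1 == p2:
        return True
    return any(precedes(nxt, p2) for nxt in edges[p1])

def answerable(view, query):
    # A query at granularity `query` can be computed from a view
    # materialized at granularity `view` iff the view precedes it.
    return precedes(view, query)

# Aggregation to years can use the view materialized at quarters,
# but not the one materialized at weeks.
print(answerable('quarters', 'years'))  # True
print(answerable('weeks', 'years'))     # False
```

With several dimensions, the test is applied per dimension, exactly as
in the definition of V1 ≤ V2 above.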
The detailed discussion of the issues connected with
applying the query language to integrated data is
beyond the scope of this paper. Below, the XML
formalization of the data integration concept is presented.
&lt;!-- include dtd for extended OPENMath objects --&gt;
&lt;!ELEMENT cd (dbsch|med|whse|cube)*&gt;
&lt;!ATTLIST cd name CDATA #REQUIRED&gt;
&lt;!ELEMENT dbsch (omattr)+&gt;
&lt;!ATTLIST dbsch name CDATA #REQUIRED&gt;
&lt;!ELEMENT med (msch,wrapper,constraint*)&gt;
&lt;!ELEMENT msch (omattr)&gt;
&lt;!ELEMENT wrapper (oma)&gt;
&lt;!ELEMENT constraint (oma)&gt;
&lt;!ATTLIST med name CDATA #REQUIRED&gt;
&lt;!ELEMENT whse (wsch,extractor,grid)&gt;
&lt;!ELEMENT wsch (omattr)&gt;
&lt;!ELEMENT extractor (oma)&gt;
&lt;!ATTLIST whse name CDATA #REQUIRED&gt;
&lt;!ELEMENT grid (dim+,chunk+)&gt;
&lt;!ELEMENT dim (stripe)+&gt;
&lt;!ELEMENT stripe EMPTY&gt;
&lt;!ELEMENT chunk (avg)+&gt;
&lt;!ELEMENT avg EMPTY&gt;
&lt;!ATTLIST dim name CDATA #REQUIRED&gt;
&lt;!ATTLIST avg value CDATA #IMPLIED
dim CDATA #REQUIRED&gt;
        </p>
        <p>
&lt;!ATTLIST chunk id ID #REQUIRED
qty CDATA #REQUIRED
ref_to_db CDATA #REQUIRED
ref_to_chunk IDREFS #IMPLIED&gt;
&lt;!ATTLIST stripe ref_to_chunk IDREFS #IMPLIED
min_val CDATA #REQUIRED
rec_cnt CDATA #REQUIRED
max_val CDATA #REQUIRED
chunk_cnt CDATA #REQUIRED&gt;
&lt;!ELEMENT cube (felement,delement,mview?,
granularity*)&gt;
&lt;!ELEMENT felement (omattr)&gt;
&lt;!ELEMENT delement (element)+&gt;
&lt;!ELEMENT element EMPTY&gt;
&lt;!ATTLIST element name CDATA #REQUIRED&gt;
&lt;!ELEMENT mview (view)+&gt;
&lt;!ELEMENT view (omattr)&gt;
&lt;!ELEMENT granularity (partition)+&gt;
&lt;!ELEMENT partition EMPTY&gt;
&lt;!ATTLIST view name CDATA #REQUIRED&gt;
&lt;!ATTLIST granularity name CDATA #REQUIRED&gt;
&lt;!ATTLIST partition name CDATA #REQUIRED&gt;</p>
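        <p>An instance of this DTD can be processed with any XML
toolkit; the following Python sketch builds a shortened, hypothetical
variant of the cube instance above and extracts its dimensions and
Date partitions (illustrative only, not part of the formalization):

```python
import xml.etree.ElementTree as ET

# Build a shortened, hypothetical cube instance programmatically.
cube = ET.Element('cube', name='example')
delement = ET.SubElement(cube, 'delement')
for d in ('Auto', 'Dealer', 'Date'):
    ET.SubElement(delement, 'element', name=d)
gran = ET.SubElement(cube, 'granularity', name='Date')
for p in ('days', 'months', 'quarters', 'years'):
    ET.SubElement(gran, 'partition', name=p)

# Read the dimension names and the partitions back out.
dims = [e.get('name') for e in cube.find('delement')]
partitions = {g.get('name'): [p.get('name') for p in g]
              for g in cube.findall('granularity')}
print(dims)                # ['Auto', 'Dealer', 'Date']
print(partitions['Date'])  # ['days', 'months', 'quarters', 'years']
```

A mediator or warehouse component would read such an instance in the
same way before planning queries over the materialized views.</p>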
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4 Conclusion</title>
      <p>The problems of formalizing the data integration
concept were investigated. The outcome of this
investigation is a definition language for integrable data,
based on the formalization of the data integration
concept using the content dictionary mechanism of
OPENMath. Support for the data integration concept is
achieved by creating content dictionaries, each of which
contains formal definitions of the concepts of a specific
area of databases.</p>
      <p>The data integration concept is represented as a set
of XML DTDs based on the OPENMath formalism. By
means of such DTDs, the basic concepts of database
theory, metadata about these concepts, and the data
integration concept were formalized. Within our
approach to data integration, an integrated schema is
represented as an XML document which is an instance
of the XML DTD of the data integration concept. Thus,
modelling the integrated data on the basis of the
OPENMath formalism leads to the creation of the
corresponding XML DTDs.</p>
      <p>By means of the developed content dictionary of
the data integration concept we model the mediator and
the data warehouse concepts. The considered approach
provides virtual and materialized integration of data, as
well as the possibility to support data cubes with
hierarchical dimensions. Within our concept of the data
cube, the operators CUBE and ROLLUP are
implemented. If necessary, new super-aggregate operators can
be defined in integrated data schemas. We use a
computationally complete language to create schemas of
integrated data. Applying the query language to the
integrated data gives rise to a reduction problem;
supporting the query language over such data requires
additional investigation.</p>
      <p>Finally, modern trends in the development of
database systems lead to the application of different
branches of mathematics to data analysis and
management. In the frame of our approach to data
integration, this leads to the use of the corresponding
content dictionaries of OPENMath.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abrial</surname>
            ,
            <given-names>J.-R.</given-names>
          </string-name>
          :
          <article-title>The B-Book: Assigning Programs to Meanings</article-title>
          . Cambridge University Press (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Briukhov</surname>
            ,
            <given-names>D. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vovchenko</surname>
            ,
            <given-names>A. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V. N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhelenkova</surname>
            ,
            <given-names>O. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martynov</surname>
            ,
            <given-names>D. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skvortsov</surname>
            ,
            <given-names>N. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          :
          <article-title>The Middleware Architecture of the Subject Mediators for Problem Solving over a Set of Integrated Heterogeneous Distributed Information Resources in the Hybrid GridInfrastructure of Virtual Observatories</article-title>
          .
          <source>Informatics and Applications</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>2</fpage>
          -
          <lpage>34</lpage>
          , (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Date</surname>
            ,
            <given-names>C. J.:</given-names>
          </string-name>
          <article-title>An Introduction to Database Systems</article-title>
          . Addison Wesley, USA (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Drawar</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>OpenMath: An overview</article-title>
          .
          <source>ACM SIGSAM Bulletin</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ), (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Garcia-Molina</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ullman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Widom</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Database Systems: The Complete Book</article-title>
          . Prentice Hall, USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Gevorgyan</surname>
            ,
            <given-names>G. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manukyan</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          :
          <article-title>Effective Algorithms to Support Grid Files</article-title>
          .
          <source>RAU Bulletin, (2)</source>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>38</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosworth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Layman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pirahesh</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals</article-title>
          .
          <source>In ICDE</source>
          , pp.
          <fpage>152</fpage>
          -
          <lpage>159</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Hindley</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seldin</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          : Introduction to Combinators and λ-Calculus. Cambridge University Press (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          :
          <article-title>Methods and Tools for Equivalent Data Model Mapping Construction</article-title>
          . In EDBT, pp.
          <fpage>92</fpage>
          -
          <lpage>119</lpage>
          , Springer (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          :
          <article-title>Integration of Heterogeneous Semistructured Data Models in the Canonical One</article-title>
          .
          <source>In RCDL</source>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>15</lpage>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          :
          <article-title>Constructing of Mappings of Heterogeneous Information Models into the Canonical Models of Integrated Information Systems</article-title>
          .
          <source>In Proc. of the 12th EastEuropean Conference</source>
          , pp.
          <fpage>106</fpage>
          -
          <lpage>122</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Synthesis of the Canonical Models for Database Integration Preserving Semantics of the Value Inventive Data Models</article-title>
          .
          <source>In Proc. of the 16th East European Conference. LNCS 7503</source>
          , pp.
          <fpage>223</fpage>
          -
          <lpage>239</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hou</surname>
            ,
            <given-names>W. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Grid File for Efficient Data Cube Storage</article-title>
          .
          <source>Computers and their Applications</source>
          , pp.
          <fpage>424</fpage>
          -
          <lpage>429</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Manukyan</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          :
          <article-title>Extensible Data Model</article-title>
          .
          <source>In ADBIS'08</source>
          , pp.
          <fpage>42</fpage>
          -
          <lpage>57</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Manukyan</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gevorgyan</surname>
            ,
            <given-names>G. R.:</given-names>
          </string-name>
          <article-title>An Approach to Information Integration Based on the AMN Formalism</article-title>
          .
          <source>In First Workshop on Programming the Semantic Web</source>
          . Available: https://web.archive.org/web/20121226215425/http ://www.inf.puc-rio.br/~psw12/program.html, pp.
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Manukyan</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          :
          <article-title>Canonical Data Model: Construction Principles</article-title>
          .
          <source>In iiWAS'14</source>
          , pp.
          <fpage>320</fpage>
          -
          <lpage>329</lpage>
          , ACM (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Manukyan</surname>
            ,
            <given-names>M. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gevorgyan</surname>
            ,
            <given-names>G. R.</given-names>
          </string-name>
          :
          <article-title>Canonical Data Model for Data Warehouse</article-title>
          .
          <source>In New Trends in Databases and Information Systems, Communications in Computer and Information Science</source>
          ,
          <volume>637</volume>
          , pp.
          <fpage>72</fpage>
          -
          <lpage>79</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Nievergelt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinterberger</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>The Grid File: An Adaptable, Symmetric, Multikey File Structure</article-title>
          .
          <source>ACM Transactions on Database Systems</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>38</fpage>
          -
          <lpage>71</lpage>
          (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Papadopoulos</surname>
            ,
            <given-names>A. N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manolopoulos</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Theodoridis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsoras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Grid File (and family)</article-title>
          .
          <source>In Encyclopedia of Database Systems</source>
          , pp.
          <fpage>1279</fpage>
          -
          <lpage>1282</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Regnier</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Analysis of Grid File Algorithms</article-title>
          , BIT,
          <volume>25</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>335</fpage>
          -
          <lpage>358</lpage>
          (
          <year>1985</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tim</surname>
            ,
            <given-names>U. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gadia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A Brief Review on Leading Big Data Models</article-title>
          . Data
          <source>Science Journal</source>
          , (
          <volume>13</volume>
          ), pp.
          <fpage>138</fpage>
          -
          <lpage>157</lpage>
          , (
          <year>2014</year>
          ). doi: https://doi.org/10.2481/dsj.14-041
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          :
          <article-title>A Verifiable Mapping of a Multidimensional Array Data Model into an Object Data Model</article-title>
          ,
          <source>Informatics and Applications</source>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>22</fpage>
          -
          <lpage>34</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vovchenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Combined Virtual and Materialized Environment for Integration of Large Heterogeneous Data Collections</article-title>
          .
          <source>In Proc. of the RCDL 2014. CEUR Workshop Proceedings</source>
          ,
          <volume>1297</volume>
          , pp.
          <fpage>339</fpage>
          -
          <lpage>348</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miloslavskaya</surname>
            ,
            <given-names>N. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budzko</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Unification of Graph Data Models for Heterogeneous Security Information Resources' Integration</article-title>
          .
          <source>In Proc. of the Int. Conf. on Open and Big Data OBD</source>
          <year>2015</year>
          (
          <article-title>joint with 3rd</article-title>
          <source>Int. Conf. on Future Internet of Things and Cloud</source>
          , FiCloud
          <year>2015</year>
          ).
          <source>IEEE</source>
          <year>2015</year>
          , pp.
          <fpage>457</fpage>
          -
          <lpage>464</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V. N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalinichenko</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sokolov</surname>
            ,
            <given-names>I. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stupnikov</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          :
          <source>Development of Canonical Information Models for Integrated Information Systems. Informatics and Applications</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>15</fpage>
          -
          <lpage>38</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] MongoDB. https://www.mongodb.org</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>