=Paper= {{Paper |id=Vol-2416/paper51 |storemode=property |title=Using ontology merging for the integration of information systems and the production capacity planning system |pdfUrl=https://ceur-ws.org/Vol-2416/paper51.pdf |volume=Vol-2416 |authors=Nadezhda Yarushkina,Anton Romanov,Aleksey Filippov,Aleksandra Dolganovskaya,Maria Grigoricheva }} ==Using ontology merging for the integration of information systems and the production capacity planning system == https://ceur-ws.org/Vol-2416/paper51.pdf
Using ontology merging for the integration of
information systems and the production capacity
planning system


               N Yarushkina1, A Romanov1, A Filippov1, A Dolganovskaya1 and
               M Grigoricheva1

               1Ulyanovsk State Technical University, Severny Venets street, 32, Ulyanovsk, Russia, 432027


               e-mail: jng@ulstu.ru, romanov73@gmail.com, al.filippov@ulstu.ru,
               gms4295@mail.ru


               Abstract. This article describes a method for integrating the information systems of
               an aircraft factory with the production capacity planning system based on
               ontology merging. An ontological representation is formed for each relational
               database (RDB) of the integrated information systems by analyzing the structure of
               the relational database of the information system (IS). The integrating data model is
               formed by merging the ontological representations and serves as a mechanism for
               semantic integration of data sources.



1. Introduction
As part of the work on automating production capacity planning at the aircraft factory,
it is necessary to take into account the heterogeneous information systems of the
aircraft factory that automate various business processes [1]. Data consistency can be achieved
by integrating the production capacity planning system with the existing information systems of the
aircraft factory. Data integration means combining data from different sources and
providing it to users in a unified way. The main difficulties of data integration are:
  (i) Heterogeneity of data models.
 (ii) Independence of the information systems of the aircraft factory from each other.
(iii) Data can be located in different segments of the local network of the aircraft factory
      and/or on the Internet.
 (iv) Different data formats.
  (v) Different value representations.
 (vi) Loss of data relevance by one of the data sources.
   Thus, the organization of the information interaction between the production capacity
planning system and the existing information systems of the aircraft factory raises the need
to solve the following methodological problems [2, 3, 4, 5, 6, 7, 8, 9]:




             V International Conference on "Information Technology and Nanotechnology" (ITNT-2019)
Data Science
N Yarushkina, A Romanov, A Filippov, A Dolganovskaya and M Grigoricheva




  (i) Creating an integrating data model. The integrating data model is the basis of a single user
      interface in the integration system.
 (ii) Development of methods for building ontological representations for specific models of
      various data sources.
(iii) Development of methods for building the integrating data model for specific models of various
      data sources.
 (iv) Solving the problem of data source heterogeneity.
  (v) Development of mechanisms for semantic integration of data sources.

2. Ontological representation of data source
The proposed information interaction algorithm consists of the following steps:
  (i) Extracting metadata from the RDB schema for automatic generation of ontologies for the
      source and target RDBs.
 (ii) Merging the ontologies to configure the correspondence between objects, attributes, and
      relationships of the integrated ISs, which produces the metaontology.
(iii) Using the metaontology to perform the interaction procedure on a schedule or on an event.
   The metaontology is a set of settings that contains the correspondences between the data models
(tables and columns) of the integrated ISs.
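   The metaontology described above can be pictured as a plain set of table and column
correspondences. The following is a minimal sketch under assumptions, not the actual data
structure of the described system: the class names ColumnCorrespondence and
TableCorrespondence, the table names, and the target column names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ColumnCorrespondence:
    """One source column matched to one target column (hypothetical structure)."""
    source_column: str
    target_column: str


@dataclass
class TableCorrespondence:
    """Correspondence between a source table and a target table."""
    source_table: str
    target_table: str
    columns: List[ColumnCorrespondence] = field(default_factory=list)

    def target_for(self, source_column: str) -> Optional[str]:
        """Return the target column for a source column, or None if unmapped."""
        for item in self.columns:
            if item.source_column == source_column:
                return item.target_column
        return None


# Hypothetical settings for two integrated ISs.
metaontology = [
    TableCorrespondence(
        source_table="equipment_and_tools",
        target_table="planning_tools",
        columns=[
            ColumnCorrespondence("t2_ob", "name"),
            ColumnCorrespondence("t2_ng", "group_id"),
        ],
    )
]
```

Columns without an entry are simply left out of the exchange in this sketch; a real
configuration might instead report them to the expert.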
   An ontology is a knowledge representation model of a specific problem area [10]. An ontology
contains a set of classes, individuals, properties, and relations between them. An ontology
is based on a dictionary of terms that reflects the concepts of the problem area. The
dictionary also contains a set of rules (axioms). Terms can be combined to construct a set of
statements about the state of the problem area based on the set of axioms.
   At present, many researchers use the ontological approach to extract metadata
from the RDB schema:
  (i) Relational.OWL [11] currently supports only the MySQL and DB2 database management
      systems (DBMS). The generated ontology contains the classes Database, Table, Column, and
      PrimaryKey, and the properties has, hasTable, hasColumn, isIdentifiedBy, references, scale,
      and length. The main disadvantage of the ontology generated by Relational.OWL is its
      limited coverage of the domain: it does not consider, for instance, data types, foreign
      keys, and constraints.
 (ii) OWL-RDBO [12, 13] currently supports only the MySQL, PostgreSQL and DB2
      DBMSs. The generated ontology contains the classes DatabaseName, RelationList, Relation,
      AttributeList, and Attribute, and the properties hasRelations, hasType, referenceAttribute,
      and referenceRelation. The main disadvantage of the ontology generated by OWL-RDBO is the
      presence of concepts external to the domain, such as RelationList to group a set of
      Relation instances and AttributeList to group a set of attributes.
(iii) Other approaches, such as [14, 15], extract real-world relations from the RDB structure
      and are unable to reconstruct the original schema of the RDB.
   The relational data model can be represented as the following expression:

$$RDM = \langle E, H, R \rangle, \tag{1}$$

where $E = \{E_1, E_2, \ldots, E_p\}$ is a set of RDB entities (tables);
$E_i = (name, Row, Col)$ is the $i$-th RDB entity that contains the name, the set of rows $Row$ and
the set of columns $Col$;
$Col_j = (name, type, constraints)$ is the $j$-th column of the $i$-th RDB entity that contains
properties: the name, the type and the set of constraints;
$H = \{H_1, H_2, \ldots, H_q\}$ is a hierarchy of RDB entities in the case of using the table inheritance
function:
$$H_j = E_i \, D(x) \, E_k, \tag{2}$$
where $E_i$ and $E_k$ are RDB entities;
$D(x)$ is a 'parent-child' relation between $E_i$ and $E_k$;
$R = \{R_1, R_2, \ldots, R_r\}$ is a set of RDB relations:
$$R_l = E_i \; \substack{F(x) \\ G(x)} \; E_k, \tag{3}$$
where $F(x)$ is an RDB relation between $E_i$ and $E_k$;
$G(x)$ is an RDB relation between $E_k$ and $E_i$.
   Functions $F(x)$ and $G(x)$ can take values: $U$ is a single relation and $N$ is multiple relations.
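   To make the structure of the relational data model (eq. 1) concrete, the tuple can be written
down as a few record types. This is only an illustrative sketch: the class names, the
representation of the hierarchy as (parent, child) pairs, and the two-table example schema are
assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Column:
    """Col_j = (name, type, constraints)."""
    name: str
    type: str
    constraints: List[str] = field(default_factory=list)


@dataclass
class Entity:
    """E_i = (name, Row, Col)."""
    name: str
    columns: List[Column] = field(default_factory=list)
    rows: List[tuple] = field(default_factory=list)


@dataclass
class Relation:
    """R_l between E_i and E_k; f and g hold the values of F(x) and G(x)."""
    source: str
    target: str
    f: str
    g: str

    def __post_init__(self) -> None:
        for value in (self.f, self.g):
            if value not in ("U", "N"):  # U: single relation, N: multiple relations
                raise ValueError(f"expected 'U' or 'N', got {value!r}")


@dataclass
class RelationalDataModel:
    """RDM = <E, H, R>."""
    entities: List[Entity]
    hierarchy: List[Tuple[str, str]]  # H: (parent entity, child entity) pairs
    relations: List[Relation]


# Hypothetical two-table schema: one workshop holds many tools.
workshop = Entity("workshop", [Column("id", "NUMBER", ["primary key"])])
tool = Entity("tool", [Column("id", "NUMBER"), Column("workshop_id", "NUMBER")])
rdm = RelationalDataModel(
    entities=[workshop, tool],
    hierarchy=[],
    relations=[Relation("workshop", "tool", f="N", g="U")],
)
```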
   The ontological representation of the RDB data model is:
$$O = \langle C, P, L, R \rangle, \tag{4}$$
where $C = \{C_1, C_2, \ldots, C_n\}$ is a set of data model ontology classes;
$P = \{P_1, P_2, \ldots, P_m\}$ is a set of properties of data model ontology classes;
$L = \{L_1, L_2, \ldots, L_o\}$ is a set of data model ontology constraints;
$R$ is a set of data model ontology relations:
$$R = \{R^C, R^P, R^L\}, \tag{5}$$
where $R^C$ is a set of relations defining the hierarchy of data model ontology classes;
$R^P$ is a set of relations defining the 'class-property' data model ontology ties;
$R^L$ is a set of relations defining the 'property-constraint' data model ontology ties.
   The following function is used to map the RDB structure (eq. 1) to the ontological
representation (eq. 4):
$$F(RDM, O) : \{E^{RDM}, H^{RDM}, R^{RDM}\} \to \{C^{O}, P^{O}, L^{O}, R^{O}\}, \tag{6}$$
where $\{E^{RDM}, H^{RDM}, R^{RDM}\}$ is a set of RDB entities and relations between them (eq. 1);
$\{C^{O}, P^{O}, L^{O}, R^{O}\}$ is a set of ontology entities (eq. 4).
   The process of mapping the RDB structure into an ontological representation consists of several
steps:
  (i) Formation of ontological representation classes.
      A set of ontological representation classes $C$ is formed based on the set of RDB entities:
      $E_i \to C_i$. The number of classes of the ontological representation must be equal to the
      number of RDB entities.
 (ii) Formation of properties of ontological representation classes.
      A set of properties $P$ of the $i$-th ontological representation class $C_i$ is formed based on the
      set of columns $Col$ of the $i$-th RDB entity $E_i$: $Col_j \to P_j$. The number of properties of the
      $i$-th ontological representation class $C_i$ must be equal to the number of columns of the $i$-th
      RDB entity $E_i$. The name of the $j$-th property $P_j$ is the name of the $j$-th column $Col_j$ of
      the RDB entity.
(iii) Formation of ontological representation constraints.
      A set of constraints $L$ of the properties of the $i$-th ontological representation class $C_i$ is
      formed based on the set of columns $Col$ of the $i$-th RDB entity $E_i$: $Col_k \to \hat{L}$. The number
      of constraints of the $i$-th ontological representation class $C_i$ must be equal to the number of
      constraints of the $i$-th RDB entity $E_i$. However, this approach is limited by the difficulty
      of mapping constraints that are implemented as triggers or stored procedures.


 (iv) Formation of the hierarchy of ontological representation classes.
      If table inheritance is used in the RDB, it is necessary to form a set of ontology
      relationships $R^C$ between all the child and parent classes corresponding to the hierarchy
      of RDB entities: $H \to R^C$. The domain of the $j$-th ontological representation relationship
      $R^C_j$ is indicated by the reference to the parent class $C_{parent}$. The range of the $j$-th
      ontological representation relationship $R^C_j$ is indicated by the reference to the child
      class (or set of child classes) $C_{child}$.
  (v) Formation of relations between classes and properties of classes of the ontological
      representation.
      A set of ontological representation relationships $R^P$ is formed based on the set of columns
      $Col$ of the $i$-th RDB entity $E_i$ and the set of RDB relations $R$. Two types of relationships
      are formed for each $j$-th ontological representation property $P_j$:
       (a) The 'class-property' relationship. The domain of the ontological representation
           relationship is indicated by the reference to the $i$-th class $C_i$ to which the $j$-th
           property belongs, and the range by the reference to the $j$-th property $P_j$.
       (b) The 'property-data type class' relationship. The domain of the $k$-th ontological
           representation relationship is indicated by the reference to the $j$-th property $P_j$. The
           range is indicated by the reference to the $l$-th class $C_l$ corresponding to the $l$-th RDB
           entity $E_l$, or by the reference to the $m$-th ontology class $C_m$ corresponding to the data
           type of the $j$-th RDB column $Col_j$.
 (vi) Formation of relations between properties of classes and constraints of properties of classes
      of the ontological representation.
      A set of relations $R^L$ of the ontological representation is formed based on the set of columns
      $Col$ of the $i$-th RDB entity. The domain of the $j$-th ontological representation relationship
      $R^L_j$ is indicated by the reference to the $k$-th property $P_k$. The range of the $j$-th ontological
      representation relationship $R^L_j$ is indicated by the reference to the $k$-th constraint:
      $Col \to R^L$.
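   The six mapping steps above can be sketched as one function. This is an illustration under
assumptions rather than the implementation used in the described system: the input format (a
dictionary of tables with column tuples), the function name map_rdb_to_ontology, and the example
tables are hypothetical. Each data type is added as a class so that it can serve as the range of
a 'property-data type class' relationship.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (column name, data type, constraints, referenced table or None)
ColumnSpec = Tuple[str, str, List[object], Optional[str]]


def map_rdb_to_ontology(
    tables: Dict[str, List[ColumnSpec]],
    hierarchy: Iterable[Tuple[str, str]] = (),
) -> Dict[str, list]:
    classes: List[str] = []
    properties: List[str] = []
    constraints: List[object] = []
    rp: List[tuple] = []
    rl: List[tuple] = []

    def add_once(items: list, value: object) -> None:
        if value not in items:
            items.append(value)

    for table in tables:  # step (i): one class per RDB entity
        add_once(classes, table)

    for table, columns in tables.items():
        for name, data_type, column_constraints, referenced in columns:
            add_once(properties, name)  # step (ii); equal names are merged here
            # step (v): the range is the referenced entity or the data type class
            range_class = referenced if referenced is not None else data_type
            add_once(classes, range_class)
            rp.append((table, name, range_class))
            for constraint in column_constraints:  # steps (iii) and (vi)
                add_once(constraints, constraint)
                rl.append((table, name, constraint))

    rc = list(hierarchy)  # step (iv): (parent class, child class) pairs
    return {"C": classes, "P": properties, "L": constraints,
            "RC": rc, "RP": rp, "RL": rl}


# Hypothetical schema with one foreign key from tool to workshop.
tables = {
    "workshop": [("id", "NUMBER", [("precision", 6)], None)],
    "tool": [
        ("name", "CHAR", [("length", 200)], None),
        ("workshop_id", "NUMBER", ["nullable"], "workshop"),
    ],
}
ontology = map_rdb_to_ontology(tables)
```

Constraints implemented as triggers or stored procedures are outside the scope of this sketch,
in line with the limitation noted in step (iii).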

3. Integrating data model
It is necessary to form an integrating data model based on the ontological representations
obtained by mapping the RDB structure of each of the integrated information systems into
an ontological representation. The definition of an ontological system is used as the formal
representation of the integrating data model:
$$\Sigma^{O} = \langle O^{META}, O^{IS}, M \rangle, \tag{7}$$

where $O^{META}$ is the integrating data model ontology (metaontology);
$O^{IS} = \{O^{IS}_1, O^{IS}_2, \ldots, O^{IS}_g\}$ is a set of ontological representations of the information systems that
must be integrated;
$M$ is the reasoner model.
   The following steps are necessary to form an integrating data model based on the set of
ontological representations of the information systems that must be integrated:
  (i) Formation of the universal concept dictionary for the current domain.
      The process of forming the integrating data model $O^{META}$ relies on
      common terminology. The ontological representations $O^{IS}$ of all information systems that
      must be integrated should be built from a single concept dictionary. The concept dictionary is
      formed by the expert based on the analysis of the obtained ontological representations.
 (ii) Formation of the integrating data model $O^{META}$.
      At this step, the set of top-level classes $C^{META}$ is added to the integrating data model
      $O^{META}$. The set of top-level classes $C^{META}$ describes the systems that must be integrated and
      is used as the basis for ontology merging.


(iii) Formation of the class hierarchy of the integrating data model $O^{META}$.
      At this step, the integrating data model establishes a correspondence between the class
      hierarchies $C^{O^{IS}_i}$ of the ontological representations $O^{IS}$ of the information systems that
      must be integrated.
 (iv) Formation of the class properties of the integrating data model $O^{META}$.
      At this step, the integrating data model establishes a correspondence between the properties
      $P^{O^{IS}_i}$ of the ontological representations $O^{IS}$ of the information systems that must be
      integrated. The expert decides which class properties of the ontological representations
      $O^{IS}$ should be included in the integrating data model $O^{META}$.
  (v) Formation of axioms of classes and properties, and checking the integrating data model
      $O^{META}$ for consistency.
      At this step, the constraints $L^{O^{IS}}$ are applied to the properties $P^{O^{IS}}$ and classes $C^{O^{IS}}$ of
      the integrating data model $O^{META}$ based on the constraints present in the ontological
      representations $O^{IS}$. After that, the resulting integrating data model $O^{META}$ should
      be checked for internal consistency using the reasoner $M$. However, the development of
      methods for checking the conditions of constraints is required, since the existing reasoners
      do not support working with such objects.
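   The role of the concept dictionary in steps (i)-(iii) can be illustrated with a small sketch
that groups the classes of several ontological representations by shared concept. It is an
assumption-laden illustration: the function name group_by_concept, the dictionary format, and
the system and class names are hypothetical, and the sketch covers neither the formation of
axioms nor the consistency check by a reasoner.

```python
from typing import Dict, Iterable, List, Tuple


def group_by_concept(
    representations: Dict[str, Iterable[str]],
    concept_dictionary: Dict[Tuple[str, str], str],
) -> Tuple[Dict[str, Dict[str, str]], List[Tuple[str, str]]]:
    """Group classes of several representations under expert-defined concepts.

    representations: {system name: class names of its ontological representation}
    concept_dictionary: {(system name, class name): concept}, prepared by an expert
    """
    correspondences: Dict[str, Dict[str, str]] = {}
    unmapped: List[Tuple[str, str]] = []
    for system, classes in representations.items():
        for class_name in classes:
            concept = concept_dictionary.get((system, class_name))
            if concept is None:
                unmapped.append((system, class_name))  # left for the expert to review
            else:
                correspondences.setdefault(concept, {})[system] = class_name
    return correspondences, unmapped


# Hypothetical systems and classes.
representations = {
    "erp": ["EQUIPMENT_AND_TOOLS", "AUDIT_LOG"],
    "planning": ["tool"],
}
concept_dictionary = {
    ("erp", "EQUIPMENT_AND_TOOLS"): "Tool",
    ("planning", "tool"): "Tool",
}
correspondences, unmapped = group_by_concept(representations, concept_dictionary)
```

Classes without a dictionary entry are returned separately, which mirrors the point that the
expert decides what is included in the integrating data model.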
   The proposed method allows configuring the correspondence between the tables and fields
of two RDBs. The main problem is the need for ontology merging. However, this problem can
be solved by using specialized tools that automate the ontology merging process. Such
tools also allow separating the developer and domain expert roles. The main advantage of
the proposed method is the ability to dynamically generate the SQL queries needed to select
data from and insert data into an RDB based on the metaontology.
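   Dynamic query generation of this kind can be sketched as follows. The sketch assumes that
the metaontology has already been reduced to a plain source-to-target column dictionary; the
function names, table names, and column names are hypothetical. It uses the sqlite3 module from
the Python standard library only to make the example runnable, and the `?` placeholder style is
specific to that driver.

```python
import re
import sqlite3

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check(name):
    """Reject anything that is not a plain SQL identifier."""
    if not IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_select(table, column_map):
    columns = ", ".join(check(c) for c in column_map)
    return f"SELECT {columns} FROM {check(table)}"


def build_insert(table, column_map):
    targets = [check(c) for c in column_map.values()]
    marks = ", ".join("?" for _ in targets)
    return f"INSERT INTO {check(table)} ({', '.join(targets)}) VALUES ({marks})"


def transfer(source, target, source_table, target_table, column_map):
    """Copy the mapped columns from the source table to the target table."""
    rows = source.execute(build_select(source_table, column_map)).fetchall()
    target.executemany(build_insert(target_table, column_map), rows)
    return len(rows)


# Hypothetical correspondence: source column -> target column.
column_map = {"t2_ob": "name", "t2_ng": "group_id"}

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE equipment_and_tools (t2_ob TEXT, t2_ng INTEGER)")
src.execute("INSERT INTO equipment_and_tools VALUES ('drill', 7)")

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE planning_tools (name TEXT, group_id INTEGER)")

copied = transfer(src, dst, "equipment_and_tools", "planning_tools", column_map)
```

Identifiers cannot be passed as query parameters, so the sketch validates them before
interpolating them into the statement, while row values always go through placeholders.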

4. Example of creating the ontological representation of a data source
Consider the following example of forming an ontological representation.
   Table 1 shows the structure of the "Equipment and Tools" table of the aircraft factory IS.
   The ontological representation of the "Equipment and Tools" entity (tab. 1) can be
represented as:
O = ⟨
   C = { Equipment and Tools (E&T), CHAR, NUMBER, BLOB, DATE },
   P = { t2_ob, t2_ng, t2_nn, t2_r1, t2_r2, t2_r3, t2_p1, t2_z1, t2_p2, t2_z2,
      t2_p3, t2_z3, t2_gm, up_dt, up_us, t2_dc, t2_vid, t2_doc,
      t2_prim, t2_yyyy },
   L = { nullable, ⟨length, 2⟩, ⟨length, 4⟩, ⟨length, 8⟩, ⟨length, 32⟩,
      ⟨length, 100⟩, ⟨length, 200⟩, ⟨length, 255⟩, ⟨precision, 5⟩,
      ⟨precision, 6⟩ },
   RP = { ⟨E&T, t2_ob, CHAR⟩, ⟨E&T, t2_ng, NUMBER⟩,
      ⟨E&T, t2_nn, NUMBER⟩, ⟨E&T, t2_r1, CHAR⟩,
      ⟨E&T, t2_r2, CHAR⟩, ⟨E&T, t2_r3, CHAR⟩,
      ⟨E&T, t2_p1, CHAR⟩, ⟨E&T, t2_z1, CHAR⟩,
      ⟨E&T, t2_p2, CHAR⟩, ⟨E&T, t2_z2, CHAR⟩,
      ⟨E&T, t2_p3, CHAR⟩, ⟨E&T, t2_z3, CHAR⟩,
      ⟨E&T, t2_gm, BLOB⟩, ⟨E&T, up_dt, DATE⟩,
      ⟨E&T, up_us, CHAR⟩, ⟨E&T, t2_dc, BLOB⟩,
      ⟨E&T, t2_vid, CHAR⟩, ⟨E&T, t2_doc, CHAR⟩,
      ⟨E&T, t2_prim, CHAR⟩, ⟨E&T, t2_yyyy, CHAR⟩ },
   RL = { ⟨E&T, t2_ob, ⟨length, 200⟩⟩, ⟨E&T, t2_ng, ⟨precision, 5⟩⟩,
      ⟨E&T, t2_nn, ⟨precision, 6⟩⟩, ⟨E&T, t2_p1, ⟨length, 2⟩⟩,
      ⟨E&T, t2_p1, nullable⟩, ⟨E&T, t2_z1, ⟨length, 8⟩⟩,
      ⟨E&T, t2_z1, nullable⟩, ⟨E&T, t2_p2, ⟨length, 2⟩⟩,
      ⟨E&T, t2_p2, nullable⟩, ⟨E&T, t2_z2, ⟨length, 8⟩⟩,
      ⟨E&T, t2_z2, nullable⟩, ⟨E&T, t2_p3, ⟨length, 2⟩⟩,
      ⟨E&T, t2_p3, nullable⟩, ⟨E&T, t2_z3, ⟨length, 8⟩⟩,
      ⟨E&T, t2_z3, nullable⟩, ⟨E&T, up_us, ⟨length, 32⟩⟩,
      ⟨E&T, t2_vid, ⟨length, 4⟩⟩, ⟨E&T, t2_doc, ⟨length, 100⟩⟩,
      ⟨E&T, t2_prim, ⟨length, 100⟩⟩, ⟨E&T, t2_prim, nullable⟩,
      ⟨E&T, t2_yyyy, ⟨length, 4⟩⟩ }
⟩.



                 Table 1. The "Equipment and Tools" table of the aircraft factory IS.
   Column     Data type             Description
   t2_ob      CHAR(200)             Name
   t2_ng      NUMBER(5)             Group
   t2_nn      NUMBER(6)             Position
   t2_r1      CHAR                  Type #1: 0 - equipment; 1 - tool; 2 - material; 6 - special tool
   t2_r2      CHAR                  Type #2: 0 - standard; 1 - special
   t2_r3      CHAR                  Type #3: 20 - no; 21 - design; 30 - model; 31 - design and model
   t2_p1      CHAR(2), nullable     Parameter #1
   t2_z1      CHAR(8), nullable     Parameter #1 value
   t2_p2      CHAR(2), nullable     Parameter #2
   t2_z2      CHAR(8), nullable     Parameter #2 value
   t2_p3      CHAR(2), nullable     Parameter #3
   t2_z3      CHAR(8), nullable     Parameter #3 value
   t2_gm      BLOB                  Geometric model
   up_dt      DATE                  Date of last update
   up_us      CHAR(32)              User
   t2_dc      BLOB                  Attachment
   t2_vid     CHAR(4)               Tooling type
   t2_doc     CHAR(100)             Document name
   t2_prim    CHAR(100), nullable   Notes
   t2_yyyy    CHAR(4)               Production date
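   The way the entries of the sets C and L follow from the "Data type" column of Table 1 can be
sketched with a small parser. The correspondence CHAR(n) to a length constraint and NUMBER(n) to
a precision constraint is read off the example itself; the function name parse_column and the
regular expression are assumptions made for this illustration.

```python
import re

_TYPE = re.compile(r"([A-Z]+)(?:\((\d+)\))?")
_FACET = {"CHAR": "length", "NUMBER": "precision"}


def parse_column(data_type, nullable=False):
    """Split a declared type into its data type class and its constraints."""
    match = _TYPE.fullmatch(data_type)
    if match is None:
        raise ValueError(f"unsupported data type: {data_type!r}")
    base, size = match.groups()
    constraints = []
    if size is not None and base in _FACET:
        constraints.append((_FACET[base], int(size)))
    if nullable:
        constraints.append("nullable")
    return base, constraints


name_column = parse_column("CHAR(200)")
group_column = parse_column("NUMBER(5)")
parameter_column = parse_column("CHAR(2)", nullable=True)
model_column = parse_column("BLOB")
```

Columns declared without a size, such as the one-character type codes, contribute a data type
class but no length constraint in this sketch.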



     As this example shows, the resulting ontological representation O contains the following
sets of objects:
  (i) The set of classes C contains the "Equipment and Tools" table and the data types CHAR,
      NUMBER, BLOB, and DATE. The OWL representation of the ontology O uses the Class signature
      to represent the table.
 (ii) The set of properties P contains all columns of the "Equipment and Tools" table. The
      OWL representation of the ontology O uses built-in data types to represent RDB data types
      (xsd:string, xsd:double, xsd:dateTime, xsd:base64Binary) and the Class signature to
      represent RDB relationships.
(iii) The set of constraints L contains all variants of restrictions on the columns of the
      "Equipment and Tools" table. This set is not translated to the OWL representation directly.
 (iv) The set of relations between classes and properties RP contains the ties between the table
      and the columns that belong to it. The OWL representation of the ontology O uses the
      ObjectProperties and DataProperties signatures to represent the set of relations RP.
      ObjectProperties signatures are used to represent foreign keys. DataProperties signatures
      are used to represent columns that contain a value.
  (v) The set of relations between properties and constraints RL contains the ties between a
      column and its constraints. OWL datatype restrictions are used to specify constraints.
      For example:
      DatatypeRestriction(
          xsd:integer
          xsd:minInclusive "5"^^xsd:integer
          xsd:maxExclusive "10"^^xsd:integer
      ).
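   Datatype restrictions of this form can be generated from the constraint pairs mechanically.
The sketch below only assembles strings in OWL 2 functional-style syntax; the function name
datatype_restriction is hypothetical, and rendering a CHAR length as an xsd:maxLength facet on
xsd:string is an assumption of this illustration, not something the described system is stated
to do. Whether a particular reasoner accepts the resulting expressions is not verified here.

```python
def datatype_restriction(datatype, facets):
    """Render a DatatypeRestriction with integer-valued facets.

    datatype: prefixed datatype name, for example "xsd:integer"
    facets: list of (prefixed facet name, integer value) pairs
    """
    if not facets:
        raise ValueError("at least one facet is required")
    body = " ".join(f'{facet} "{int(value)}"^^xsd:integer' for facet, value in facets)
    return f"DatatypeRestriction( {datatype} {body} )"


# The restriction from the example in the text.
integer_range = datatype_restriction(
    "xsd:integer", [("xsd:minInclusive", 5), ("xsd:maxExclusive", 10)]
)

# Assumed rendering of a length constraint such as (length, 200).
name_length = datatype_restriction("xsd:string", [("xsd:maxLength", 200)])
```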
   Thus, the ontological approach is commonly used to solve the methodological problem of
building an integrating data model of information systems.

5. Conclusion
This article presents an implementation of the method for integrating the information systems
of the aircraft factory with the production capacity planning system. The principles of
ontological engineering allow mapping the database structure of each information system that must
be integrated into an ontological representation. In the proposed methodology, the integrating
data model is formed based on the ontological representations obtained for each information
system that must be integrated.
   The proposed method allows organizing information interaction without the participation of
developers, in contrast to the traditional consolidation approach based on direct
data exchange. The only requirement of the proposed method is the presence of the metaontology.
The current disadvantages of the implementation of the proposed method are:
 (i) The need to implement data type casting algorithms for each DBMS in case of
     type mismatch.
(ii) The need to adapt the implementation to the SQL dialect of each
     DBMS involved in the exchange process. An arbitrary DBMS cannot be supported by this
     implementation.
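   The first disadvantage can be illustrated with a minimal casting table. This is a sketch of
the general idea only: the type names are taken from the example in Section 4, while the
registry CASTS, the function cast_value, and the individual conversion rules are assumptions
and would have to be defined separately for each pair of DBMSs.

```python
from datetime import date

# Hypothetical conversion rules keyed by (source type, target type).
CASTS = {
    ("CHAR", "NUMBER"): lambda value: int(value.strip()),
    ("NUMBER", "CHAR"): str,
    ("DATE", "CHAR"): lambda value: value.isoformat(),
}


def cast_value(value, source_type, target_type):
    """Convert a value when the source and target column types differ."""
    if value is None or source_type == target_type:
        return value
    try:
        convert = CASTS[(source_type, target_type)]
    except KeyError:
        raise TypeError(f"no cast from {source_type} to {target_type}") from None
    return convert(value)


position = cast_value(" 42 ", "CHAR", "NUMBER")
label = cast_value(7, "NUMBER", "CHAR")
updated = cast_value(date(2019, 5, 1), "DATE", "CHAR")
```

An unknown pair of types raises an error instead of passing the value through, so a missing
rule is noticed during the exchange rather than after it.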

6. References
[1] Yarushkina N, Romanov A, Filippov A, Guskov G, Grigoricheva M and Dolganovskaya A 2019
The building of the production capacity planning system for the aircraft factory Research Papers
Collection Open Semantic Technologies for Intelligent Systems 3 123-128
[2] Clark T, Barn B S and Oussena S 2012 A method for enterprise architecture alignment Practice-
Driven Research on Enterprise Transformation 48-76



[3] Rouhani D B 2015 A systematic literature review on Enterprise Architecture Implementation
Methodologies Information and Software Technology 1-20
[4] Medini K and Bourey J P 2012 SCOR-based enterprise architecture methodology Int. J. Comput.
Integrat. Manuf.
[5] Poduval A 2011 Do more with SOA Integration: Best of Packt
[6] Caselli V, Binildas C and Barai M 2008 The Mantra of SOA. Service Oriented Architecture with Java
(Birmingham. UK)
[7] Berna-Martinez V J, Zamora C, Ivette C, Perez M, Paz F, Paz L and Ramon C 2018 Method for the
Integration of Applications Based on Enterprise Service Bus Technologies
[8] Evsutin O O, Kokurina A S and Meshcheryakov R V 2019 A review of the methods of embedding
information in digital objects for security in the Internet of things Computer Optics 43(1) 137-154
DOI: 10.18287/2412-6179-2019-43-1-137-154
[9] Rycarev I A, Kirsh D V and Kupriyanov A V 2018 Clustering of media content from social
networks using bigdata technology Computer Optics 42(5) 921-927
DOI: 10.18287/2412-6179-2018-42-5-921-927
[10] Gruber T 2019 Ontology URL: http://tomgruber.org/writing/ontology-in-encyclopedia-of-dbs.pdf
[11] de Laborda C P and Conrad S 2005 Relational.OWL: a data and schema representation format
based on OWL Proceedings of the 2nd Asia-Pacific conference on Conceptual modeling 43 89-96
[12] Trinh Q, Barker K and Alhajj R 2006 Rdb2ont: A tool for generating owl ontologies from
relational database systems Telecommunications International Conference on Internet and Web
Applications and Services/Advanced 170
[13] Trinh Q, Barker K and Alhajj R 2007 Semantic interoperability between relational database
systems Database Engineering and Applications Symposium 208-215
[14] Barrett T, Jones D, Yuan J, Sawaya J, Uschold M, Adams T and Folger D 2002 Rdf representation
of metadata for semantic integration of corporate information resources International Workshop Real
World and Semantic Web Applications
[15] Bizer C 2003 D2R MAP – A Database to RDF Mapping Language Proc. of the 12th International
World Wide Web Conference – Posters

Acknowledgments
The study was supported by:
  • the Ministry of Education and Science of the Russian Federation in the framework of
    the project No. 2.1182.2017/4.6. Development of methods and means for automation of
    production and technological preparation of aggregate-assembly aircraft production in the
    conditions of a multi-product production program;
  • the Russian Foundation for Basic Research (Projects No. 18-47-732016, 18-47-730022,
    17-07-00973, and 18-47-730019).



