=Paper= {{Paper |id=Vol-3027/paper115 |storemode=property |title=Data Conversion Model Using the Principles of Geometric Model Structure Proximity Comparison |pdfUrl=https://ceur-ws.org/Vol-3027/paper115.pdf |volume=Vol-3027 |authors=Alexey Boytyakov,Alexandr Filinskikh }} ==Data Conversion Model Using the Principles of Geometric Model Structure Proximity Comparison== https://ceur-ws.org/Vol-3027/paper115.pdf
Data Conversion Model Using the Principles of Geometric Model
Structure Proximity Comparison
Alexey Boytyakov 1 and Alexandr Filinskikh 1
1
    Nizhny Novgorod State Technical University n.a. R.E. Alekseev, 24 Minin Str., Nizhny Novgorod, 603950, Russia

                 Abstract
                 Possible data conversion losses and the integration of heterogeneous automated systems
                 in enterprises are investigated. A model of data conversion for the interaction of
                 heterogeneous automated systems is proposed, illustrated by comparing the geometric model
                 structures of such systems. The model can be used for loss estimation by representing
                 geometric models as data structures and applying conversion metrics. The main problem at
                 the current stage of information support for lifecycle processes is the lack of integration
                 of multi-vendor automation systems in enterprises. Losses at one stage of the lifecycle can
                 lead to technical and economic difficulties at other stages, and problems can also arise
                 when integrating automation systems and converting data between enterprises. There is a
                 need to develop an advanced parameter conversion model and to compare the proximity of
                 geometric model (GM) structures between automation systems. The efficiency of data
                 conversion between environments that use different formats must be evaluated using metrics.

                 Keywords
                 Geometric model, data format, graph, graph structure, parameter classification, CAD system,
                 PDM system

1. Introduction
    Today, the principle of a unified information space (UIS) [1] is one of the priorities in the
development of Russian industrial enterprises. It is common for the life cycle stages not to be fully
automated. In this case, software products from different foreign and domestic vendors are used, and
their formats may be incompatible. At different stages of the product lifecycle (LCL), including design
and development, the use of different formats can lead to additional time and financial costs for data
conversion. The current trend at Russian enterprises is towards import substitution, that is, replacing
foreign vendors with domestic ones.
    The main task of product lifecycle information support technologies (CALS or IPI) is to create a
UIS for all participants of the product lifecycle, which ensures information interaction between
CALS components. A distinctive feature is the extensive use of a digital information model (DIM) of the
product and its components at most stages of the lifecycle. The basis of the DIM is a combination of
a geometric model (GM) and attributive information. The systems that work with the DIM include
computer-aided design (CAD) systems and product data management (PDM) systems [2]. International
standards (STEP, IGES) have been developed for universal interaction of design and manufacturing
automation systems.
    The main problem at the current stage of information support for lifecycle processes (Figure 1) is
the lack of integration of multi-vendor automation systems in enterprises [3]. For example, losses at
one stage of the lifecycle can lead to technical and economic difficulties at other stages.

GraphiCon 2021: 31st International Conference on Computer Graphics and Vision, September 27-30, 2021, Nizhny Novgorod, Russia
EMAIL: alexey.boytyakov@gmail.com (Alexey Boytyakov); alexfil@yandex.ru (Alexandr Filinskikh)
ORCID: 0000-0002-6477-9303 (Alexey Boytyakov); 0000-0003-3826-6771 (Alexandr Filinskikh)
              ©️ 2021 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)
Problems can also arise in integration between enterprises: further design of products by
other enterprises in other formats can lead to significant time and resource costs, or even to
redevelopment of the product [4]. Neutral formats such as STEP and IGES have been developed for
converting data between the formats of different vendors [5]. However, these formats cannot
transfer all geometric model (GM) parameters without losses [6]. There is therefore a need to develop an
advanced parameter conversion model and to compare the proximity of GM structures between
automation systems. The efficiency of data conversion between environments using the above-mentioned
formats must be evaluated using metrics. These topical issues are the subject of our research.




Figure 1: Information support for product lifecycle stages

2. Problems of the data conversion process between automated systems
   Automated systems, which include computer-aided design systems, product data management
systems, and others, play an important role in design. An example of a geometric model in an automated
system is shown in Figure 2. These systems carry out the calculations the engineer needs while
developing the product model in CAD, using the data stored in the PDM. If the behavior of products
has to be calculated, specialized engineering analysis systems can be connected to the PDM [7]. Through
its interaction with the PDM, the CAD system then has access to the results produced by these systems.




Figure 2: An example of geometric model in automated system

  There are several levels of interaction of heterogeneous automated systems [8]. The highest level is
when a single data model is used across the whole enterprise. All computer systems (CAD, PDM, automated
enterprise management system (ASUP), etc.) work with a single database. However, this level
of interaction is very difficult to implement in practice.
    Another level of interaction uses direct access to the database. Each system has its own database
and can send data to and receive data from the other systems. This method is found in practice: for
example, the T-FLEX DOCs PDM system has a mechanism for its implementation.
    The main problem of this level of interaction is that manufacturers usually offer vendor-specific
solutions. There are no universal solutions [9], and the integration mechanisms are hidden, so a more
universal system cannot be defined. Heterogeneous automated systems can also interact
via application programming interfaces (APIs), as represented in Figure 3 [10]. When
a large number of systems are deployed in an enterprise, a large number of converters are required to
ensure data conversion, which considerably raises implementation costs. Another
disadvantage is the need to rework the software completely when one of the systems is
replaced with a system from another manufacturer, or when the API of any
of the systems changes.




Figure 3: Interaction of the automated systems via the APIs

    There is also the concept of a Unified Information Space (UIP), which includes the concept of PLM
technologies [11] and the concept of IPI technologies [12]. This concept involves the use of files for
data exchange between systems. During conversion, the first system generates a file that contains the
transmitted data, and the second system reads this file after receiving it. The file is created and read
by special converters that translate the data from the application system format to the exchange file
format and vice versa. A neutral format, such as the ISO 10303 STEP standard [13], can be chosen as the
exchange format.
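
    The cost argument for exchanging data through a neutral format can be illustrated with a simple count.
The sketch below is not taken from the paper: it assumes that every pair of systems exchanges data in both
directions, that each direction of a direct link needs its own converter, and that with a neutral format
each system needs exactly one exporter and one importer. The function names are illustrative.

```python
def direct_converters(n_systems: int) -> int:
    """Converters needed when every system exchanges data directly with every other.

    Assumption: one converter per ordered pair (source system, target system).
    """
    return n_systems * (n_systems - 1)


def neutral_format_converters(n_systems: int) -> int:
    """Converters needed when all exchange goes through a single neutral format.

    Assumption: each system needs one exporter and one importer.
    """
    return 2 * n_systems


# Compare the two counts for a few enterprise sizes.
for n in (2, 3, 5, 10):
    print(n, direct_converters(n), neutral_format_converters(n))
```

Under these assumptions the two approaches need the same number of converters for three systems; for
larger numbers of systems the neutral format needs fewer, at the price of the conversion losses studied
in this paper.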
    The concept of PLM is to perform tasks using a set of software products from a single developer.
However, an engineer may then be unable to replace an individual program with one from another
vendor and can only replace the entire suite. On the other hand, the use of systems from independent
vendors can lead to problems with data integration and transfer, i.e. with converting data without
significant losses.
    The concept of IPI technologies is to free the user from dependence on a single developer by using a
neutral data conversion format (Figure 4). This approach is based on a unified information space, UIP
(the international term is shared data environment, SDE), which is implemented using international data
presentation standards. The IPI strategy includes information support for the product lifecycle based on
the use of an integrated information environment, paperless presentation of information, the use of
electronic digital signatures, standardization of information descriptions of management objects,
improvement of business processes, parallel engineering, parallelization of a number of design works
and stages of the product lifecycle, and others [14].

Figure 4: The concept of IPI technologies in the UIP (SDE) organization

    One of the main stages in the implementation of the IPI strategy is the creation of the unified
information space of the enterprise, which is based on interacting CAD and PDM [15].
    In world practice, there are many examples of successful application of the IPI concept at enterprises
of various industries [16]: aircraft construction, automotive industry, mechanical engineering,
medicine. In Russia, for example, JSC "Tupolev", Voronezh Mechanical Plant, AVKP "Sukhoi" and
others have successfully implemented the IPI concept in their production cycles.
    Open distributed automated systems for design and management at industrial enterprises are the
basis of modern IPI technologies. The main problem is the transition to a uniform description and
interpretation of data, regardless of where and when the data enter the system, which may be global
in scale.

3. Methods for Comparing the Proximity of Data Structures
    Each product can be described by a tree, which is a graph representation of its hierarchical
structure. Using a product throughout the lifecycle stages and in operation involves checking at
each stage how the structure has changed. To compare structures, it is therefore proposed to apply graph
theory to describe the methods for comparing the proximity of data structures [17].
    The following examples of a product model were used to analyse the conversion
process. The initial model is the GM of some product, which needs to be
converted to another vendor's automation system. The final model is the outcome of the
conversion to that system, so that the set of operations associated with the conversion
process and with the GM data can be identified at the output.
    It is required to identify probable difficulties in converting a GM between
heterogeneous automation systems using graphs, and to propose a mathematical description in the form
of a "tree". We have created the structure of the GM using graphs, presented as a set of elements of the
product model. A graph is a mathematical object defined by two sets: a set of vertices and a set of
edges. The set of elements of the product model includes
integration parameters, geometry parameters, attributive information [18], and the proposed
structure containing frames [19] and a product tree. The integration parameters include
information such as: whether an open, vendor-supported CAD or PDM API is available [20];
the presence of CAD API functions for creating, converting and synchronizing properties and attribute
information of CAD files; the presence of PDM functions for structured loading/unloading, tracking and
managing CAD data, etc. The GM is represented by a structure comprising a product tree and frames
containing data about GM parameters. The structure of the product model is denoted as the graph
G = (X, A), where X is the set of vertices and A describes the arcs between them. The graph can include
versions of the above-mentioned elements as well as their characteristics.
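
    A structure of this kind, a product tree whose vertices carry frames of GM parameters, can be
sketched as follows. This is a minimal illustration, not the authors' implementation: the class name
`GMNode`, the function `to_graph`, and the slot names in the frames ("format", "units", "material",
"type") are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GMNode:
    """One vertex of the product tree together with its frame of GM parameters."""
    name: str
    frame: dict = field(default_factory=dict)     # slot name -> value (hypothetical slots)
    children: list = field(default_factory=list)  # child GMNode objects

    def add(self, child: "GMNode") -> "GMNode":
        """Attach a child vertex and return it, so that trees can be built step by step."""
        self.children.append(child)
        return child


def to_graph(root: GMNode):
    """Flatten the tree into G = (X, A): a list of vertex names and a list of arcs."""
    vertices, arcs = [], []
    stack = [root]
    while stack:
        node = stack.pop()
        vertices.append(node.name)
        for child in node.children:
            arcs.append((node.name, child.name))
            stack.append(child)
    return vertices, arcs


# A three-vertex example: assembly -> part -> feature.
assembly = GMNode("assembly", {"format": "native CAD", "units": "mm"})
part = assembly.add(GMNode("part_1", {"material": "steel"}))
part.add(GMNode("feature_1", {"type": "hole"}))

X, A = to_graph(assembly)
print(X)
print(A)
```

Keeping the frame on the vertex means that losing a vertex during conversion loses both its parameters
and the arc that leads to it, which is the behaviour the comparison stages rely on.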
    There are several stages of comparison. The first is a proximity comparison of GM trees only, based
on a mathematical representation of the trees as adjacency matrices. The adjacency matrix is a
square matrix with logical values (0 or 1). The graph consists of vertices and edges, which are links
between the vertices. For the tree graph of a product model structure, the data on GM parameters is
held primarily in the vertices and is also reflected in the presence of the edges. If a vertex of the
tree graph is lost, the edge leading to it is also lost. A part of the assembly of the original GM is
shown in Figure 5.
    The adjacency matrix is a binary square matrix: its elements take the values 1 or 0, and the
number of rows equals the number of columns. The matrix has dimension n x n (where n is the number of
vertices of the structure as a graph) and uniquely represents the structure. It is one of the ways
of representing a graph as a matrix. When the matrix is written out, a header row and a header column,
which are not part of the matrix, may be added for ease of perception; they contain the vertex numbers,
and the pair of numbers at whose intersection an element lies gives the indices of that element [21].
Let A = {a_ij}, i, j = 1, 2, ..., n; each element of the matrix is then defined as follows:

$$
a_{ij} =
\begin{cases}
1, & \text{if there is an arc } (x_i, x_j), \\
0, & \text{if there is no arc } (x_i, x_j).
\end{cases}
$$

    Such binary matrices are used to analyse the conversion process and to identify differences in graph
structure that are not otherwise apparent. In the conversion assessment task, this means identifying the
difference in the structure of the product model after data conversion in a multi-vendor
environment. The graphs are compared through their matrix representations: algebraic operations are
performed on the matrices to determine how similar or different the graphs are. The adjacency
matrix of the original GM, in symbolic form and with its binary values, is shown below.

$$
A =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{19} \\
a_{21} & a_{22} & \cdots & a_{29} \\
\vdots & \vdots & \ddots & \vdots \\
a_{91} & a_{92} & \cdots & a_{99}
\end{pmatrix},
\qquad
A =
\begin{pmatrix}
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\tag{1}
$$
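
    The binary matrix in (1) can be reproduced from a list of arcs. In the sketch below the arc list is
read off the nonzero entries of (1); the function name `adjacency_matrix` and the 1-based vertex
numbering are illustrative choices, not part of the paper.

```python
def adjacency_matrix(n, arcs):
    """Return the n x n binary adjacency matrix for arcs (i, j) with 1-based vertex numbers."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in arcs:
        matrix[i - 1][j - 1] = 1  # a_ij = 1 if there is an arc (x_i, x_j)
    return matrix


# Arcs of the original GM tree, read off the nonzero entries of matrix (1).
ARCS_ORIGINAL = [(1, 2), (1, 3), (2, 4), (3, 5), (3, 6), (5, 7), (5, 8), (6, 9)]

A = adjacency_matrix(9, ARCS_ORIGINAL)
for row in A:
    print("".join(str(value) for value in row))
```

A tree with nine vertices has eight arcs, so the matrix contains exactly eight ones.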

    Next, the GM was converted and then transferred to another vendor's automation system, resulting
in some collisions. For clarity, a part of the converted GM assembly is shown in Figure 5.




Figure 5: An example of the original GM and a possible result of the GM conversion

   As depicted in Figure 5 and Figure 6, the conversion to another vendor's automation system involves
some data losses. Because the transferred GM is compared to the original GM, its binary square
adjacency matrix must have the same size as that of the original GM. The size of an
adjacency matrix is determined by the number of vertices in the graph, so a procedure is
required to add zero rows and columns at the positions of the lost data (the so-called "right places"),
which must first be determined. Part of the GM assembly after data conversion is shown in Figure 6.

Figure 6: Final result of the GM conversion

   The adjacency matrix and binary matrix values of this GM are as follows:

$$
B =
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{19} \\
b_{21} & b_{22} & \cdots & b_{29} \\
\vdots & \vdots & \ddots & \vdots \\
b_{91} & b_{92} & \cdots & b_{99}
\end{pmatrix},
\qquad
B =
\begin{pmatrix}
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\tag{2}
$$
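
    The step of adding zero rows and columns in the "right places" can be sketched as follows. The
sketch assumes, for illustration only, that the vertices numbered 7 and 8 are the ones lost in
conversion; this is consistent with matrix (2) but is not stated explicitly in the paper. The function
name `pad_to_original` and the compact 7 x 7 matrix of the converted GM are likewise illustrative.

```python
def pad_to_original(compact, surviving, n):
    """Expand a matrix over the surviving vertices to n x n by inserting zero rows and columns.

    surviving[k] is the 1-based original number of the vertex in row/column k of `compact`.
    """
    padded = [[0] * n for _ in range(n)]
    for r, i in enumerate(surviving):
        for c, j in enumerate(surviving):
            padded[i - 1][j - 1] = compact[r][c]
    return padded


# Converted GM over its seven surviving vertices (assumed: vertices 7 and 8 were lost).
SURVIVING = [1, 2, 3, 4, 5, 6, 9]
COMPACT = [
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]

B = pad_to_original(COMPACT, SURVIVING, 9)
for row in B:
    print("".join(str(value) for value in row))
```

After padding, the matrix of the converted GM has the same 9 x 9 size and vertex numbering as the
matrix of the original GM, so the two can be compared element by element.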

4. Application of metrics in determining the proximity of data structures
   We carried out a study of the proximity of GMs using graph theory. The structures of the GM
and of the transfer results are represented mathematically as graphs and contain the groups of
parameters described above, including integration data. This structure is the source
of data for determining the structural weights of GM elements [22]. The accuracy of the data conversion
estimate can be improved further, which requires additional conversion data for each node of
the GM tree. For this, a graph structure [23] of GM transfer parameters is applied. The graph structure
includes a tree-like graph and, for each GM node, a data set in the form of a frame
containing a list of GM parameters. The initial layer contains the GM parameter data for the whole
product, represented as a tree view. In addition, each node of the next level contains an additional list
of parameters, represented as frames. Post-conversion comparisons were considered within the
assembly, within each individual structure level and at the node level. The data structure of the original
GM is shown in Figure 7.
   The proximity of the graphs is calculated by applying a metric based on the Hamming distance
expression if nominal conversion data is required:
$$
d_{ij} = \sum_{k=1}^{p} |x_{ik} - x_{jk}|, \tag{3}
$$

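   Applied to the adjacency matrices of the two GMs, this gives the following expression:

$$
d_{AB} = \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij} - b_{ij}|, \tag{4}
$$

where a_ij is the element in the i-th row and j-th column of the adjacency matrix of the 1st GM; b_ij is
the corresponding element for the 2nd GM obtained after conversion; n is the number of vertices, i.e.
the dimension of the matrices.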
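
    As a worked illustration, the Hamming-type distance can be evaluated for the two binary matrices
given in (1) and (2). The sketch below copies the rows of both matrices as strings; the helper names
`parse` and `hamming_distance` are illustrative.

```python
# Rows of the binary matrices (1) and (2), copied as strings of 0/1 digits.
A_ROWS = ["011000000", "000100000", "000011000", "000000000", "000000110",
          "000000001", "000000000", "000000000", "000000000"]
B_ROWS = ["011000000", "000100000", "000011000", "000000000", "000000000",
          "000000001", "000000000", "000000000", "000000000"]


def parse(rows):
    """Turn strings of binary digits into a list of integer rows."""
    return [[int(ch) for ch in row] for row in rows]


def hamming_distance(a, b):
    """Sum of |a_ij - b_ij| over all entries of two equally sized matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size")
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


d = hamming_distance(parse(A_ROWS), parse(B_ROWS))
print(d)
```

For these two matrices the distance is 2: the matrices differ only in the entries a57 and a58, the two
arcs leaving vertex 5 that are absent after conversion.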




Figure 7: An example of graphical structure of a GM

   Another calculation of the proximity of graphs using the metric is based on the Sorensen measure if
quantitative conversion data is required:
$$
K_S = \frac{2c}{a + b}, \tag{5}
$$

where a is the number of parameters of the 1st GM, whose parameter set is {X1,X2,X3,X4,X5,X6,X7,X8,X9};
b is the number of parameters of the 2nd GM obtained as a result of the conversion, whose parameter set
is {X1,X2,X3,X4,X5,X8,X9}; c is the number of parameters common to the 1st and 2nd GM, with the common
set {X1,X2,X3,X4,X5,X8,X9}. In this example a = 9, b = 7 and c = 7.
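
    The Sorensen measure for the parameter sets listed above can be computed directly with set
operations. This is a minimal sketch; the function name `sorensen` and the variable names are
illustrative.

```python
def sorensen(first, second):
    """Sorensen measure K_S = 2c / (a + b) for two parameter sets."""
    first, second = set(first), set(second)
    if not first and not second:
        raise ValueError("at least one set must be non-empty")
    common = first & second  # parameters present in both GMs
    return 2 * len(common) / (len(first) + len(second))


params_original = {f"X{k}" for k in range(1, 10)}               # X1 ... X9
params_converted = {"X1", "X2", "X3", "X4", "X5", "X8", "X9"}   # X6 and X7 lost

k_s = sorensen(params_original, params_converted)
print(k_s)  # 2 * 7 / (9 + 7) = 0.875
```

A value of 1 corresponds to identical parameter sets, and a value of 0 to sets with no parameters in
common.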
    Problems in converting product models very often do not scale linearly with the number of
elements in the GM but depend on the formats and vendors of the design automation systems. It was
therefore necessary to find out what data loss is possible during conversion between different software
vendors and to what extent neutral formats can be applied for data conversion between them. The
conversion experiments with neutral formats yielded metric values, based on comparing
the proximity of the GM graphs, ranging from 0 to 0.5. The value will be different for each vendor, so each
case should be considered in detail: it is necessary to assess how satisfactory the obtained result is,
what the conversion losses were, and what additional recovery costs will be required. It was found that
when engineering automation systems of a single vendor are used, the conversion problems are not format
dependent but rather random. If production plants use software from different vendors, a dependence
on vendor formats was found. The different formats are often incompatible, resulting in more data loss
and higher recovery costs. It was found that neutral formats can be used under certain
conditions: for example, when the losses are small and will not affect the further development of the
product and work with the product model. Proper evaluation of data conversion losses should have a
positive impact on the further support of the product life cycle stages.

5. Conclusion
    It was found that there is no model supporting data conversion between automation systems in the
form of a generalized machine-independent model based on comparing the proximity of graphs and graph
structures. The proposed model makes it possible to estimate the labor intensity of data
recovery after losses during data conversion, using graphs and graph structures. We propose
a methodology for obtaining the values of the metric for estimating data conversion losses. The
principles and problems of integrating automation systems and product data management systems
are identified. An estimation of data conversion in the interaction of heterogeneous automated systems
within the framework of the UIP, based on metric estimates, is proposed.

6. References
[1] Norenkov I.P., Kuzmin P.K., Information support for knowledge-intensive products CALS
     technology, Bauman Moscow State Technical University Publishers, Moscow, 2002, 320 p. (in
     Russian).
[2] Norenkov I.P., Fundamentals of computer-aided design, Bauman Moscow State Technical
     University Publishers, Moscow, 2002, 336 p. (in Russian).
[3] Kutin, A., Dolgov, V., Sedykh, M., Ivashin, S., Integration of Different Computer-aided Systems
     in Product Designing and Process Planning on Digital Manufacturing. Procedia CIRP 67, 2018,
     pp. 476-481. doi:10.1016/j.procir.2017.12.247.
[4] Filinskikh, A.D., Component approach to the translation of geometric models. 8th International
     Scientific Conference on Computing in Physics and Technology, 2020, pp. 220-225.
     doi:10.30987/conferencearticle_5fce27710c7721.55039399.
[5] Yablochnikov E.I., Fomina Y.N., Salomatina A.A., Computer Technologies in Product Life Cycle,
     SPbSU ITMO, SPb, 2010, 180 p. (in Russian).
[6] Shilovitsky O., How to re-invent CAD / PDM integration, 2014. URL:
     http://beyondplm.com/2014/05/19/how-to-re-invent-cad-pdm-integration.
[7] Koucky S., Essentials of managing product design data, 2001. URL:
     https://www.machinedesign.com/archive/article/21816798/essentials-of-managing-product-design-data.
[8] Filinskikh A.D., Analysis of the state of project management systems at Russian enterprises,
     Bulletin of the Belgorod State Technological University named after V. G. Shukhov, No. 1, 2011,
     pp. 162-167. (in Russian).
[9] Gujarathi G.P., Ma Yongsheng, Parametric CAD/CAE integration using a common data model,
     Journal of Manufacturing Systems, 30(3), August 2011, pp. 118-132.
     doi:10.1016/j.jmsy.2011.01.002.
[10] Lee K. Fundamentals of CAD (CAD/CAM/CAE), St. Petersburg: Piter, 2004, 560 p. (in Russian).
[11] Grieves M.W., Product lifecycle management: the new paradigm for enterprises (2005)
     International Journal of Product Development, 2 (1-2), pp. 71-84. doi: 10.1504/ijpd.2005.006669.
[12] Zakovryashin A.I., IPI technology for creating high-tech products, Electronic journal "Trudy
     MAI", No. 49, 2011, URL:
     http://trudymai.ru/upload/iblock/c01/ipi-tekhnologiya-sozdaniya-naukoemkikh-izdeliy.pdf (in Russian).
[13] ISO 10303-1:1994. Industrial automation systems and integration - Product data representation and
     exchange - Part 1: Overview and fundamental principles, Geneva: ISO, 1994, 17 p.
[14] Levin A.I., Davydov A.N., Barabanov V.V., Concept for the development of CALS technologies
     in Russian industry, Moscow: Research Center of CALS-technologies "Applied Logistics", 2002,
     130 p. (in Russian).
[15] Shalumov A. S., Nikishin S.I., Noskov-Kovrov V. N., Introduction to CALS-technologies,
     Textbook. Manual, Kovrov state technol. akad., 2003, 184 p. (in Russian).
[16] Tutte W., Graph Theory, Translated from English: Mir, Moscow, 1988, 424 p. (in Russian).
[17] Filinskikh A.D., Sosnina O.A., Boityakov A.A., Hierarchical space of geometric model
     parameters, Bulletin of Belgorod State Technological University n.a. V.G. Shukhov, No 2, 2015,
     pp.131-134. (in Russian).
[18] Boytyakov A.A., Generalized model of data transfer between CAD- and PDM-systems using
     frames, Bulletin of Belgorod State Technological University n.a. V.G. Shukhov, No 3, 2015, pp.
     115 - 119. (in Russian).
[19] Golitsyna T.D, Issues of integration of product data management systems (PDM) and CAD,
     Scientific and Technical Bulletin of SPbSU ITMO - Information Technologies, No 6, 2009, pp.
     543- 547. (in Russian).
[20] Bertsun V.N., Mathematical modeling on graphs. Part 2, Tomsk State University Press, Tomsk,
     2013. 88 p. (in Russian).
[21] Filinskikh A.D., Byasherov A.Kh., Analysis of parametric and graphic information transfer based
     on experimental data, Bulletin of Belgorod State Technological University n.a. V.G. Shukhov, No
     2, 2012, pp. 164-166. (in Russian).
[22] Kalinina N.A., Models and Procedures of Hierarchical Network Representation of the Subject Area
     to Support Knowledge Acquisition Processes, Thesis for the degree of Candidate of Technical
     Sciences, Nizhny Novgorod State Technical University, N. Novgorod, 2018, 180 p. (in Russian).