=Paper= {{Paper |id=Vol-3632/ISWC2023_paper_465 |storemode=property |title=Literal-Aware Knowledge Graph Embedding for Welding Quality Monitoring |pdfUrl=https://ceur-ws.org/Vol-3632/ISWC2023_paper_465.pdf |volume=Vol-3632 |authors=Zhipeng Tan,Zhuoxun Zheng,Antonis Klironomos,Mohamed H. Gad-Elrab,Guohui Xiao,Ahmet Soylu,Evgeny Kharlamov,Baifan Zhou |dblpUrl=https://dblp.org/rec/conf/semweb/TanZKG0SKZ23 }} ==Literal-Aware Knowledge Graph Embedding for Welding Quality Monitoring== https://ceur-ws.org/Vol-3632/ISWC2023_paper_465.pdf
Literal-Aware Knowledge Graph Embedding for Welding Quality Monitoring
Zhipeng Tan1,2,*, Zhuoxun Zheng1,3, Antonis Klironomos1,4, Muhammed Gad1,
Guohui Xiao5, Ahmet Soylu6, Evgeny Kharlamov1,3,* and Baifan Zhou6,3,*
1 Bosch Center for AI, Germany
2 RWTH Aachen, Germany
3 Department of Informatics, University of Oslo, Norway
4 University of Mannheim, Germany
5 University of Bergen, Norway
6 Department of Computer Science, Oslo Metropolitan University, Norway


Abstract
Recently there has been a series of studies on knowledge graph embedding (KGE), which attempts to
learn embeddings of entities and relations as numerical vectors and mathematical mappings
via machine learning (ML). However, there has been limited research that applies KGE to industrial
problems in manufacturing. This paper investigates whether and to what extent KGE can be used for an
important problem: quality monitoring for welding in the manufacturing industry. Welding is an important
process that accounts for the production of millions of cars annually. The work is in line with our research
on data-driven solutions that are intended to replace traditional, costly quality monitoring. The paper tackles
two challenging questions simultaneously: how large the welding spot diameter is, and to which car body
the welded spot belongs. The problem setting is difficult for traditional ML because a large number of
car bodies would have to be assigned as class labels. We formulate the problem as link prediction
and experiment with popular KGE methods on real industry data, with consideration of literals.
This paper accompanies the full paper in the in-use track and provides additional discussion on problem
formulation, literal handling strategies, and the information included in industrial KG construction.


                                1. Introduction
Background and Challenge. Research on knowledge graphs and their industrial applications
has attracted increasing attention [1]. Recently there has been a series of studies on knowledge
graph embedding (KGE), but there has been limited research that applies KGE to industrial
problems in manufacturing. This paper investigates whether and to what extent KGE can
be used for an important problem: quality monitoring for welding in the manufacturing
industry. We discuss automated welding, which plays a critical role in the automotive industry
for manufacturing high-quality car bodies, with millions of cars produced annually. The welding
process generates a vast amount of data, considering the number of welding machines in car
production lines and the thousands of spots on each car body (Fig. 1a). This large amount of
data increases the demand for data-driven solutions, which aim to reduce and eventually replace

                                ISWC2023: The 22nd International Semantic Web Conference, November 06–10, 2023, Athens, Greece
* Corresponding author.
zhipeng.tan@rwth-aachen.de (Z. Tan); evgeny.kharlamov@de.bosch.com (E. Kharlamov); baifanz@ifi.uio.no (B. Zhou)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073




[Figure 1 graphic. (a) Welding of a car body part: sensor inputs (current, voltage, resistance, time,
power, etc.), the moving direction, and the welding spot with diameter D. (b) Question 1: spot diameter?
Originally regression (predict a real number); after discretisation and entity creation, diameters become
classes (Diameter <0.1, 0.1-0.2, 0.2-0.3, ...) and the task becomes link prediction of the tail.
Example candidates (head, predicate, tail, score): Spot103 rdf:type Diameter 0.1-0.2, 0.65 (prediction
and label, rank=1); Spot103 rdf:type Diameter 0.2-0.3, 0.52; Spot103 rdf:type Diameter 0.3-0.4, 0.34.
(c) Question 2: which car body part? Originally classification (predict the car body part label); after
entity creation, car body parts become entities. Example candidates: Spot103 belongsTo Carbody part1,
0.76 (prediction); Spot103 belongsTo Carbody part2, 0.55 (label, rank=2); Spot103 belongsTo Carbody
part3, 0.43. Legend: owl:Class, rdf:type, owl:ObjectProperty, rdfs:subClassOf.]

Figure 1: Two core questions in welding quality monitoring: Question 1 (Q1), how large is the spot
diameter? Question 2 (Q2), which car body part does this spot belong to?

conventional destructive, yet extremely expensive and inefficient, methods. To address this
challenge, two core questions need to be answered, as shown in Fig. 1b and Fig. 1c. First, spot
diameter (Q1) is the key indicator for evaluating welding quality, since the diameter must
be neither too large nor too small. Second, the car body part of the spot (Q2) is important because
it is essential to know the percentage of good spots for each car body, but the carbody-spot
correspondence information does not exist in the large amount of historical data silos from past
welding systems. Current research applies ML and Digital Twins in industry [2], but it addresses
this challenge, and especially the two core questions, only to a limited extent.
Our Approach. We investigate KGE for answering the two questions in the automotive industry
with production data, and compare it with representative classic ML. We first construct KGs from
tabular data with special handling of literals; then, we conduct experiments that compare
mainstream KGE methods such as TransE, RotatE, and AttH with a multilayer perceptron (MLP) and
with the KGE variant applied in [3]. This poster paper accompanies our full paper at ISWC
2023 [4]. It provides additional discussion on problem formulation, literal handling,
and industrial KG construction, which was not present in the full paper.

2. Approach
Welding KG construction. A welding KG is constructed from tabular data collected from a
German factory. We use welding-related information, such as the time of welding processes,
welding machines, welding programs, and welding parameters (e.g., voltage, current, resistance).
The construction centres on welding spots, car bodies, and diameters. We transform
the values of the welding data table into entities and the relationships between these entities into
edges in the KG. Fig. 2b shows the construction of literal entities, which are entities generated
from numeric values. How the literals are handled is described later in this section.
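The table-to-graph step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the paper: the rows, the column names, and the column-to-predicate mapping are all assumptions, with predicate names loosely following the edge labels in Fig. 2b.

```python
# Hypothetical rows; column names and values are illustrative, not factory data.
rows = [
    {"spot_id": "Spot103", "machine": "Machine1", "program": "Program1"},
    {"spot_id": "Spot104", "machine": "Machine1", "program": "Program1"},
]

# Assumed mapping from table columns to KG predicates.
COLUMN_TO_PREDICATE = {"machine": "producedBy", "program": "hasProgram"}


def rows_to_triples(rows, key="spot_id", mapping=COLUMN_TO_PREDICATE):
    """Turn each cell of a mapped column into a (head, predicate, tail) triple."""
    triples = set()
    for row in rows:
        head = row[key]  # the welding spot is the head entity
        for column, predicate in mapping.items():
            value = row.get(column)
            if value is not None:
                triples.add((head, predicate, value))
    return sorted(triples)


triples = rows_to_triples(rows)
```

Using a set means that repeated cell values (here, both spots share Machine1) yield one shared tail entity rather than duplicated nodes, which is what connects spots to each other in the graph.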
Problem formulation. Fig. 1b and Fig. 1c show in more detail the two research questions
of quality monitoring in the use case and our approach to reformulating them: given the
information about a welding spot, we want to predict the car body of this welding spot and
the diameter of the welding spot. As shown in Fig. 1, both problems are reformulated.
Spot diameter prediction was originally a regression problem that predicts the real-valued
diameter size from the welding data. Due to the limited resolution when measuring the spot diameter,
[Figure 2 graphic. (a) The current (I) curve over time is split into stage1, stage2 and stage3; aggregation
yields per-spot features (Spot103: I_mean 0.72, I_1_mean 0.63, I_2_mean 0.80, I_3_mean 0.60; Spot104:
I_mean 0.57, I_1_mean 0.52, I_2_mean 0.70, I_3_mean 0.49); each feature is discretised into literal
entities (I_mean <0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5, 0.5-0.6, ...), which are then connected to
the KG. (b) Partial welding KG around Spot103 with nodes Machine1, Program1, Car body part2,
Component1, Component2, Component3, the literal entities I_mean <0.1 and U_mean <0.1, and the
diameter classes (Diameter <0.1, 0.1-0.2, 0.2-0.3, ...). Edge labels legible in the graphic:
hasCurrentmean, hasVoltagemean, belongsTo, producedBy, operatedBy, hasProgram, hasCarbodypart,
hasComponent1, hasComponent2, hasComponent3. The links to the diameter class and to the car body
part are marked "?" (to be predicted). Legend: owl:Class, rdf:type, owl:ObjectProperty, rdfs:subClassOf.]

Figure 2: (a) Procedure of literal embedding. (b) Partial illustration of the welding KG.

we discretise the diameters into diameter classes, which turns the task into a classification problem.
We then construct entities for the diameter classes and the car body classes. The links between
welding spots and the diameter classes or the car bodies are then predicted.
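To make the link-prediction view concrete, the following sketch ranks candidate car body entities for one spot with a TransE-style score (negative distance between head + relation and tail). The embeddings are random and untrained, and the entity names and dimension are assumptions chosen only for illustration, so the resulting ranking carries no meaning beyond showing the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # assumed embedding size

# Untrained, random embeddings purely for illustration.
entities = ["Spot103", "Carbody_part1", "Carbody_part2", "Carbody_part3"]
entity_emb = {name: rng.normal(size=dim) for name in entities}
relation_emb = {"belongsTo": rng.normal(size=dim)}


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between h + r and t (higher is better)."""
    h, r, t = entity_emb[head], relation_emb[relation], entity_emb[tail]
    return -float(np.linalg.norm(h + r - t))


def rank_tails(head, relation, candidates):
    """Score every candidate tail and sort from most to least plausible."""
    scored = [(tail, transe_score(head, relation, tail)) for tail in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = ["Carbody_part1", "Carbody_part2", "Carbody_part3"]
ranking = rank_tails("Spot103", "belongsTo", candidates)
predicted_tail = ranking[0][0]  # the top-ranked tail is the prediction
```

The same ranking routine would apply to Q1 by swapping the relation for rdf:type and the candidates for the diameter class entities.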
Literal handling. We handle literals with the following steps, inspired by [5]. The
numeric literals of the knowledge graph are embedded via aggregation, discretisation
of features, entity creation, and linking. As shown in Fig. 2, in the aggregation step the
sensor measurements are aggregated into real-valued means for each of the three stages and an
overall mean. In the discretisation step, we discretise the real values into
ranges. Finally, we create entities for the discretised ranges and link them to the welding spots.
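The four steps can be sketched end to end as below. This is a simplified illustration under stated assumptions: the readings are invented, the bin width of 0.1 mirrors the ranges drawn in Fig. 2, and the predicate naming scheme ("has" + feature name) is hypothetical rather than the naming used in the paper.

```python
import math
from statistics import mean

# Hypothetical per-stage current readings for one spot (not real sensor data).
readings = {
    "Spot103": {
        "stage1": [0.61, 0.65],
        "stage2": [0.83, 0.87],
        "stage3": [0.62, 0.66],
    }
}


def aggregate(stages):
    """Step 1: overall mean plus one mean per stage."""
    all_values = [v for values in stages.values() for v in values]
    features = {"I_mean": mean(all_values)}
    for i, name in enumerate(sorted(stages), start=1):
        features[f"I_{i}_mean"] = mean(stages[name])
    return features


def bin_label(value, width=0.1):
    """Step 2: map a real value to a fixed-width range label such as '0.6-0.7'."""
    index = math.floor(value / width)
    if index <= 0:
        return f"<{width:.1f}"
    return f"{index * width:.1f}-{(index + 1) * width:.1f}"


def literal_triples(spot, stages):
    """Steps 3 and 4: create one entity per (feature, range) and link it to the spot."""
    triples = []
    for feature, value in aggregate(stages).items():
        literal_entity = f"{feature}_{bin_label(value)}"  # entity creation
        triples.append((spot, f"has{feature}", literal_entity))  # linking
    return triples


triples = literal_triples("Spot103", readings["Spot103"])
```

Because two spots whose means fall into the same range point to the same literal entity, the numeric similarity between spots becomes visible to a structure-only KGE model.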

3. Evaluation and Discussion
We discuss the experiments and provide additional discussion on the problem formulation, the
literal handling and discretisation techniques, and the industrial KG construction.
Discussion on problem formulation. We discuss three promising ways: (1) classic ML
with MLP; (2) classic KGE with link prediction; (3) binary triple classification inspired by [3].
MLP. In manufacturing, quality monitoring aims at predicting diameters and car bodies. These questions
are formulated as classification problems, where car bodies and diameters are the predicted
classes. This formulation is verified by the MLP model. The MLP proves to be the most efficient model,
but it achieves the best performance only for nrmse on Q1 and provides inadequate performance on Q2
(see Tab. 1).

Table 1: Model performance comparison on answering Q1 and Q2. Bold: best results. Underlined: second best.

      Metric          MLP    TransE   RotatE   AttH
Q1    Acc(Hits@1)     0.39   0.42     0.25     0.31
Q1    MRR             -      0.65     0.49     0.57
Q1    nrmse           0.05   0.06     0.08     0.06
Q2    Acc(Hits@1)     0.61   0.64     0.52     0.53
Q2    MRR             -      0.77     0.69     0.70
Q2    Hits@Groupby3   -      0.85     0.81     0.79
KGE. Since there can be hundreds or even thousands of car bodies, Q2 is not well suited
to a classic ML formulation. We adapt the problem and reformulate it as link prediction (Fig. 2),
where the link between a welding spot and its correct car body should have the highest
score among all candidate car bodies. The same principle holds for diameter prediction.
This formulation is verified by the KGE models. We use the metrics Acc, MRR, and
Hits@Groupby3. We introduce the new metric Hits@Groupby3 because no KGE model delivers
satisfactory Acc: it relaxes the metric by testing the accuracy based
on groups of 3 car bodies. The results (see Tab. 1) show that TransE delivers the best results for
both Q1 and Q2 on Acc, MRR, and Hits@Groupby3.
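As a worked example of the rank-based metrics, the sketch below computes Acc (Hits@1) and MRR for the two illustrative queries whose candidate scores appear in Fig. 1. The helper names are our own. Hits@Groupby3 is left out because its exact grouping rule is not spelled out here.

```python
def rank_of_label(scores, label):
    """1-based rank of the true tail when candidates are sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(label) + 1


def hits_at_k(ranks, k):
    """Fraction of queries whose true tail is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Candidate scores taken from the two example queries in Fig. 1.
q1_scores = {"Diameter 0.1-0.2": 0.65, "Diameter 0.2-0.3": 0.52, "Diameter 0.3-0.4": 0.34}
q2_scores = {"Carbody part1": 0.76, "Carbody part2": 0.55, "Carbody part3": 0.43}

ranks = [
    rank_of_label(q1_scores, "Diameter 0.1-0.2"),  # label is ranked first
    rank_of_label(q2_scores, "Carbody part2"),     # label is ranked second
]
acc = hits_at_k(ranks, 1)          # 0.5
mrr = mean_reciprocal_rank(ranks)  # (1/1 + 1/2) / 2 = 0.75
```

MRR rewards the second query for placing the label near the top even though Hits@1 counts it as a miss, which is why the two metrics can diverge as in Tab. 1.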
KGE-MLP. The third possible problem formulation is inspired by [3], where link prediction is formulated
as binary classification with an output score between 0 and 1, where a value closer to 1 indicates
that the link exists. The score is compared with those of all other potential diameters or car bodies, and the
link with the highest score is selected as the prediction. This formulation is verified by the
KGE-MLP models. These models perform worse on Acc, MRR, and the other metrics than
the MLP and KGE models in our welding quality monitoring use case, but the approach is the state-of-the-art
(SotA) model in another use case [3]. We therefore still include it in our comparison.
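The binary-classification formulation can be sketched as a small network that maps a (head, tail) embedding pair to a probability, followed by an argmax over candidates. Everything here is an assumption made for illustration: the weights and embeddings are random, the layer sizes are arbitrary, and only one relation is modelled so the relation is not part of the input. It is not the architecture of [3].

```python
import numpy as np

rng = np.random.default_rng(1)
dim, hidden = 8, 16  # assumed sizes

# Random stand-ins for pretrained entity embeddings and trained MLP weights.
candidates = ["Carbody_part1", "Carbody_part2", "Carbody_part3"]
emb = {name: rng.normal(size=dim) for name in ["Spot103", *candidates]}
W1, b1 = rng.normal(size=(2 * dim, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=hidden), 0.0


def link_probability(head, tail):
    """Binary triple classifier: sigmoid output in [0, 1], closer to 1 means the link exists."""
    x = np.concatenate([emb[head], emb[tail]])
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU hidden layer
    logit = h @ W2 + b2
    return float(1.0 / (1.0 + np.exp(-logit)))


# Score every candidate tail and pick the one with the highest probability.
scores = {tail: link_probability("Spot103", tail) for tail in candidates}
prediction = max(scores, key=scores.get)
```

Unlike the distance-based KGE score, each candidate is judged independently here, so the scores only become a ranking once they are compared across candidates.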
Table 2: Best KGE model compared with KGE-MLP models. The KGE-MLP models are marked with *. Bold: best.

      Metric          TransE   TransE*   DistMult*   HolE*
Q1    Acc(Hits@1)     0.42     0.17      0.22        0.21
Q1    MRR             0.65     0.45      0.48        0.48
Q1    nrmse           0.06     0.11      0.09        0.10
Q2    Hits@1          0.64     0.34      0.34        0.37
Q2    MRR             0.77     0.48      0.52        0.41
Q2    Hits@GroupBy3   0.85     0.45      0.46        0.52

Discussion on literal handling. In SotA research, there are mainly two ways to handle literals:
discretisation (KGA) [5] and literals as embedding vectors (LiteralE) [6]. According to the
experiments in [5], discretisation with bins yields SotA results with large improvements over
traditional KGE. We also consider [6] not suitable, because it requires a fixed embedding
size for a fixed number of literals, while in real applications the number of literals can vary a lot.
This paper therefore chooses discretisation with bins, as in [5], to encode the literal information.
Other discretisation methods are compared and discussed in the next part.
Discussion on discretisation strategy. There are different discretisation approaches discussed
in the paper, including the single setting without overlapping, and the overlapping and hierarchical
settings. There are also two different bin creation methods, based on frequency or on fixed
values. According to the experimental results, the single setting with fixed-value bins is simple and the
performance differences between the discretisation strategies are insignificant, so we
choose this method for welding quality monitoring.
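The difference between the two bin creation methods can be illustrated with NumPy. This sketch covers only the single, non-overlapping setting; the sample values and the number of bins are invented, and the function names are our own.

```python
import numpy as np

# Hypothetical, skewed feature values (most readings cluster near 0.1).
values = np.array([0.05, 0.12, 0.13, 0.14, 0.15, 0.48, 0.52, 0.95])


def fixed_width_edges(x, n_bins):
    """Fixed-value bins: equally wide ranges between the minimum and maximum."""
    return np.linspace(x.min(), x.max(), n_bins + 1)


def frequency_edges(x, n_bins):
    """Frequency-based bins: edges at quantiles, so bins hold similar counts."""
    return np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))


def assign_bins(x, edges):
    """Use interior edges only, so bin indices fall in 0..n_bins-1."""
    return np.digitize(x, edges[1:-1])


fixed_ids = assign_bins(values, fixed_width_edges(values, 4))
freq_ids = assign_bins(values, frequency_edges(values, 4))

fixed_counts = np.bincount(fixed_ids, minlength=4)  # uneven: crowded first bin
freq_counts = np.bincount(freq_ids, minlength=4)    # balanced across bins
```

On skewed data, fixed-value bins keep ranges easy to interpret but crowd most spots into one literal entity, while frequency-based bins balance the entity degrees at the cost of irregular range boundaries.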
Discussion on included information in industrial KG construction. In the KG construction
step, we also notice that it is important to consider the impact of the tabular columns on the KGE
performance. The number of columns in production data is very large (over 200), but only a few
are important based on domain knowledge; most of them are meta-settings
or overlapping information that does not contribute to the KGE performance. This is a common
problem for industrial KGs, especially in industries such as manufacturing, mining, and chemistry,
where a large amount of information is collected but only a small part is essential for quality
monitoring. Thus, we need to choose the smallest set of columns that represents the most
important information for welding. This selection is done by iteratively updating the
features and evaluating the performance. Our observation is that the most important information
comprises (1) columns that define the graph structure of the KG, such as welding machine, welding
program, and the materials and thickness of the car body;
(2) sensor values (literals in the KG).
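The text does not spell out how the iterative selection is organised; one common way to do it is greedy forward selection, sketched below. The column names, the gain values, and the `toy_evaluate` function are all hypothetical stand-ins: in practice `evaluate` would rebuild the KG from the chosen columns, train a KGE model, and return a validation metric such as MRR.

```python
def greedy_column_selection(columns, evaluate, min_gain=0.01):
    """Add one column at a time, keeping it only if the metric improves by min_gain."""
    selected, best = [], evaluate([])
    remaining = list(columns)
    while remaining:
        # Try each remaining column on top of the current selection.
        scored = [(evaluate(selected + [c]), c) for c in remaining]
        score, column = max(scored)
        if score - best < min_gain:
            break  # no remaining column helps enough
        selected.append(column)
        remaining.remove(column)
        best = score
    return selected, best


# Hypothetical per-column contributions, used only to make the sketch runnable.
TOY_GAIN = {"machine": 0.20, "program": 0.15, "I_mean": 0.10, "firmware_version": 0.0}


def toy_evaluate(cols):
    """Stand-in for 'build KG, train KGE, return validation metric'."""
    return 0.3 + sum(TOY_GAIN[c] for c in cols)


selected, best = greedy_column_selection(list(TOY_GAIN), toy_evaluate)
```

With over 200 columns, each evaluation is a full training run, so in practice domain knowledge would be used first to shrink the candidate list before any such search.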


4. Conclusion and Outlook
This poster paper presents an extended abstract of our full paper [4] and provides additional
discussions. The research is under the umbrella of Neuro-Symbolic AI for Industry
4.0 at Bosch. We aim at enhancing manufacturing technology with both symbolic AI [7]
(such as semantic technologies) for improving transparency [1], and ML for predictive power.
We will further improve the performance of the KG embedding method and develop other
complementary technologies, such as ontologies [8, 9], ontology-based data access, etc.
Acknowledgements. The work was partially supported by EU projects Dome 4.0 (953163),
OntoCommons (958371), DataCloud (101016835), Graph Massiviser (101093202), EnRichMyData
(101093202), and SMARTEDGE (101092908) and the Norwegian Research Council funded projects
(237898, 308817).


References
[1] Z. Zheng, B. Zhou, D. Zhou, A. Soylu, E. Kharlamov, Executable knowledge graph for
    transparent machine learning in welding monitoring at Bosch, in: CIKM, 2022, pp. 5102–
    5103.
[2] Z. Huang, M. Fey, C. Liu, E. Beysel, X. Xu, C. Brecher, Hybrid learning-based digital
    twin for manufacturing process: Modeling framework and implementation, Robotics and
    Computer-Integrated Manufacturing 82 (2023) 102545.
[3] E. B. Myklebust, E. Jimenez-Ruiz, J. Chen, R. Wolf, K. E. Tollefsen, Knowledge graph
    embedding for ecotoxicological effect prediction, in: ISWC, 2019.
[4] Z. Tan, B. Zhou, Z. Zheng, O. Savkovic, Z. Huang, I. G. Gonzalez, A. Soylu, E. Kharlamov,
    Literal-aware KGE for welding quality monitoring, in: ISWC, 2023.
[5] J. Wang, F. Ilievski, P. A. Szekely, K.-T. Yao, Augmenting knowledge graphs for better link
    prediction, in: IJCAI, 2022.
[6] A. Kristiadi, M. A. Khan, D. Lukovnikov, J. Lehmann, A. Fischer, Incorporating literals into
    knowledge graph embeddings, in: ISWC, 2019.
[7] D. Rincon-Yanez, M. H. Gad-Elrab, D. Stepanova, K. T. Tran, C. C. Xuan, B. Zhou, E. Kharlamov,
    Addressing the scalability bottleneck of semantic technologies at Bosch, ESWC Industry
    (2023).
[8] B. Zhou, Z. Zheng, D. Zhou, Z. Tan, O. Savković, H. Yang, Y. Zhang, E. Kharlamov, Knowledge
    graph-based semantic system for visual analytics in automatic manufacturing, ISWC, 2022.
[9] Z. Zheng, B. Zhou, D. Zhou, A. Q. Khan, A. Soylu, E. Kharlamov, Towards a statistic ontology
    for data analysis in smart manufacturing, in: ISWC Posters, volume 3254, 2022.