<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Serving Bosch Production Data as Virtual KGs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elem Güzel Kalaycı</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Irlan Grangel González</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felix Lösch</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guohui Xiao</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anees ul-Mehdi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evgeny Kharlamov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Calvanese</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bosch Center for AI</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Bosch Corporate Research</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Free University of Bozen-Bolzano</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Oslo</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Virtual Vehicle Research GmbH</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Analysis of manufacturing processes is vital for effective and efficient manufacturing. In complex industrial settings, such analysis must account for data that comes from many different and highly heterogeneous machines, and is thus affected by the data integration challenge. In this work, we show how this challenge can be addressed with semantics using Virtual Knowledge Graphs. For this purpose, we propose the SIB Framework, in which we semantically integrate Bosch manufacturing data. In this demo we present SIB in action on two scenarios for the analysis of the Surface Mounting Process (SMT) pipeline.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>electronic control units. The scenario of the demo is the product quality analysis
that is performed at the plants and that requires integration of vast amounts of
heterogeneous data. More precisely, the demo focuses on failure detection for
the Surface Mounting Process (SMT), a task that fundamentally relies on the integration
and analysis of data generated by the machines deployed in different phases
of the process. Such machines, e.g., for placing electronic components (SMD)
and for automated optical inspection (AOI) of solder joints, usually come from
different suppliers and they rely on distinct formats and schemata for managing
the same data across the process. Hence, the raw, non-integrated data does not
give a coherent view of the whole SMT process and hampers analysis of the
manufactured products. During the demo the attendees will be able to explore
the SMT Ontology we developed, observe sample SMT data, and inspect the mappings
between the data and the ontology. Moreover, we encoded relevant product analysis
tasks into a catalog of SPARQL queries formulated over the SMT Ontology. The
demo attendees will be able to explore the product analysis tasks, how they were
encoded in SPARQL, and how easily such complex tasks can be accomplished with
the help of SIB. In particular, the latter will be shown by comparing SPARQL
queries to the native database queries over the underlying SMT data.</p>
      <p>This demo accompanies our accepted in-use track paper at ISWC’20 [3].</p>
    </sec>
    <sec id="sec-2">
      <title>Our Solution</title>
      <p>Our SIB solution for the semantic integration of manufacturing data of the SMT
process is depicted in Figure 1. The raw manufacturing log data comes
as JSON files generated by various machines, and is then extracted and loaded
into a PostgreSQL database. To process queries posed over the VKG, SIB
relies on the state-of-the-art VKG framework Ontop, which computes answers to
end-user SPARQL queries by translating them into SQL queries and delegating the
execution of the translated SQL queries to the original data sources.</p>
      <p>Note that the VKG approach does not require materializing into a KG all
facts entailed by the ontology. The workflow of Ontop can be divided
into an offline and an online stage. As the first step of the offline stage, Ontop
loads the OWL 2 QL ontology and classifies it via the built-in reasoner, resulting
in a directed acyclic graph stored in memory that represents the complete
hierarchy of concepts and that of properties. In the second step, Ontop constructs a
so-called saturated mapping by compiling the concept and property hierarchies
into the original VKG mapping. This aspect is also important in SIB, since the
domain knowledge encoded in the ontology simplifies the design of
the mapping layer. During the offline stage, Ontop also optimizes the saturated
mapping by applying structural and semantic query optimization.</p>
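      <p>The construction of a saturated mapping can be illustrated with a minimal Python sketch. It is not Ontop's implementation: the class names, the column names, and the function saturate are hypothetical, and only the table names aoi_failures and smd_event are taken from the SMT data. The sketch assumes that a mapping assigns a list of SQL source queries to each class, and it compiles the concept hierarchy into the mapping by letting every class inherit the sources of all its subclasses.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def saturate(subclass_of, mapping):
    """Return a mapping in which each class also carries the SQL sources of all its subclasses."""
    # children[c] holds the direct subclasses of c
    children = defaultdict(set)
    for sub, sup in subclass_of:
        children[sup].add(sub)

    def descendants(cls, seen):
        # Depth-first walk over the (acyclic) concept hierarchy
        for child in children[cls]:
            if child not in seen:
                seen.add(child)
                descendants(child, seen)
        return seen

    classes = set(mapping) | {c for edge in subclass_of for c in edge}
    saturated = {}
    for cls in classes:
        sources = set(mapping.get(cls, ()))
        for sub in descendants(cls, set()):
            sources |= set(mapping.get(sub, ()))
        saturated[cls] = sorted(sources)
    return saturated


# Hypothetical hierarchy and mapping; column names are assumptions.
subclass_of = [("AoiFailure", "Failure"), ("SmdFailure", "Failure")]
mapping = {
    "AoiFailure": ["SELECT id FROM aoi_failures"],
    "SmdFailure": ["SELECT id FROM smd_event WHERE failed = 1"],
}
saturated = saturate(subclass_of, mapping)
```
]]></preformat>
      <p>In this sketch, the superclass Failure has no mapping assertion of its own, yet after saturation it is populated by both source queries. This is the sense in which the ontology can reduce the number of mapping assertions that have to be written by hand.</p>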
      <p>
        During the online stage, Ontop takes a SPARQL query and translates it into
SQL by using the saturated mapping. To do so, it applies a series of
transformations that we briefly summarize here [
        <xref ref-type="bibr" rid="ref2">2,6</xref>
        ]: (i) it rewrites the SPARQL query
w.r.t. the ontology; (ii) it translates the rewritten SPARQL query into an
algebraic tree represented in an internal format; (iii) it unfolds the algebraic tree
w.r.t. the saturated mapping, by replacing the triple patterns with their
optimized SQL definitions; and (iv) it applies structural and semantic techniques to
optimize the unfolded query. One of the key points in the last step is the
elimination of self-joins, which negatively affect performance in a significant way. To
perform this elimination, Ontop utilizes in an essential way the key constraints
defined in the data sources. In those cases where it is not possible to define these
key constraints explicitly in the data sources, or to expose them as metadata of
the data sources so that Ontop can use them, Ontop allows one to define them
implicitly, as part of the mapping specification. The data we worked
with in the Bosch use case was mostly log data, stored as separate tables
that often contain highly denormalized and redundant data. Consequently, a
significant number of constraints in the tables were not declared as
primary or foreign keys, which posed significant challenges for the performance
of query answering. To address these issues, we had to declare these constraints
manually and supply them as separate inputs to Ontop.
      </p>
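      <p>The role of key constraints in self-join elimination can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not the SQL that Ontop generates: the second row is made up, and the two SQL strings are hand-written stand-ins for an unfolded query and its optimized form. The table is created without a declared key, as in the log tables described above, so the knowledge that panelId identifies a row has to come from outside the database.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY is declared, mimicking log tables without explicit constraints.
conn.execute(
    "CREATE TABLE smd_panel ("
    " panelId TEXT, boardNo TEXT, machineName TEXT, processedTS TEXT, location TEXT)"
)
conn.executemany(
    "INSERT INTO smd_panel VALUES (?, ?, ?, ?, ?)",
    [
        ("p01", "b01", "SMD Machine 1", "24-04-2020", "mes01"),
        ("p02", "b01", "SMD Machine 2", "25-04-2020", "mes01"),  # made-up row
    ],
)

# Unfolding two triple patterns about the same panel yields a self-join on panelId.
unfolded = (
    "SELECT a.panelId, a.boardNo, b.machineName "
    "FROM smd_panel a JOIN smd_panel b ON a.panelId = b.panelId"
)
# If panelId is known to be a key, both aliases denote the same row,
# so the join can be replaced by a single scan of the table.
simplified = "SELECT panelId, boardNo, machineName FROM smd_panel"

rows_unfolded = sorted(conn.execute(unfolded).fetchall())
rows_simplified = sorted(conn.execute(simplified).fetchall())
same_answers = rows_unfolded == rows_simplified
```
]]></preformat>
      <p>The two queries agree only because panelId happens to be unique in the sample rows. Without a declared or manually supplied key, an optimizer cannot assume this, and the self-join has to stay in the generated SQL.</p>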
    </sec>
    <sec id="sec-3">
      <title>Demonstration Scenarios</title>
      <p>We prepared two scenarios for the demo:</p>
      <p>S1: SIB Deployment over Bosch data. In this scenario, the attendees
will gain a better understanding of the data integration challenge in the Bosch
SMT use case and of how it can be addressed offline with the help of semantic
technologies, prior to performing the actual data analyses. In particular, the attendees will
be able to look closer at Bosch manufacturing data and to understand the particularities
of the SMD and AOI data formats. Then, the attendees will study the Bosch SMT
Ontology by zooming into its classes and properties. Finally, they will be able
to study the mappings relating the ontology and the data.</p>
      <p>S2: Product analysis with SIB. In this scenario, the attendees will be able to
benefit from the deployed Bosch VKG solution. In particular, the attendees will
study several product analysis tasks for the Bosch SMT use case. Then, they will
study how these tasks can be expressed by means of suitable SPARQL queries
over the SMT Ontology. Notably, such queries use ontology terms to
refer to the relevant information assets, and thus are very close to the natural
language formulation of the analysis tasks, which in turn makes it easy for Bosch
engineers to formulate them. Then, the attendees will experience how to obtain
the respective analysis data from the process logs by simply executing
such queries over the underlying database via the SIB VKG engine. Finally,
the attendees will compare the SPARQL queries with their SQL counterparts to
see how much simpler the former are compared to the latter in terms of
size, number of joins, and readability of schema elements.</p>
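      <p>We now illustrate the data and the queries for the two scenarios. The data is
mainly based on two sets of relational tables: SMD Tables, the most notable of
which are smd_event, smd_location, smd_panel, and smd_components, and AOI Tables,
with aoi_event, aoi_location, aoi_panel, and aoi_failures. Consider a sample
record in one of these tables, smd_panel:</p>
      <table-wrap>
        <caption>
          <title>smd_panel</title>
        </caption>
        <table>
          <thead>
            <tr>
              <th>panelId</th>
              <th>boardNo</th>
              <th>machineName</th>
              <th>processedTS</th>
              <th>location</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>p01</td>
              <td>b01</td>
              <td>SMD Machine 1</td>
              <td>24-04-2020</td>
              <td>mes01</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>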
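      <p>To make the relation between such a record and the ontology concrete, the following Python sketch shows the triples that a mapping over smd_panel could expose for the sample record above. It is only an illustration: the IRI template, the class psmt:Panel, the property psmt:machineName, and the pairing of the column processedTS with the property psmt:pTStamp are assumptions, since the actual SIB mappings are not reproduced here. In a VKG these triples are not materialized; the mapping only defines how they could be derived from the rows on demand.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical IRI template for panels.
PANEL_IRI = "http://example.org/smt/panel/{panelId}"


def row_to_triples(row):
    """Derive the triples that a mapping over smd_panel could expose for one row."""
    subject = PANEL_IRI.format(panelId=row["panelId"])
    return [
        (subject, "rdf:type", "psmt:Panel"),                # assumed class name
        (subject, "psmt:pTStamp", row["processedTS"]),      # assumed column-to-property pairing
        (subject, "psmt:machineName", row["machineName"]),  # assumed property name
    ]


# The sample smd_panel record from the text
row = {
    "panelId": "p01",
    "boardNo": "b01",
    "machineName": "SMD Machine 1",
    "processedTS": "24-04-2020",
    "location": "mes01",
}
triples = row_to_triples(row)
```
]]></preformat>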
      <p>We prepared 13 analytical tasks for the demo; they are the result of
collaborative work and careful selection during two visits to Bosch plants and
meetings with Bosch line engineers and line managers. The queries offer a good
balance among three dimensions: they are representative of product analyses,
offer good coverage of product analysis tasks, and are complex enough
to account for a reasonable number of domain terms. Consider one such query
in natural language and in SPARQL:</p>
      <p>Query q3: “Return all panels processed from a given time T up to the
detection of a failure.”</p>
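      <p>The intended semantics of q3 can be spelled out in a few lines of plain Python before turning to the query language. The sketch below is an assumption-laden illustration rather than part of SIB: the panel identifiers, the timestamps, and the helper names are made up, and "the detection of a failure" is read as the earliest failure event after T. If no failure occurs after T, the sketch returns no panels.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from datetime import datetime


def at(hhmm):
    """Build a timezone-aware timestamp on 2018-06-01 (helper for the made-up data)."""
    return datetime.fromisoformat(f"2018-06-01T{hhmm}:00+02:00")


def panels_until_first_failure(panels, failure_times, start):
    """Return (panel, timestamp) pairs processed after start and before the first later failure."""
    later_failures = [t for t in failure_times if t > start]
    if not later_failures:
        return []
    first_failure = min(later_failures)
    return sorted(
        (panel, ts) for panel, ts in panels if start < ts < first_failure
    )


# Made-up panels and failure events
panels = [("p01", at("00:05")), ("p02", at("00:10")),
          ("p03", at("00:20")), ("p04", at("00:40"))]
failure_times = [at("00:03"), at("00:30"), at("00:50")]

result = panels_until_first_failure(panels, failure_times, at("00:06"))
```
]]></preformat>
      <p>With this data, the first failure after T = 00:06 occurs at 00:30, so only p02 and p03 are returned.</p>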
      <p>Despite the temporal nature of the query, it can be realized in SPARQL:
<preformat>SELECT DISTINCT ?panel ?ts ?eventTime
WHERE { ?panel psmt:pTStamp ?ts . {
    SELECT ?eventTime
    WHERE { ?eventfailure fsmt:eTStamp ?eventTime .
            FILTER (?eventTime &gt; '2018-06-01T00:06:00.000+02:00'^^xsd:dateTimeStamp) }
    ORDER BY (?eventTime) LIMIT 1 }
  FILTER (?ts &gt; '2018-06-01T00:06:00.000+02:00'^^xsd:dateTimeStamp &amp;&amp; ?ts &lt; ?eventTime) }</preformat>
</p>
      <p>3. Kalaycı, E.G., González, I.G., Lösch, F., Xiao, G., ul Mehdi, A., Kharlamov, E., Calvanese, D.: Semantic integration of Bosch manufacturing data using virtual knowledge graphs. In: Proc. ISWC. (2020)</p>
      <p>4. Kharlamov, E., Hovland, D., Skjaeveland, M.G., Bilidas, D., Jiménez-Ruiz, E., Xiao, G., Soylu, A., Lanti, D., Rezk, M., Zheleznyakov, D., Giese, M., Lie, H., Ioannidis, Y., Kotidis, Y., Koubarakis, M., Waaler, A.: Ontology based data access in Statoil. J. Web Semantics 44 (2017) 3–36</p>
      <p>5. Xiao, G., Ding, L., Cogrel, B., Calvanese, D.: Virtual knowledge graphs: An overview of systems and use cases. Data Intelligence 1(3) (2019) 201–223</p>
      <p>6. Xiao, G., Kontchakov, R., Cogrel, B., Calvanese, D., Botoeva, E.: Efficient handling of SPARQL OPTIONAL for OBDA. In: Proc. ISWC. LNCS, Springer (2018) 354–373</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Query-based comparison of mappings in ontology-based data access</article-title>
          .
          <source>In: Proc. KR</source>
          , AAAI Press (
          <year>2016</year>
          )
          <fpage>197</fpage>
          -
          <lpage>206</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cogrel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Komla-Ebri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rezk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Ontop: Answering SPARQL queries over relational databases</article-title>
          .
          <source>Semantic Web J</source>
          .
          <volume>8</volume>
          (
          <issue>3</issue>
          ) (
          <year>2017</year>
          )
          <fpage>471</fpage>
          -
          <lpage>487</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>