=Paper= {{Paper |id=Vol-3759/paper18 |storemode=property |title=Towards SHACL-based Knowledge Graph Transformation of Visual Domain Knowledge |pdfUrl=https://ceur-ws.org/Vol-3759/paper18.pdf |volume=Vol-3759 |authors=Simon Steyskal,Stefan Bischof,Josiane Xavier Parreira,Erwin Filtz,Michael Baumgart,David Gruber,Maximilian Liebetreu,Florian Rötzer,Stephan Strommer |dblpUrl=https://dblp.org/rec/conf/i-semantics/Steyskal0PFBGLR24 }} ==Towards SHACL-based Knowledge Graph Transformation of Visual Domain Knowledge== https://ceur-ws.org/Vol-3759/paper18.pdf
                                Towards SHACL-based Knowledge Graph
                                Transformation of Visual Domain Knowledge
                                Stefan Bischof¹, Erwin Filtz¹, Josiane Xavier Parreira¹, Simon Steyskal¹,
                                Michael Baumgart², David Gruber², Maximilian Liebetreu², Florian Rötzer² and
                                Stephan Strommer²
                                ¹ Siemens AG Österreich, Vienna, Austria
                                ² AIT Austrian Institute of Technology GmbH, Center for Vision, Automation & Control, Vienna, Austria


                                              Abstract
                                              Effective knowledge representation plays a pivotal role in harnessing the full potential of domain-specific
                                              information. Through tools like Infinity Maps, domain knowledge can be easily captured in a visual
                                              manner. However, translating these visually intuitive representations to formal, machine-processable
                                              formats often necessitates expert knowledge, thereby creating a significant barrier between domain
                                              experts and knowledge engineers. While domain experts possess deep understanding of their respective
                                              domains, they often lack the formalisation skills required to transform this knowledge into machine-
                                              readable formats. Conversely, knowledge engineers can design and implement sophisticated knowledge
                                              graphs, but may not have access to the domain-specific expertise necessary for effective knowledge
                                              representation. To address this challenge, we propose a novel approach that leverages SHACL (Shapes
                                              Constraint Language) rules to transform visual domain knowledge expressed as Infinity Maps into
                                              knowledge graphs. Our method enables domain experts to define their knowledge structures using
                                              familiar Infinity Map representations, which are then transformed into standardised knowledge graphs
                                              compliant with the SHACL standard.

                                              Keywords
                                              Knowledge Graphs, Infinity Maps, RDF, SHACL, Semantic Web




SEMANTICS 2024
Email: bischof.stefan@siemens.com (S. Bischof); erwin.filtz@siemens.com (E. Filtz); josiane.parreira@siemens.com
(J. X. Parreira); simon.steyskal@siemens.com (S. Steyskal); michael.baumgart@ait.ac.at (M. Baumgart);
david.gruber@ait.ac.at (D. Gruber); maximilian.liebetreu@ait.ac.at (M. Liebetreu); florian.roetzer@ait.ac.at
(F. Rötzer); stephan.strommer@ait.ac.at (S. Strommer)
ORCID: 0000-0001-9521-8907 (S. Bischof); 0000-0003-3445-0504 (E. Filtz); 0000-0002-3050-159X (J. X. Parreira);
0000-0002-5183-2486 (S. Steyskal); 0009-0009-6776-4404 (M. Baumgart); 0000-0002-7544-5632 (D. Gruber);
0000-0001-8374-8476 (M. Liebetreu); 0000-0003-2661-5575 (F. Rötzer); 0009-0009-9349-9683 (S. Strommer)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction and Motivation
The setup and optimisation of industrial production processes, such as high-pressure die casting,
rely heavily on the expert knowledge and experience of a few individuals within a company.
Consequently, domain knowledge often remains personal property rather than a shared company
asset, creating a dependency on specific personnel. Despite companies’ quality management
standards, this valuable knowledge is frequently undocumented and undigitised, making it
inaccessible to the broader workforce. This lack of effective knowledge management and transfer
hinders resource-efficient, green production of advanced products, such as complex die-casting
parts, especially in times of skilled labour shortages. It has therefore
become a pressing issue for industries to find a low-threshold, minimal-effort way for experts to
digitise and share their knowledge, making it machine-readable, accessible and usable also for
non-domain experts. A toolchain that achieves this goal should include a highly accessible tool
for experts to document their domain knowledge. Examples of such tools are Conceptboard¹,
Microsoft Whiteboard², and Infinity Maps³.
   In the present paper, we use Infinity Maps due to its rich JSON export capabilities.
Additionally, a technology or database that can store knowledge in a structured form
is essential. In our case, the output of the toolchain is a knowledge graph (KG) in
RDF format. The envisaged toolchain is outlined in Fig. 1.
Contribution. We design a transformation pipeline from Infinity Maps to a structured RDF
graph, based on the SHACL standard⁴. Features include: (i) The authors of an Infinity Map
are free to dump a mixture of structured and unstructured knowledge and data into each Map.
(ii) Authors do not require any technical understanding of the KG that will be produced. (iii) By
employing SHACL, authors of an Infinity Map receive automated assistance in structuring
the domain knowledge for computational processing. (iv) Using SHACL, the generated KG is
guaranteed to conform to the requirements of any ontologies applied to the graph.

2. From Domain Knowledge to Knowledge Graph
Step 1: Visual modelling by schema. The first step in our pipeline is to organise and gain
an overview of all available expert and domain knowledge, which comes in various formats like
PNG, PDF, CSV files, emails, and interviews. Infinity Maps excels in visualising and organising
this unstructured information. It allows the creation and connection of cards into tree-like
hierarchies, a feature we use extensively to efficiently manage and structure our knowledge
base. Once an overview of all available knowledge sources has been attained, Infinity Maps
can also be used to combine and organise what information was learned from these sources. To
this end, we designed a loose schema based on the gathered information and visually modelled
this domain knowledge accordingly.
Remark. The availability of visual tools for structured KG creation is limited. This is unsurprising: KGs are
a comparatively novel concept, and while they have had a lot of success in recent years, most success stories
originate from fields and applications that create such KGs automatically from other forms of structured
databases, with the main challenge being leveraging the information contained within (triple prediction for
recommender systems, etc.) [1, 2, 3]. However, in industrial production domains, the degree of digitisation is
often surprisingly low, with knowledge being available in hand-written form, separate documents, literature,
and the minds of experts and operators drawing from said literature and their own experience. Structuring
this knowledge in any form, but particularly in an explainable way and one that can be easily and efficiently
queried, is a vital step towards leveraging all available knowledge to optimise processes and products.

¹ Conceptboard: https://conceptboard.com/
² Microsoft Whiteboard: https://www.microsoft.com/en-us/microsoft-365/microsoft-whiteboard
³ Infinity Maps: https://infinitymaps.io/
⁴ SHACL: https://www.w3.org/TR/shacl

Figure 1: Transformation of domain knowledge to knowledge graph using visual modelling with Infinity
Maps and SHACL.

Step 2: Visual modelling by ontology. In the later stages of visual modelling, we developed
an ontology for the envisaged KG, thereby clarifying the modelling guidelines. We utilised tags
and tree-like hierarchies within Infinity Maps to denote relationships between entities and label
cards accordingly. These tags and the hierarchical logic subsequently guide the transformation
process from Infinity Maps to the KG.
   A significant drawback of Infinity Maps is the absence of dynamic links between cards.
Although each card has a unique URL and can be referenced via hyperlinks, these links are
static text and can easily break during the Map authoring process. To maintain simplicity for
human readability, we opted to cross-reference tagged cards by ensuring that each combination
of card label and tag remains unique throughout the entire Infinity Maps project.
   Due to this limitation in dynamic linking, we anticipate the appearance of duplicate entities
and erroneous triples when converting from Infinity Maps to a KG. This necessitates additional
constraints and rules for a successful transformation.
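
The uniqueness convention described above lends itself to an automated check before the transformation is run. The following is a minimal sketch, assuming the export has been loaded into a list of node dictionaries carrying the id, title, and tags fields, plus a dictionary that maps tag ids to tag names. The function name, the whitespace trimming, and the sample cards are hypothetical and not part of the described pipeline.

```python
from collections import defaultdict


def find_duplicate_cards(nodes, tag_names):
    """Return {(title, tag name): [card ids]} for combinations used by more than one card."""
    cards_by_key = defaultdict(list)
    for node in nodes:
        for tag_id in node.get("tags", []):
            # Assumption: surrounding whitespace in a label is not significant.
            key = (node["title"].strip(), tag_names[tag_id])
            cards_by_key[key].append(node["id"])
    return {key: ids for key, ids in cards_by_key.items() if len(ids) > 1}


# Hypothetical sample data; real exports carry more fields per node.
tag_names = {"t1": "Quantity", "t2": "KPI"}
nodes = [
    {"id": "n1", "title": "Closing force", "tags": ["t1"]},
    {"id": "n2", "title": "Closing force", "tags": ["t2"]},   # same label, different tag
    {"id": "n3", "title": "Closing force ", "tags": ["t1"]},  # clashes with n1
]

duplicates = find_duplicate_cards(nodes, tag_names)
```

Cards that share a label but carry different tags are kept apart, so only the clash between n1 and n3 would be reported to the Map authors.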
Step 3: Transformation by constraints. Infinity Maps allows exporting each Map to JSON
format. We transform these JSON structures to a KG using Python code and shapes implemented
in SHACL. This procedure is described in much more detail in Section 3.
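
As a rough illustration of the JSON-to-triples part of this step, the sketch below maps exported nodes to tuples that use the im: vocabulary appearing in Listing 1. It is a sketch under stated assumptions, not the authors' code: the namespace IRIs are placeholders, the export is assumed to hold its nodes as a list of objects, and literals and IRIs are both kept as plain strings for brevity.

```python
import json

# Hypothetical namespace IRIs; the paper does not state what ex: and im: expand to.
EX = "http://example.org/map/"
IM = "http://example.org/infinitymaps#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def node_to_triples(node):
    """Map one exported node to (subject, predicate, object) tuples."""
    subject = EX + node["id"]
    triples = [
        (subject, RDF_TYPE, IM + "Node"),
        (subject, IM + "id", node["id"]),        # literal value
        (subject, IM + "title", node["title"]),  # literal value
    ]
    if node.get("parent"):
        triples.append((subject, IM + "parent", EX + node["parent"]))
    for child_id in node.get("children") or []:
        triples.append((subject, IM + "child", EX + child_id))
    for tag_id in node.get("tags") or []:
        triples.append((subject, IM + "tag", EX + tag_id))
    return triples


def export_to_triples(export_json):
    """Parse an export string and collect the triples of all its nodes."""
    data = json.loads(export_json)
    triples = []
    for node in data["nodes"]:
        triples.extend(node_to_triples(node))
    return triples


# Illustrative node; the list layout of "nodes" is an assumption.
sample = json.dumps({
    "nodes": [
        {"id": "Hmj7nQmbND9", "title": "Y", "parent": "MftLLDMDB3f",
         "children": [], "tags": []}
    ]
})
triples = export_to_triples(sample)
```

Keeping this step purely structural leaves all domain-specific interpretation to the SHACL rules applied afterwards.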
Closing the loop. From the KG, new domain knowledge can be gained by experts. Any new
knowledge can be added to the Infinity Maps and the KG itself, thereby closing the loop. An
overview of the entire pipeline is given in Fig. 1.

3. SHACL-based Transformation Framework
The Shapes Constraint Language (SHACL) is a W3C Recommendation designed for validating
RDF graphs against a set of SHACL shapes, i.e., the constraints that the RDF graph to be
validated has to adhere to. Such constraints include, among others, checks for the existence of
particular properties, data types, value ranges, and relationships between nodes⁵. Using SHACL,
one can ensure that the data conforms to the expected structure and semantics, enabling reliable
data integration and interoperability.
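
To make this kind of check concrete, the following pure-Python sketch mimics what a minimum-cardinality constraint (sh:minCount on a property, combined with a class target) would report. It is not a SHACL engine; the tuple representation of triples, the function name, and the sample nodes are assumptions made for illustration only.

```python
def check_min_count(triples, target_class, path, min_count=1):
    """Return instances of target_class with fewer than min_count distinct values for path."""
    focus_nodes = sorted(
        s for (s, p, o) in set(triples) if p == "rdf:type" and o == target_class
    )
    violations = []
    for node in focus_nodes:
        # Value nodes are counted as a set, so repeated triples do not inflate the count.
        values = {o for (s, p, o) in triples if s == node and p == path}
        if len(values) < min_count:
            violations.append(node)
    return violations


# Hypothetical data: ex:b lacks the im:title that every im:Node is expected to have.
triples = [
    ("ex:a", "rdf:type", "im:Node"),
    ("ex:a", "im:title", "Closing force"),
    ("ex:b", "rdf:type", "im:Node"),
]

violations = check_min_count(triples, "im:Node", "im:title")
```

A real validator additionally produces a structured validation report per violation, which is what allows feedback to be passed back to the Map authors.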
   Additionally, we utilise SHACL Rules⁶ to facilitate the transformation of raw data (i.e., the
Infinity Maps JSON exports) into a structured and semantically enriched representation that
aligns with predefined ontologies and the domain understanding provided by the domain
experts.
⁵ SHACL Core Components: https://www.w3.org/TR/shacl/#core-components
⁶ SHACL Rules were introduced as part of the SHACL Advanced Features Note: https://www.w3.org/TR/shacl-af.

Figure 2: Excerpt of an Infinity Maps JSON export. [Figure content: a map object with the fields
mapId, title, nodes, edges, root, and tags. Node objects have the fields id, children, title, color,
parent, and tags; edge objects have id, start, end, and title; tag objects have id, name, and color.
Examples shown include node Hmj7nQmbND9 (title "Y", parent MftLLDMDB3f), edge
9nf9HL7GjJf:Hmj7nQmbND9 (start 9nf9HL7GjJf, end Hmj7nQmbND9, title "property abc"), and tag
t3g4MHFRFHd (name "Function: Process", color @blue).]

3.1 Handling Infinity Maps Data. As shown in Fig. 2, an Infinity Maps JSON export follows a very
basic structure. At its core, each Infinity Map is represented as a JSON object with three main
elements: nodes, edges, and tags, where nodes holds all nodes in the Map, edges all edges between
nodes, and tags all tags used in the Map. As depicted in Listing 1, each node has a unique
identifier, a title, optionally a reference to its parent node, and a list of references to its child
nodes. Each edge has a source and a target node, and a name. Each tag has a unique identifier and
a title⁷.

  ex:FH9d47H93gf a dg:KPI, im:Node ;
      rdfs:label "Schließkraft Err" ;
      im:child ex:LnrDTfHTHqF ;
      im:id "FH9d47H93gf" ;
      im:parent ex:99dqJd2rjLb ;
      im:tag ex:Jj3qLGjGhGp ;
      im:title "Schließkraft Fehler" .

  ex:LnrDTfHTHqF a im:Node ;
      rdfs:label "Abhängigkeiten" ;
      im:child ex:P2jBpjh8RjM, ex:Qbd7rnb ;
      im:id "LnrDTfHTHqF" ;
      im:parent ex:FH9d47H93gf ;
      im:title "Abhängigkeiten" .

  ex:P2jBpjh8RjM a dg:Quantity, im:Node ;
      rdfs:label "Schließkraft Err (Gießen)" ;
      im:id "P2jBpjh8RjM" ;
      im:parent ex:LnrDTfHTHqF ;
      im:tag ex:BLJD8RqhTrd ;
      im:title "Schließkraft Err (Gießen)" .

Listing 1: KPI that has a Quantity as dependency.

⁷ Due to space limitations, we have not included sample triples for tags or edges.

3.2 Enrichment with Domain-Specific SHACL Rules. Based on modelling guidelines specified by the
process/domain experts, we define SHACL rules that capture the unique semantics and requirements
of the domain. For example, one of the guidelines states that relations are represented by chaining
at least two parent-child relationships between a starting entity and one or more target entities,
where a relation contains exactly one node in its path that defines the type of the relation. The
SHACL rule in Listing 2 captures this by searching for paths starting from a set of focus nodes of
type dg:Quantity, traversing through one or more im:child relationships to intermediate nodes ?p,
and finally reaching target entities ?mid. Using the VALUES clause, we define the properties to be
used based on the labels of the intermediate nodes. For instance, for the sample triples in Listing 1,
the following triple would be generated: ex:FH9d47H93gf dg:has_dependency ex:P2jBpjh8RjM .
  enr:QuantityRule a sh:SPARQLRule ;
     sh:construct """
         CONSTRUCT {
             $this ?rel ?mid .
         } WHERE {
             $this im:child ?typ .
             ?typ rdfs:label ?l ; im:child* ?p .
             ?p im:child ?mid .
             ?mid im:tag ?tag ; a ?target .
             VALUES (?l ?rel ?target) {
                 ( "Abhängigkeiten" dg:has_dependency dg:Quantity)
                 ( "Messung" dg:has_measurement dg:Signal)
             }
         }""" ;
     sh:condition [ # evaluate rule only if focus node is a dg:Quantity
         sh:property [
             sh:path rdf:type ;
             sh:hasValue dg:Quantity ;
         ] ;
     ] ; sh:prefixes  .

Listing 2: SHACL Rule for creating relations based on guidelines provided by domain experts.
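
The pattern matched by the rule in Listing 2 can be traced step by step in plain Python. The sketch below is an illustration under stated assumptions rather than a SHACL or SPARQL implementation: triples are tuples of prefixed names, every node is treated as a candidate focus node that is then filtered by the rule's dg:Quantity condition, and the sample data (ex:q1, ex:deps, ex:q2, ex:tagQuantity) is hypothetical, with the starting node typed dg:Quantity so that the condition holds.

```python
# Label of the type-defining node -> (property to create, required class of the target).
RELATIONS = {
    "Abhängigkeiten": ("dg:has_dependency", "dg:Quantity"),
    "Messung": ("dg:has_measurement", "dg:Signal"),
}

# Hypothetical sample graph.
TRIPLES = {
    ("ex:q1", "rdf:type", "dg:Quantity"),
    ("ex:q1", "im:child", "ex:deps"),
    ("ex:deps", "rdfs:label", "Abhängigkeiten"),
    ("ex:deps", "im:child", "ex:q2"),
    ("ex:q2", "rdf:type", "dg:Quantity"),
    ("ex:q2", "im:tag", "ex:tagQuantity"),
}


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def descendants_or_self(triples, node):
    """Nodes matched by the path im:child* from node (zero or more steps)."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(objects(triples, current, "im:child"))
    return seen


def apply_quantity_rule(triples):
    """Return the triples the CONSTRUCT pattern would produce for dg:Quantity focus nodes."""
    inferred = set()
    focus_nodes = {s for (s, p, o) in triples if p == "rdf:type" and o == "dg:Quantity"}
    for this in focus_nodes:
        for typ in objects(triples, this, "im:child"):
            for label in objects(triples, typ, "rdfs:label"):
                if label not in RELATIONS:
                    continue
                rel, target = RELATIONS[label]
                for p in descendants_or_self(triples, typ):
                    for mid in objects(triples, p, "im:child"):
                        is_tagged = bool(objects(triples, mid, "im:tag"))
                        if is_tagged and target in objects(triples, mid, "rdf:type"):
                            inferred.add((this, rel, mid))
    return inferred


inferred = apply_quantity_rule(TRIPLES)
```

On the sample graph, the node labelled "Abhängigkeiten" selects dg:has_dependency, so the only triple produced links ex:q1 to ex:q2.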

Conclusion and Future Work. In this paper, we presented a novel approach that enables
domain experts to model their knowledge using an easy and intuitive visual representation,
which is then exported as JSON, and afterwards transformed into a semantically enriched KG
representation using SHACL. Future work will focus on integration of additional SHACL rules
as well as evaluation of the transformation process on different real-world use cases.

Acknowledgments
This work was conducted within the Austrian research project DG Assist (FFG project number:
FO999899053). This project is funded by the Federal Ministry for Climate Protection, Environment,
Energy, Mobility, Innovation and Technology, BMK, and is carried out as part of the
Production of the Future programme.

References
[1] A. Hogan, et al., Knowledge graphs, ACM Comput. Surv. 54 (2021).
    URL: https://doi.org/10.1145/3447772. doi:10.1145/3447772.
[2] J. Liu, L. Duan, A survey on knowledge graph-based recommender systems, in: 2021 IEEE
    5th Advanced Information Technology, Electronic and Automation Control Conference
    (IAEAC), 2021, pp. 2450–2453. doi:10.1109/IAEAC50856.2021.9390863.
[3] N. Noy, Y. Gao, A. Jain, A. Narayanan, A. Patterson, J. Taylor, Industry-scale knowledge
    graphs: lessons and challenges, Commun. ACM 62 (2019) 36–43.
    URL: https://doi.org/10.1145/3331166. doi:10.1145/3331166.