<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bridging Expert Knowledge and AI: A Semantic Architecture for Manufacturing Knowledge Management using Knowledge Graphs and Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Camilla Hemmer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Annariina Komljenovic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaroslaw Warzecha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>valantic GmbH</institution>
          ,
          <addr-line>Ainmillerstrasse 22, 80801 München</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Manufacturing companies face a critical knowledge management crisis in which decades of operational expertise remain trapped in experts' minds, leading to extended onboarding periods, inefficient decision-making, and knowledge loss through retirement. This paper presents the development of a semantic AI assistant system that combines Knowledge Graphs with Large Language Models to transform how manufacturing engineers access and utilize institutional knowledge. The approach addresses the integration of scattered data across incompatible systems through a three-stage pipeline: multi-format document ingestion, knowledge graph construction, and LLM-enhanced natural language querying. Our implementation demonstrates how semantic technologies can address real-world industrial challenges, reducing information retrieval time and creating pathways for knowledge preservation at enterprise scale.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Manufacturing Knowledge Management</kwd>
        <kwd>Semantic Web Technologies</kwd>
        <kwd>Industrial AI</kwd>
        <kwd>Enterprise Knowledge Systems</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Manufacturing Automation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Modern manufacturing organizations face an unprecedented knowledge management crisis. As
industrial processes become increasingly complex and specialized, critical operational knowledge becomes
concentrated in the minds of expert engineers and technicians[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>The scope of this challenge is exemplified by our case study organization: a manufacturing company
where new engineering hires require over two years of training to become productive contributors,
often because institutional knowledge is fragmented and inaccessible. Critical information
resides across incompatible systems, including Customer Relationship Management (CRM) platforms,
legacy Lotus Notes databases, scattered document repositories, and informal email communications.</p>
      <p>
        This paper presents a novel approach that bridges this semantic gap through the integration of
Knowledge Graphs (KGs) and Large Language Models (LLMs)[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Our architecture transforms
manufacturing documentation into a semantic knowledge base, enabling natural language access to
technical expertise and institutional knowledge. The contribution of this work extends beyond technical
implementation to demonstrate practical deployment challenges and business impact measurement in
real-world manufacturing environments. This industrial focus aligns with recent efforts in the ISWC
community to bridge the gap between academic research and practical enterprise deployment[
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Industrial Automation Knowledge Graph: Beyond Simple Taxonomies</title>
      <p>Our system comprises a three-stage pipeline that transforms heterogeneous documentation into a
unified, accessible knowledge resource. This approach builds upon established manufacturing
knowledge graph methodologies while addressing the specific challenges of multi-format enterprise
document processing.</p>
      <sec id="sec-3-1">
        <title>2.1. Document Ingestion</title>
        <p>The pipeline processes diverse content formats prevalent in manufacturing, including PDFs, HTML,
PowerPoint, and structured data exports. Its output is raw chunked text together with accompanying
images that are linked to the corresponding text:</p>
        <list list-type="bullet">
          <list-item><p>PDFs: Text is extracted while maintaining formatting context.</p></list-item>
          <list-item><p>HTML: Pages are analyzed for structural hierarchies.</p></list-item>
          <list-item><p>PowerPoint: Both text and embedded diagrams critical to specifications are extracted.</p></list-item>
        </list>
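        <p>As a rough illustration of the chunked output described above, the following minimal sketch packs extracted text blocks into size-limited chunks and carries along the images that belong to each block. The names <monospace>Chunk</monospace> and <monospace>chunk_blocks</monospace>, the character limit, and the file names are assumptions made for this example and are not taken from the deployed pipeline.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One unit of ingested text plus the images linked to it."""
    source: str   # hypothetical document identifier
    index: int    # position of the chunk within the document
    text: str
    image_refs: list[str] = field(default_factory=list)


def chunk_blocks(source: str,
                 blocks: list[tuple[str, list[str]]],
                 max_chars: int = 500) -> list[Chunk]:
    """Greedily pack consecutive (text, image_refs) blocks into chunks.

    A block longer than max_chars becomes its own oversized chunk;
    this sketch never splits inside a block.
    """
    chunks: list[Chunk] = []
    texts: list[str] = []
    images: list[str] = []
    size = 0

    def flush() -> None:
        chunks.append(Chunk(source, len(chunks), "\n".join(texts), list(images)))

    for text, refs in blocks:
        extra = len(text) + (1 if texts else 0)  # +1 for the joining newline
        if texts and size + extra > max_chars:
            flush()
            texts, images, size = [], [], 0
            extra = len(text)
        texts.append(text)
        images.extend(refs)
        size += extra
    if texts:
        flush()
    return chunks
```
        </preformat>
        <p>Keeping the image references on the chunk itself is one simple way to preserve the link between a diagram and the text it accompanies; other designs, such as a separate link table, would work equally well.</p>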
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Knowledge Graph Construction</title>
        <p>
          In the first iteration of the pipeline, a schema is inferred from example chunks of the text.
Entities and relationships are then extracted from the ingested documents according to the inferred
schema and loaded into the KG. This
approach leverages recent advances in LLM-powered knowledge graph construction[
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ], demonstrating
enterprise-grade accuracy in domain-specific extraction tasks.
        </p>
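        <p>One way to keep schema-guided extraction consistent is to check every extracted relationship against the inferred schema before loading it into the KG. The sketch below assumes a schema that maps each relationship type to the entity types it connects. The relationship names <monospace>HAS_SPECIFICATION</monospace> and <monospace>DESCRIBED_IN</monospace> and the sample values are hypothetical, and the LLM extraction step itself is not shown.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A candidate relationship produced by the extraction step."""
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str


# Hypothetical fragment of an inferred schema: relation -> (head type, tail type).
SCHEMA = {
    "HAS_SPECIFICATION": ("Equipment", "Specification"),
    "DESCRIBED_IN": ("Equipment", "Document"),
}


def conforms(triple: Triple, schema: dict[str, tuple[str, str]]) -> bool:
    """True if the relation is known and connects the expected entity types."""
    return schema.get(triple.relation) == (triple.head_type, triple.tail_type)


def split_by_schema(triples: list[Triple],
                    schema: dict[str, tuple[str, str]]
                    ) -> tuple[list[Triple], list[Triple]]:
    """Separate triples that match the schema from those needing review."""
    accepted = [t for t in triples if conforms(t, schema)]
    rejected = [t for t in triples if not conforms(t, schema)]
    return accepted, rejected
```
        </preformat>
        <p>Rejected triples do not have to be discarded; they can be a useful signal that the schema needs another refinement iteration.</p>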
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Knowledge Extraction Results</title>
        <p>The schema creation resulted in 11 entity types with 38 unique relationship types. The inferred schema
exhibits a three-tier hierarchy: core entities (Equipment, Document), operational entities
(Specification, Material, Variant, Position), and parameter entities (SpeedRange, LoadRequirement,
VentilationRequirement, ControlMechanism, OperatingCondition). Manufacturing engineers validated both the
extracted schema and entity relationships, confirming that the resulting ontology is a valid approach
to structuring the available specification documents. However, additional analysis remains necessary
to expand the schema and examine the ontological hierarchies for this specific domain.</p>
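        <p>The three-tier hierarchy can be written down as a small lookup structure, which is convenient for validation and reporting. The entity type names below are those listed above; the tier keys and the helper <monospace>tier_of</monospace> are naming assumptions made for this sketch.</p>
        <preformat>
```python
# Entity types grouped by tier, as reported in the text above.
SCHEMA_TIERS = {
    "core": ["Equipment", "Document"],
    "operational": ["Specification", "Material", "Variant", "Position"],
    "parameter": [
        "SpeedRange",
        "LoadRequirement",
        "VentilationRequirement",
        "ControlMechanism",
        "OperatingCondition",
    ],
}

# Flat list of all entity types across the three tiers.
ALL_ENTITY_TYPES = [t for types in SCHEMA_TIERS.values() for t in types]


def tier_of(entity_type: str) -> str:
    """Return the tier an entity type belongs to, or raise KeyError."""
    for tier, types in SCHEMA_TIERS.items():
        if entity_type in types:
            return tier
    raise KeyError(f"unknown entity type: {entity_type}")
```
        </preformat>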
        <p>The knowledge graph construction phase extracted 982 entities from the manufacturing
documentation corpus. The extraction process revealed varying completion rates across different entity attributes
within manufacturing knowledge domains.</p>
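        <p>A completion rate of this kind can be understood as the share of extracted entities for which a given attribute is filled. The following sketch shows one possible way to compute it, assuming entities are available as plain dictionaries. The attribute names and sample records in the usage note are invented for illustration and are not results from the corpus.</p>
        <preformat>
```python
def completion_rates(entities: list[dict], attributes: list[str]) -> dict[str, float]:
    """Fraction of entities with a non-empty value for each attribute.

    Missing keys, None, and empty strings all count as not filled.
    """
    total = len(entities)
    if total == 0:
        return {attr: 0.0 for attr in attributes}
    rates = {}
    for attr in attributes:
        filled = sum(1 for entity in entities if entity.get(attr) not in (None, ""))
        rates[attr] = filled / total
    return rates
```
        </preformat>
        <p>For example, calling <monospace>completion_rates</monospace> on four toy records of which two carry a <monospace>speed_range</monospace> value yields a rate of 0.5 for that attribute.</p>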
      </sec>
      <sec id="sec-3-4">
        <title>2.4. AI Assistant Chatbot with Knowledge Graph Querying</title>
        <p>We provide a frontend with a chat interface that communicates with a Claude-3.5 instance
in the backend. The LLM instance can write Cypher queries to retrieve specific information from the
Knowledge Graph. The generated answer verbalizes this information and references the section of the
document that is attached to the corresponding entry in the KG. Refer to Appendix A for an architectural overview.</p>
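        <p>When an LLM writes Cypher on behalf of a user, a common precaution is to reject anything other than read queries before execution. The sketch below is one simple way to do this with a keyword check; it is an assumption about a sensible safeguard, not a description of the deployed system. The keyword list is deliberately conservative, so a query that merely mentions one of these words inside a string literal would also be rejected.</p>
        <preformat>
```python
import re

# Clauses that modify data or schema, or that call procedures.
# Treating all of them as disallowed is an assumption of this sketch.
_WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|CALL|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)


def is_read_only(cypher: str) -> bool:
    """True if the query text contains none of the disallowed clauses."""
    return _WRITE_CLAUSES.search(cypher) is None


def guard(cypher: str) -> str:
    """Return the query unchanged if it is read-only, else raise ValueError."""
    if not is_read_only(cypher):
        raise ValueError("generated query contains a write or procedure clause")
    return cypher
```
        </preformat>
        <p>A check of this kind complements, but does not replace, database-side protections such as connecting with a read-only user.</p>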
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. LLM-Enhanced Knowledge Retrieval: Making Experts Accessible</title>
      <p>
        The integration of Large Language Models transforms knowledge graph access from a technical query
interface into a natural conversation platform that mimics expert consultation[
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. This
transformation addresses the adoption barriers that limited previous knowledge management approaches.
      </p>
      <p>
        Natural language query translation represents a critical component of the LLM integration[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The
system processes queries expressed in technical terminology and conversational language, identifying
relevant entities and relationships within the knowledge graph. Recent comparative studies
demonstrate that GraphRAG approaches significantly outperform traditional vector-based RAG systems[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
in enterprise applications, with accuracy improvements from 0% to over 90% for complex knowledge
retrieval tasks[
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ].
      </p>
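      <p>To make the entity identification step more concrete, the following minimal sketch matches known KG entity names against a user question using case-insensitive whole-phrase search. This is a deliberately simple stand-in: in the system described here the LLM performs this interpretation, and the function name <monospace>link_entities</monospace> and the sample names in the usage note are assumptions for illustration.</p>
      <preformat>
```python
import re


def link_entities(question: str, known_names: list[str]) -> list[str]:
    """Return known entity names that occur as whole phrases in the question.

    Longer names are checked first, so they appear first in the result.
    Overlapping matches are not suppressed in this sketch.
    """
    lowered = question.lower()
    found = []
    for name in sorted(known_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if re.search(pattern, lowered):
            found.append(name)
    return found
```
      </preformat>
      <p>Given the hypothetical names "Fan Unit", "Fan", and "Motor", the question "Which speed range applies to the fan unit?" matches both "Fan Unit" and "Fan", which shows why a production system needs disambiguation beyond string matching.</p>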
      <p>
        Manufacturing operations are inherently fragmented, highly interdependent, and often involve
latent knowledge, making traditional approaches inadequate. By extracting and structuring this complex
information before embedding it into a Knowledge Management system, we achieved a far more
intelligent and context-aware retrieval process. This approach aligns with recent advances in graph neural
retrieval frameworks that demonstrate state-of-the-art performance on complex reasoning tasks[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>We piloted the software with Sales Engineers in their area of specialization. The underlying document
base consisted of Application Instructions and Technical Specifications. Initial feedback was
enthusiastic, and we expect that scaling the system will lead to higher-quality decision-making, shorter
turnaround times, and higher employee satisfaction.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Lessons Learned: Semantic Technologies in Manufacturing</title>
      <p>Technical challenges center primarily on data quality and schema evolution requirements. The
provided documentation often repeats content using different terminology and frequently contains outdated
information. Schema evolution proves particularly challenging as manufacturing processes and equipment
evolve, requiring knowledge graph structures that can accommodate change without breaking existing
functionality.</p>
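      <p>One common mitigation for repeated content under varying terminology is to map variant terms to a canonical form before they enter the graph. The sketch below illustrates the idea with a hand-maintained synonym table; the table entries and function names are hypothetical, and we do not claim that the deployed pipeline resolves terminology this way.</p>
      <preformat>
```python
from collections import defaultdict

# Hypothetical synonym table: normalized variant -> canonical term.
SYNONYMS = {
    "rpm range": "speed range",
    "rotational speed range": "speed range",
}


def canonicalize(term: str, synonyms: dict[str, str]) -> str:
    """Lowercase, collapse whitespace, then apply the synonym table."""
    key = " ".join(term.lower().split())
    return synonyms.get(key, key)


def group_mentions(mentions: list[str],
                   synonyms: dict[str, str]) -> dict[str, list[str]]:
    """Group raw mentions under their canonical term, keeping input order."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[canonicalize(mention, synonyms)].append(mention)
    return dict(groups)
```
      </preformat>
      <p>Grouping the raw mentions, rather than overwriting them, keeps the original wording available for traceability back to the source document.</p>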
      <p>
        Document format diversity presents ongoing challenges for content ingestion. Manufacturing
organizations typically maintain information in dozens of different formats, many of which require
specialized processing approaches. This challenge suggests the value of standardized documentation
approaches within manufacturing organizations, as advocated by recent Industry 4.0 ontology
standardization efforts[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Organizational challenges focus on user adoption and knowledge contribution patterns. Engineers
demonstrate enthusiasm for knowledge access capabilities but show resistance to knowledge
contribution requirements. This finding echoes challenges reported in other industrial knowledge graph
deployments[
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ]. In future projects, we will focus on enabling users to submit knowledge to, and interact with, the
Knowledge Graph directly from the user interface.
      </p>
      <p>
        Best practices emerged from the deployment experience that can guide similar initiatives.
Iterative schema development allows for continuous refinement based on user feedback and system
performance. This approach aligns with established methodologies for industrial ontology engineering
platforms[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and semantic integration frameworks[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Buchgeher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gabauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Martinez-Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ehrlinger</surname>
          </string-name>
          ,
          <article-title>Knowledge graphs in manufacturing and production: A systematic literature review</article-title>
          ,
          <source>arXiv preprint arXiv:2012.09049</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Unifying large language models and knowledge graphs: A roadmap</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>36</volume>
          (
          <year>2024</year>
          )
          <fpage>3580</fpage>
          -
          <lpage>3599</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Grangel-Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <article-title>A knowledge graph-based approach for the quality management of Bosch</article-title>
          ,
          <source>in: Proceedings of ISWC 2023 Posters, Demos and Industry Tracks</source>
          , volume
          <volume>3828</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Kalaycı</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Grangel González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lösch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>ul Mehdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <article-title>Semantic integration of Bosch manufacturing data using virtual knowledge graphs</article-title>
          , in: International Semantic Web Conference, Springer,
          <year>2020</year>
          , pp.
          <fpage>464</fpage>
          -
          <lpage>481</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mavridis</surname>
          </string-name>
          , et al.,
          <article-title>Large language models for intelligent RDF knowledge graph construction: results from medical ontology mapping</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>8</volume>
          (
          <year>2025</year>
          )
          <fpage>1546179</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Bai</surname>
          </string-name>
          , et al.,
          <article-title>Construction of a knowledge graph for framework material enabled by large language models and its application</article-title>
          ,
          <source>npj Computational Materials</source>
          <volume>11</volume>
          (
          <year>2025</year>
          )
          <fpage>51</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Affolter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stockinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <article-title>A comparative survey of recent natural language interfaces for databases</article-title>
          ,
          <source>VLDB Journal</source>
          <volume>28</volume>
          (
          <year>2019</year>
          )
          <fpage>793</fpage>
          -
          <lpage>819</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          , et al.,
          <article-title>Talk2data: A natural language interface for exploratory visual analysis via question decomposition</article-title>
          ,
          <source>ACM Transactions on Interactive Intelligent Systems</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mahindru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Guven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>A technical question answering system with transfer learning</article-title>
          ,
          <source>in: Proceedings of EMNLP 2020 System Demonstrations</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Han</surname>
          </string-name>
          , et al.,
          <article-title>Retrieval-augmented generation with graphs (GraphRAG)</article-title>
          ,
          <source>arXiv preprint arXiv:2501.00309</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Allemang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jacob</surname>
          </string-name>
          ,
          <article-title>A benchmark to understand the role of knowledge graphs on large language model's accuracy for question answering on enterprise SQL databases</article-title>
          ,
          <source>in: Proceedings of the 7th Joint Workshop on Graph Data Management Experiences &amp; Systems (GRADES) and Network Data Analytics (NDA)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <collab>Microsoft Research</collab>
          ,
          <article-title>GraphRAG: Unlocking LLM discovery on narrative private data</article-title>
          ,
          <source>Technical Report, Microsoft Corporation</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mavromatis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          ,
          <article-title>GNN-RAG: Graph neural retrieval for large language model reasoning</article-title>
          ,
          <source>arXiv preprint arXiv:2405.20139</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Brynildsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jakobsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Abildgaard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Woods</surname>
          </string-name>
          ,
          <article-title>Building an industrial ontology engineering platform</article-title>
          ,
          <source>in: Proceedings of ISWC 2024 Posters, Demos and Industry Tracks</source>
          , volume
          <volume>3828</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Martinez-Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Buchgeher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gabauer</surname>
          </string-name>
          , et al.,
          <article-title>Root cause analysis in industrial domain using knowledge graphs</article-title>
          ,
          <source>in: Procedia Computer Science</source>
          , volume
          <volume>200</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>944</fpage>
          -
          <lpage>953</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Rožanec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zajec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kenda</surname>
          </string-name>
          , et al.,
          <article-title>XAI-KG: Knowledge graph to support explainable AI in manufacturing</article-title>
          , in: CAiSE 2021 Workshops,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Savkovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rincon-Yanez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soylu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <article-title>Semantic cloud system for scaling data science solutions for welding at Bosch</article-title>
          ,
          <source>in: Proceedings of ISWC 2024 Posters, Demos and Industry Tracks</source>
          , volume
          <volume>3828</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>