<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Leveraging large language models for automated knowledge graphs generation in non-destructive testing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ghezal Ahmad Jan Zia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andre Valdestilhas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benjamí Moreno Torres</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sabine Kruschwitz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal Institute For Materials Research and Testing (BAM)</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Technical University of Berlin (TUB)</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents an innovative approach for the automatic generation of Knowledge Graphs (KGs) from heterogeneous scientific articles in the domain of Non-Destructive Testing (NDT) applied to building materials. Our methodology leverages large language models (LLMs) to extract and semantically relate concepts from diverse sources. We developed material-specific agents for concrete, wood, steel, and bricks, each equipped with a curated glossary of terms to ensure domain accuracy. These agents process PDF documents, extracting relevant information on deterioration mechanisms, physical changes, and applicable NDT methods. The extracted data is then normalized, validated, and structured into a Neo4j graph database, forming a comprehensive KG. Our results demonstrate the system's ability to automatically discover and represent intricate relationships between materials, deterioration mechanisms, physical changes, and NDT techniques. The generated KG successfully captures complex interactions, such as the applicability of specific NDT methods to various materials under different deterioration conditions. This work not only highlights the potential of KGs in enhancing knowledge discovery and representation in NDT research but also provides a scalable framework for extending this approach to other scientific domains.</p>
        <p>GitHub: https://github.com/ghezalahmad/LLM_NDT_Knowledge_Graph.git</p>
        <p>SeMatS 2024: The 1st International Workshop on Semantic Materials Science, co-located with the 20th International Conference on Semantic Systems (SEMANTiCS), September 17-19, Amsterdam, The Netherlands. * Corresponding author. Email: ghezal-ahmad.zia@bam.de (G. A. J. Zia); firmao@gmail.com (A. Valdestilhas); benjami.moreno-torres@bam.de (B. M. Torres); sabine.kruschwitz@bam.de (S. Kruschwitz). ORCID: 0000-0002-9082-9423 (G. A. J. Zia); 0000-0002-0079-2533 (A. Valdestilhas); 0000-0002-4422-7130 (B. M. Torres); 0000-0003-1649-6832 (S. Kruschwitz).</p>
      </abstract>
      <kwd-group>
        <kwd>Materials Science and Engineering</kwd>
        <kwd>Large Language Model</kwd>
        <kwd>Linked Open Data</kwd>
        <kwd>Data Interoperability</kwd>
        <kwd>RDF</kwd>
        <kwd>Semantic Web</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>NDT is a set of tools with high applicability in the field of Materials Science and Engineering (MSE), crucial for detecting defects and assessing material integrity without causing further damage. NDT ensures the safety, reliability, and longevity of structures and components across various industries, including aerospace, civil engineering, and manufacturing.</p>
      <p>In the context of building materials, the literature on NDT is extensive, covering a wide range of materials such as concrete, wood, masonry, and metals, and employing diverse testing methods like ultrasonic testing, radiography, and infrared thermography. Although the utility of such methods is beyond doubt [1], there is diversity at different levels that can complicate the selection of a suitable NDT method. This diversity includes variations in material identification, degradation phenomena (whether isolated or concurrent), the symptoms through which such degradation phenomena manifest, and the parameters of the non-destructive techniques (such as resolution, range, and frequency). To address this challenge, we propose leveraging advanced Natural Language Processing (NLP) techniques, specifically LLMs [2] such as OpenAI's GPT-4o. The model is used to automate the extraction and organization of NDT methods and their related physical magnitudes, together with their
applicability to the detection of degradation phenomena for every material. To add more utility, we
extract and refer to these relationships from the scientific literature. By utilizing the LLM, we
aim to create an extensive and easily accessible KG that systematically links NDT tools to specific
deterioration mechanisms and materials. KGs provide a promising solution by structuring information
into a network of entities and relationships, enabling more efficient knowledge discovery and retrieval
[<xref ref-type="bibr" rid="ref3">3</xref>].
      </p>
      <p>The primary objective of this study is to develop a robust methodology for the automatic generation of
KGs from scientific articles on NDT in building materials. This methodology facilitates the exploration
of intricate relationships between NDT techniques and various materials’ degradation phenomena.
The resulting methodology serves as a resource for researchers and engineers, supporting informed
decision-making and easing access to complex scientific information.</p>
      <p>In this paper, we present our approach to creating a KG from heterogeneous NDT literature,
demonstrate the effectiveness of our methodology through three usage examples, and discuss the potential
implications and future directions of this research. Our work highlights the significant benefits of
integrating AI-driven concept extraction and knowledge representation techniques in advancing the
field of NDT and improving the accessibility of critical information for scientific and engineering
applications.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Hypothesis and Research Questions</title>
      <p>We hypothesize that LLMs, specifically OpenAI’s GPT-4o, can accurately and efficiently extract detailed
NDT methods and their related deterioration mechanisms from the scientific literature. Furthermore,
we propose that a KG generated from this extracted information can effectively organize and represent
complex relationships, thereby facilitating enhanced knowledge discovery and application in the field
of NDT.</p>
      <p>To explore these hypotheses, we formulate the following research questions:
1. How effectively can LLMs extract and categorize deterioration mechanisms, physical changes,
and recommended NDT techniques for building materials from scientific literature?
• This question aims to assess the ability of LLMs to process diverse scientific texts and
extract relevant NDT information. By evaluating the extracted data against a manually
curated glossary for each material (concrete, wood, steel, and bricks), we aim to demonstrate
the reliability and comprehensiveness of AI-driven extraction techniques across different
domains within NDT. This question is also addressed in the Results and Discussion section.
2. To what extent can the generated KG facilitate the exploration and understanding of relationships
between NDT methods, deterioration mechanisms, and materials?
• The purpose of this question is to assess how useful the KG is in helping researchers
and engineers explore and understand complex relationships within the NDT domain. By
organizing the content of the scientific literature in a structured manner, the KG is expected to
facilitate quick information retrieval, help identify NDT techniques applicable across
different materials, and reveal insights that may not be easily found through traditional
literature review methods. Additionally, the graph has been validated by an expert in the
NDT domain.
3. How does the performance of material-specific agents compare in terms of information extraction
accuracy and completeness across different building materials (concrete, wood, steel, and bricks)?
• This question aims to explore potential differences in the efficacy of our method when
applied to various materials. By evaluating the performance of each material-specific agent,
we can pinpoint any challenges that are specific to particular domains and evaluate how
universally applicable our approach is. The Results and Discussion section elaborates
further on this question.</p>
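      <p>The comparison against a curated glossary mentioned under the first research question could be sketched as follows. This is only an illustration under stated assumptions: the source does not say which agreement measure was used, so the precision-like and recall-like ratios, the function name glossary_agreement, and the sample wood terms are all hypothetical.</p>
      <preformat>
```python
def glossary_agreement(extracted_terms, glossary_terms):
    """Compare extracted terms with a curated glossary (case-insensitive)."""
    extracted = {term.strip().lower() for term in extracted_terms}
    glossary = {term.strip().lower() for term in glossary_terms}
    matched = extracted.intersection(glossary)
    return {
        # Share of extracted terms that the glossary confirms.
        "precision": len(matched) / len(extracted) if extracted else 0.0,
        # Share of glossary terms that the extraction recovered.
        "recall": len(matched) / len(glossary) if glossary else 0.0,
        # Extracted terms with no glossary counterpart, for expert review.
        "unmatched": sorted(extracted - glossary),
    }


# Illustrative terms only; they are not the glossary used in the study.
wood_glossary = ["fungal decay", "insect attack", "moisture changes", "UV exposure"]
wood_extracted = ["Fungal decay", "UV exposure", "knots"]
scores = glossary_agreement(wood_extracted, wood_glossary)
print(scores)
```
      </preformat>
      <p>In this toy input, "knots" ends up in the unmatched list, which is the kind of entry an expert would then reclassify or remove.</p>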
      <p>By addressing these research questions, we aim to validate the effectiveness of LLMs in automating
NDT data extraction, demonstrate the practical benefits of using KGs for knowledge representation
and discovery in materials science and engineering, and assess the robustness of our approach across
different building materials. This study contributes to the broader goal of leveraging AI and
graph-based technologies to accelerate scientific discovery and improve decision-making in the field of
non-destructive testing.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Literature Review</title>
      <p>
        In the literature, Semantic Web Technologies (SWT) have been successfully used in MSE [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Building on
this success, we focus on extracting KGs from scientific literature,
especially in the field of NDT. KGs offer an advanced approach for organizing and
structuring large volumes of complex data. This literature review covers recent research on the
application of NLP and machine learning techniques for automating the extraction and generation of
KGs from NDT-related documents.
      </p>
      <p>
        Moreno et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] demonstrate how semantic technologies, specifically ontologies, can be effectively applied to improve data
interoperability in non-destructive testing. Using ontologies, the study presents
a methodology that enhances the integration of data from various test methods, focusing in particular on
the analysis of water content and porosity distribution in solids using 1H nuclear magnetic resonance
relaxometry. A key limitation of the research is that additional iterations are needed to
fully exploit the potential benefits of semantic enrichment and knowledge transfer in interdisciplinary
research settings. While the initial implementation of the digital workflow methodology shows promise
in enhancing data management and semantic expressiveness, further refinements
are required to maximize the impact of ontologies in facilitating collaboration and information
exchange among interdisciplinary team members in the field of materials science and non-destructive
testing.
      </p>
      <p>
        Kamsu-Foguem et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] offer an approach using conceptual graphs to provide a formal and
structured framework for improving compliance monitoring and knowledge representation in NDT
for aircraft structures. By employing conceptual graphs, the paper enhances the clarity and precision
of reasoning processes, enabling a more systematic approach to verifying compliance with technical
procedures and equipment specifications in the maintenance of aircraft components. This approach
aligns with the concept of a "Knowledge Graph," which refers to a knowledge base that integrates
information from various sources and represents it in a structured format for efficient retrieval and
analysis. However, a potential limitation of the approach outlined in the paper is the challenge of
maintaining and updating the knowledge base represented by the conceptual graphs. As industry
standards, equipment technologies, and maintenance practices evolve, ensuring the accuracy and
relevance of the knowledge base becomes crucial but may require significant effort and resources to
keep up-to-date with the dynamic nature of the aviation industry. This ongoing maintenance task could
pose a challenge regarding resource allocation and the need for continuous validation and refinement
of the knowledge representation to reflect the latest developments in NDT practices and equipment
requirements.
      </p>
      <p>Hagedorn et al. [8] describe the development of a web-based platform that
implements Information Containers for Linked Document Delivery (ICDDs) for asset management.
The platform integrates existing systems and uses domain-specific ontologies and 3D BIM
models to enable querying across multiple information sources for stakeholder-specific views. However,
the paper does not provide a comprehensive evaluation of the proposed BIM-enabled Asset Management
System. It also mentions the use of non-destructive testing methods as an important aspect of
condition assessment.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Research Method</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset - NDT Methods from Research Papers</title>
        <p>The dataset used for this study [9, 10, 11, 12, 13] comprised a range of NDT-related research papers and
technical documents. The selection criteria for the dataset included:
• Diversity of NDT Methods: Papers covering a wide array of NDT techniques, including but not
limited to, ultrasonic testing, radiography, magnetic particle testing, and eddy current testing.
• Variety of Materials: Documents addressing NDT applications across different materials such as
concrete, steel, wood, and bricks.
• Comprehensive Coverage: Inclusion of recent advancements and historical perspectives in the
field of NDT to ensure a holistic understanding of the domain.</p>
        <p>Using GPT-4o to extract NDT methods from these papers, we aimed to create a robust and comprehensive
knowledge base, facilitating the generation of a detailed knowledge graph for further analysis and
application in materials science and engineering.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. LLM for Extracting Information from Heterogeneous Scientific Papers</title>
        <p>We employed OpenAI’s GPT-4o to automate the extraction of NDT methods, associated deterioration
mechanisms, and physical changes from a diverse set of scientific papers and technical documents. The
extraction process involved several key steps to ensure the accuracy and comprehensiveness of the
information gathered:</p>
        <p>Firstly, a corpus of scientific literature and technical documents on NDT was compiled. This corpus
included peer-reviewed journal articles, conference papers, technical reports, and industry standards
documents covering various NDT methods, materials, and deterioration mechanisms. The aim was
to ensure a broad and inclusive dataset that could provide a holistic view of the field. The collected
documents were then preprocessed to convert them into plain text format, ensuring they were suitable
for analysis. This involved extracting text from PDFs and other document formats, and organizing
the data into a consistent structure. This preprocessing step was crucial for preparing the data for the
language model’s analysis.</p>
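        <p>The clean-up part of this preprocessing step might look like the sketch below. It assumes the raw text has already been pulled out of a PDF by some extraction library, which the source does not name; the helper clean_extracted_text and the sample string are hypothetical.</p>
        <preformat>
```python
import re


def clean_extracted_text(raw: str) -> str:
    """Normalize text that was already extracted from a PDF (hypothetical helper)."""
    # Rejoin words split by a hyphen at a line break ("deterio-\nration").
    # Note: genuine hyphens that happen to fall at a line end are lost as well.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Split on blank lines so paragraph boundaries survive.
    paragraphs = re.split(r"\n\s*\n", text)
    # Collapse every run of whitespace inside a paragraph to one space.
    cleaned = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(paragraph for paragraph in cleaned if paragraph)


sample = "Ultrasonic testing detects deterio-\nration in   concrete.\n\nRadiography is\nalso used."
cleaned_sample = clean_extracted_text(sample)
print(cleaned_sample)
```
        </preformat>
        <p>Keeping paragraph breaks while removing line-level breaks gives the language model coherent sentences without losing the document's coarse structure.</p>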
        <p>Specific prompts were designed for the GPT-4o model to extract relevant information. These prompts
were crafted to instruct the model to identify and extract details related to the materials affected by the
deterioration mechanisms (e.g., concrete, steel, wood, bricks), the specific deterioration mechanisms
detected (e.g., corrosion, spalling), the physical changes caused by these mechanisms (e.g., thinning,
discoloration, structural changes), and the types of NDT methods used (e.g., ultrasonic testing, radiography,
visual inspection).</p>
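        <p>A prompt of the kind described above could be assembled as in the following sketch. The exact wording used in the study is not given in the source, so the prompt text, the function name build_extraction_prompt, and the sample glossary are assumptions made for illustration.</p>
        <preformat>
```python
def build_extraction_prompt(material: str, glossary: list[str], document_text: str) -> str:
    """Assemble a material-specific extraction prompt (illustrative wording only)."""
    terms = ", ".join(sorted(glossary))
    return (
        f"You are an assistant specialised in non-destructive testing of {material}.\n"
        f"Preferred glossary terms: {terms}.\n"
        "From the text below, list every finding on its own line as:\n"
        "Material; Deterioration Mechanism; Physical Change; NDT Method\n"
        "Only report what the text states.\n\n"
        f"TEXT:\n{document_text}"
    )


prompt = build_extraction_prompt(
    material="wood",
    glossary=["moisture changes", "fungal decay"],  # hypothetical glossary excerpt
    document_text="Fungal decay reduces density and can be located with infrared thermography.",
)
print(prompt)
```
        </preformat>
        <p>Asking for one semicolon-separated record per line keeps the model output easy to parse in the subsequent structuring step.</p>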
        <p>The preprocessed text from each document was input into the GPT-4o model using the designed
prompts. The model was executed to extract the necessary information, which was then compiled into
structured outputs. This step leveraged the model’s advanced natural language processing capabilities
to parse complex scientific texts and extract pertinent details efficiently.</p>
        <p>Finally, the extracted data was organized into a standardized format, categorizing the materials,
deterioration mechanisms, physical changes, and recommended NDT methods. This structured format
facilitated the subsequent integration of the information into the knowledge graph, enabling a coherent
and comprehensive representation of the extracted knowledge.</p>
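        <p>One way to turn the model output into the standardized format is sketched below. It assumes the model returns one semicolon-separated record per line in the order Material; Deterioration Mechanism; Physical Change; NDT Method; the Finding class and parse_findings function are hypothetical names, not part of the published code.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    material: str
    mechanism: str
    physical_change: str
    ndt_method: str


def parse_findings(llm_output: str) -> list[Finding]:
    """Parse 'Material; Mechanism; Physical Change; NDT Method' lines."""
    findings = []
    for line in llm_output.splitlines():
        parts = [part.strip() for part in line.split(";")]
        # Keep only lines with exactly four non-empty fields.
        if len(parts) == 4 and all(parts):
            findings.append(Finding(*parts))
    return findings


raw_output = (
    "Concrete; Corrosion; Cracking; Ultrasonic Testing\n"
    "Some free-text remark from the model\n"
    "Wood; Fungal decay; ; Infrared Thermography"
)
parsed = parse_findings(raw_output)
print(parsed)
```
        </preformat>
        <p>Here only the first line is kept; the remark and the record with an empty field are skipped rather than guessed at, leaving them for manual review.</p>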
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Knowledge Graph Construction</title>
        <p>The structured information extracted by GPT-4o was used to construct a Knowledge Graph (KG) using
the Neo4j graph database. The construction process involved the following steps:
1. Node Creation: Nodes were created for each material (concrete, steel, wood, bricks), each
deterioration mechanism, physical change, and NDT method identified.
2. Relationship Establishment: Relationships were established between the nodes to represent
the extracted information. For instance, a material node (e.g., Concrete) was linked to a
deterioration mechanism node (e.g., Corrosion) through a "HAS_DETERIORATION_MECHANISM"
relationship. The deterioration mechanism node was linked to a physical change node (e.g.,
Cracking) through a "CAUSES_PHYSICAL_CHANGE" relationship, and the physical change
node was linked to an NDT method node (e.g., Ultrasonic Testing) through a "DETECTED_BY"
relationship.
3. Validation and Normalization: The relationships and nodes were validated by an MSE domain
specialist to ensure consistency and accuracy. The data was normalized to eliminate redundancies
and ensure a coherent structure within the KG.
4. Query Implementation: The Neo4j database was queried to visualize and analyze the
relationships within the KG. Although the visualization alone was sufficient for the MSE domain specialist,
Cypher queries were also used to retrieve specific information and explore complex
interactions between materials, deterioration mechanisms, physical changes, and NDT methods.</p>
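        <p>The node and relationship creation steps above can be illustrated with a small sketch that builds one parameterised Cypher statement per extracted record. The three relationship types are taken from the text; the node labels (Material, DeteriorationMechanism, PhysicalChange, NDTMethod), the name property, and the function cypher_for_finding are assumptions.</p>
        <preformat>
```python
def cypher_for_finding(material: str, mechanism: str, change: str, method: str):
    """Return a parameterised Cypher statement and its parameters for one record."""
    # MERGE creates a node or relationship only if it does not exist yet,
    # so repeated records do not produce duplicates.
    query = (
        "MERGE (m:Material {name: $material}) "
        "MERGE (d:DeteriorationMechanism {name: $mechanism}) "
        "MERGE (p:PhysicalChange {name: $change}) "
        "MERGE (n:NDTMethod {name: $method}) "
        "MERGE (m)-[:HAS_DETERIORATION_MECHANISM]->(d) "
        "MERGE (d)-[:CAUSES_PHYSICAL_CHANGE]->(p) "
        "MERGE (p)-[:DETECTED_BY]->(n)"
    )
    params = {
        "material": material,
        "mechanism": mechanism,
        "change": change,
        "method": method,
    }
    return query, params


query, params = cypher_for_finding("Concrete", "Corrosion", "Cracking", "Ultrasonic Testing")
print(query)
print(params)
```
        </preformat>
        <p>With the official Neo4j Python driver, the returned pair could be passed to a session's run method; passing values as parameters rather than formatting them into the query string avoids quoting problems in term names.</p>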
        <p>This systematic approach enabled the automatic generation of a comprehensive KG that captures
intricate relationships in the NDT domain, facilitating enhanced knowledge discovery and representation.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Program</title>
      <p>The experimental program aimed to validate the effectiveness of using an LLM for extracting and
organizing NDT methods and their related deterioration mechanisms into a comprehensive
knowledge graph. The program was divided into five key phases, each focused on different aspects of the
methodology and its implementation.</p>
      <p>1. Data Collection and Preparation
• Document Compilation: A collection of scientific papers, technical reports, and industry
standards related to NDT methods was compiled. This dataset included documents from
various sources to ensure comprehensive coverage of NDT techniques and materials.
• Preprocessing: The documents were converted to plain text format. This involved handling
different file types (PDFs, Word documents, RTFs) and ensuring the text was clean and free
from irrelevant formatting.
2. Model Configuration and Prompt Design
• LLM Configuration: OpenAI’s GPT-4o was selected for its advanced natural language
processing capabilities. The model was configured to handle the large volume of text and
the specific needs of NDT information extraction.
• Prompt Development: Custom prompts were developed to guide the LLM in identifying
and extracting key information related to NDT methods, deterioration mechanisms, and
affected materials. These prompts were designed to be clear and specific to ensure the
model’s responses were relevant and accurate.
3. Information Extraction
• Execution of LLM: The preprocessed text from the collected documents was input into
the LLM using the developed prompts. The model processed each document to extract
information about NDT tools, their corresponding deterioration mechanisms, and the
materials they are used on.
• Data Structuring: The extracted information was organized into a standardized format,
detailing the NDT method, associated deterioration mechanisms, and the materials affected.</p>
      <p>This structured data was crucial for the next phase of knowledge graph construction.
4. Knowledge Graph Construction
• Neo4j Implementation: Neo4j, a graph database management system, was used to
construct the knowledge graph. The structured data from the LLM was input into Neo4j,
creating nodes for NDT tools, deterioration mechanisms, and materials, and establishing
relationships between them.
• Graph Schema Design: A schema was designed to represent the entities and
relationships within the knowledge graph. Nodes represented NDT tools, materials, and
deterioration mechanisms, while edges represented the relationships between these entities,
such as "HAS_DETERIORATION_MECHANISM," "CAUSES_PHYSICAL_CHANGE," and
"DETECTED_BY."
5. Validation
• Expert Review: The constructed knowledge graph was reviewed by domain experts to
ensure the accuracy and relevance of the extracted information. Feedback from these experts
was used to refine the extraction process and improve the quality of the knowledge graph.</p>
      <p>The experimental program successfully demonstrated the feasibility and effectiveness of using a large
language model to automate the extraction and organization of NDT information into a comprehensive
knowledge graph. This approach not only streamlined the data collection process but also improved
the accessibility and usability of NDT knowledge for researchers and engineers.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Results and Discussion</title>
      <p>The results of this study demonstrate the effectiveness of using an LLM to extract and organize
information on NDT methods and their associated deterioration mechanisms into a comprehensive
KG. The constructed KG includes nodes representing four primary materials: concrete, steel, wood,
and bricks. Each material node is linked to various deterioration mechanisms, physical changes, and
corresponding NDT methods. This structured representation enables the exploration and analysis
of how different NDT techniques are applied to detect specific types of deterioration across various
materials.</p>
      <p>The majority of the extracted entries were correctly classified and structured, following the format
of Material; Deterioration Mechanism; Physical Change; and NDT Method. However, several issues
were identified during the review process, necessitating adjustments for consistency and accuracy.
For instance, entries initially listed "moisture content" as a deterioration mechanism. However, it
is more accurately described as a property or condition that can lead to deterioration. Therefore,
it was reclassified as "moisture changes" or "high moisture content" to better reflect its role in the
deterioration process. Similarly, entries listed "rot" as a deterioration mechanism, which, while correct,
was inconsistent with the use of "fungal decay" in other entries. To maintain consistency, all references
to rot were updated to "fungal decay."</p>
      <p>Another notable adjustment was required for entries in which "knots" were listed as a deterioration
mechanism. Since knots are natural features of wood and not a deterioration mechanism, these entries
were reclassified or removed. In one entry, "UV exposure" was correctly identified as a deterioration
mechanism, but the associated physical change was listed as "chemical constitution." This was refined
to more specific changes such as "color changes" or "surface degradation" to describe clearer, observable
effects.</p>
      <p>Consistency in terminology and classification is crucial for the utility of the knowledge graph.
Variations in naming, such as "acoustic emission monitoring" versus "acoustic emissions," were standardized
to ensure clarity and consistency. Similarly, vague terms like "laser-based technique" were specified
where possible. Entries introducing specific techniques, such as "ultrasonic critical refracted longitudinal
waves" or "guided ultrasonic wave procedure," were reviewed for consistency with other entries. While
specificity is valuable, it was balanced with the need for generalizability across the dataset.</p>
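      <p>The terminology adjustments described in this section could be captured in a simple lookup, as sketched below. The mapping of "rot" to "fungal decay" and the treatment of knots as a natural feature follow the text; the choice of "high moisture content" and "acoustic emission monitoring" as canonical forms is an assumption, since the source does not state which variant was kept, and all names in the code are hypothetical.</p>
      <preformat>
```python
# Canonical forms for variant terms (partly assumed, see the lead-in).
CANONICAL_TERMS = {
    "rot": "fungal decay",
    "moisture content": "high moisture content",
    "acoustic emissions": "acoustic emission monitoring",
}

# Natural features that must not be stored as deterioration mechanisms.
NATURAL_FEATURES = {"knots"}


def normalize_term(term: str) -> str:
    """Lower-case, collapse whitespace, and map known variants to one form."""
    key = " ".join(term.lower().split())
    return CANONICAL_TERMS.get(key, key)


def is_deterioration_mechanism(term: str) -> bool:
    """Reject entries that are natural features rather than mechanisms."""
    return normalize_term(term) not in NATURAL_FEATURES


print(normalize_term("  Rot "))
print(is_deterioration_mechanism("Knots"))
```
      </preformat>
      <p>A lookup like this only covers variants that a reviewer has already identified, so it complements the expert review rather than replacing it.</p>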
      <p>The knowledge graph was validated through expert review, ensuring the accuracy and relevance of
the extracted information. Feedback from domain experts was instrumental in refining the classification
and standardization of entries, contributing to the overall quality of the knowledge graph.</p>
      <p>The knowledge graph was utilized to answer specific research questions and provide insights into
the application of NDT methods across different materials. Custom queries allowed for the exploration
of relationships within the graph, facilitating the identification of cross-material NDT techniques and
the discovery of novel insights. While the majority of the entries were accurately classified, the study
highlighted several challenges, including ensuring consistent terminology across diverse documents,
balancing the need for specific information with the generalizability of the knowledge graph, and
distinguishing between natural features and actual deterioration mechanisms, especially in materials
like wood.</p>
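      <p>The identification of cross-material NDT techniques can be illustrated without a database, using plain records. In the study this was done with custom queries on the Neo4j graph; the in-memory function below is a stand-in for such a query, and the sample rows are illustrative rather than actual results.</p>
      <preformat>
```python
from collections import defaultdict


def cross_material_methods(findings):
    """Map each NDT method to its materials, keeping methods used on several."""
    materials_by_method = defaultdict(set)
    for material, _mechanism, _change, method in findings:
        materials_by_method[method].add(material)
    return {
        method: sorted(materials)
        for method, materials in materials_by_method.items()
        if len(materials) > 1
    }


# Illustrative rows: (material, mechanism, physical change, NDT method).
rows = [
    ("Concrete", "Corrosion", "Cracking", "Ultrasonic Testing"),
    ("Bricks", "Weathering", "Spalling", "Ultrasonic Testing"),
    ("Bricks", "Weathering", "Cracking", "Visual Inspection"),
    ("Steel", "Corrosion", "Cracking", "Magnetic Particle Testing"),
]
shared = cross_material_methods(rows)
print(shared)
```
      </preformat>
      <p>For these rows, only Ultrasonic Testing is reported, since it is the one method linked to more than one material.</p>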
      <p>In conclusion, the experimental program successfully demonstrated the feasibility and effectiveness
of using an LLM to automate the extraction and organization of NDT information into a comprehensive
KG. This approach not only streamlined the data collection process but also enhanced the accessibility
and usability of NDT knowledge for researchers and engineers. The results highlight the potential of
knowledge graphs in advancing the field of NDT, providing a scalable framework for future research
and application.</p>
      <sec id="sec-6-1">
        <title>6.1. Challenges and Future Work</title>
        <p>While the majority of the entries were accurately classified, the study highlighted several challenges:
• Consistency in Terminology: Ensuring consistent terminology across diverse documents
remains a challenge.
• Specificity vs. Generalizability: Balancing the need for specific information with the
generalizability of the knowledge graph requires careful consideration.
• Handling Natural Features: Distinguishing between natural features and actual deterioration
mechanisms, especially in materials like wood, is essential for accurate classification.</p>
        <p>Future work will focus on refining the extraction process, improving the consistency and specificity
of entries, and extending the approach to other scientific domains.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Case Study Examples</title>
        <p>The constructed knowledge graph provides practical insights for different materials, demonstrating its
utility in various domains:</p>
        <p>Concrete: The knowledge graph identified that Ultrasonic Testing (UT) and Ground Penetrating
Radar (GPR) are highly effective for detecting internal flaws and assessing structural integrity in
concrete. This insight supports the use of these methods in civil engineering projects to ensure safety
and durability.</p>
        <p>Steel: For steel, the knowledge graph highlighted the effectiveness of Magnetic Particle Testing (MT)
and Eddy Current Testing (ET) in detecting surface and subsurface cracks. This information is crucial
for industries such as aerospace and automotive manufacturing, where material integrity is paramount.</p>
        <p>Wood: The graph showed that Infrared Thermography (IRT) and Electrical Resistivity Testing (ERT)
are valuable for detecting moisture content and decay in wood. This can guide the preservation and
maintenance of wooden structures and cultural heritage artifacts.</p>
        <p>Bricks: The knowledge graph demonstrated that Visual Inspection and Ultrasonic Testing (UT) are
effective for detecting weathering effects such as cracking and spalling in bricks. These methods are
vital for ensuring the structural integrity and longevity of brick constructions in various environmental
conditions.</p>
        <p>Overall, the results demonstrate the potential of using LLMs to enhance the extraction and
organization of NDT knowledge, providing a valuable resource for researchers, engineers, and practitioners in
the field. The automated approach not only improves efficiency but also uncovers new insights and
relationships, driving innovation and informed decision-making in materials science and engineering.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>The experimental program successfully demonstrated the feasibility and effectiveness of using an LLM
to automate the extraction and organization of NDT information into a comprehensive knowledge
graph. This approach not only streamlined the data collection process but also enhanced the accessibility
and usability of NDT knowledge for researchers and engineers. The results highlight the potential of
knowledge graphs in advancing the field of NDT, providing a scalable framework for future research
and applications.</p>
      <p>This study opens new avenues for further research and development. Future work can focus on
expanding the corpus of scientific articles to include more diverse sources and materials. Additionally,
enhancing the accuracy and depth of entity and relationship extraction through advanced
machine-learning techniques will further improve the utility of the knowledge graph.</p>
      <p>Integrating the knowledge graph with other scientific databases and ontologies can enrich its content
and broaden its applicability. Furthermore, developing user-friendly interfaces and tools for interacting
with the knowledge graph will enhance its accessibility and usability for domain experts and researchers.</p>
      <p>In conclusion, the application of LLMs to automate the extraction and organization of NDT knowledge
represents a significant advancement in materials science and engineering. This approach not only
streamlines the compilation of extensive technical data but also fosters innovation and informed
decision-making, ultimately contributing to the advancement of NDT practices and the enhancement
of material safety and integrity.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Acknowledgments</title>
      <p>We would like to express our sincere gratitude to Reincarnate for funding this project. Their support was
crucial in enabling the research and development of the automated knowledge graph for non-destructive
testing. We also extend our thanks to our colleagues at the Bundesanstalt für Materialforschung und
-prüfung (BAM) and the Technical University of Berlin (TUB) for their invaluable contributions and
support throughout this study.</p>
      <p>[8] P. Hagedorn, L. Liu, M. König, R. Hajdin, T. Blumenfeld, M. Stöckner, M. Billmaier, K. Grossauer,
K. Gavin, BIM-enabled infrastructure asset management using information containers and semantic
web, Journal of Computing in Civil Engineering 37 (2023) 04022041.</p>
      <p>[9] J. Helal, M. Sofi, P. Mendis, Non-destructive testing of concrete: A review of methods, Electronic
Journal of Structural Engineering 14 (2015) 97–105.</p>
      <p>[10] B. des Deutschen Zentrums für Schienenverkehrsforschung, Prüfverfahren
Zustandserfassung Bau. Modul „Zerstörungsfreie Prüfverfahren“, Technical Report, Deutsches
Zentrum für Schienenverkehrsforschung beim Eisenbahn-Bundesamt (Hrsg.), 2021.
doi:10.48755/dzsf.210009.01.</p>
      <p>[11] M. Gupta, M. A. Khan, R. Butola, R. M. Singari, Advances in applications of non-destructive testing
(NDT): A review, Advances in Materials and Processing Technologies 8 (2022) 2286–2307.</p>
      <p>[12] P. Niemz, D. Mannes, Non-destructive testing of wood and wood-based materials, Journal of
Cultural Heritage 13 (2012) S26–S34.</p>
      <p>[13] M. P. Schuller, Nondestructive testing and damage assessment of masonry structures, Progress in
Structural Engineering and Materials 5 (2003) 239–251.</p>
      <p>[14] L. Zhang, W. Huang, Cross-material applications of non-destructive testing methods, Materials
Science and Engineering: R 132 (2018) 1–20.</p>
      <p>[15] Y. Wang, J. Li, Applications of NDT techniques in evaluating material properties, Journal of Materials
Research 31 (2016) 1200–1210.</p>
      <p>[16] S. Fortunato, et al., Science of science, Science 359 (2018) eaao0185.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Doe</surname>
          </string-name>
          ,
          <article-title>Non-destructive testing: Principles and applications</article-title>
          ,
          <source>Journal of Materials Science</source>
          <volume>35</volume>
          (
          <year>2000</year>
          )
          <fpage>123</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Minaee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nikzad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chenaghlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Amatriain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Large language models: A survey</article-title>
          ,
          <source>arXiv preprint arXiv:2402.06196</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Singhal</surname>
          </string-name>
          ,
          <article-title>Introducing the knowledge graph: Things, not strings</article-title>
          ,
          <source>Google Official Blog</source>
          (
          <year>2012</year>
          ). https://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Valdestilhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bayerlein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Moreno Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A. J.</given-names>
            <surname>Zia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Muth</surname>
          </string-name>
          ,
          <article-title>The intersection between semantic web and materials science</article-title>
          ,
          <source>Advanced Intelligent Systems</source>
          <volume>5</volume>
          (
          <year>2023</year>
          )
          <elocation-id>2300051</elocation-id>
          . URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/aisy.202300051. doi:10.1002/aisy.202300051.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Valdestilhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hanke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Javamasoudian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A. J.</given-names>
            <surname>Zia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fellenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Muth</surname>
          </string-name>
          ,
          <article-title>NaturalMSEQueries - A natural way to query Material Sciences Engineering data experiments</article-title>
          ,
          <source>in: 22nd International Conference on WWW/Internet - ICWI</source>
          <year>2023</year>
          , volume
          <volume>22</volume>
          ,
          <year>2023</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>132</lpage>
          . URL: http://dx.doi.org/10.13140/RG.2.2.35533.41444/2. doi:10.13140/RG.2.2.35533.41444/2.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Moreno Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Völker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Nagel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hanke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <article-title>An ontology-based approach to enable data-driven research in the field of NDT in civil engineering</article-title>
          ,
          <source>Remote Sensing</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>2426</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kamsu-Foguem</surname>
          </string-name>
          ,
          <article-title>Knowledge-based support in non-destructive testing for health monitoring of aircraft structures</article-title>
          ,
          <source>Advanced Engineering Informatics</source>
          <volume>26</volume>
          (
          <year>2012</year>
          )
          <fpage>859</fpage>
          -
          <lpage>869</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>