Integrating LLMs with Knowledge Graphs-enhanced Task-Oriented Dialogue Systems

Vasile-Ionut-Remus Iga1,*
1 Babes-Bolyai University, Business Informatics Research Center, str. Th. Mihali, nr. 58-60, Cluj-Napoca, Romania

Abstract
Large Language Models (LLMs) have become the state-of-the-art natural language processing systems. Their emergent abilities paved the way for dialogue systems capable of understanding and solving users' specific tasks, ranging from arithmetic problems to simple chatting, all expressed in natural language. However, research has shown that, for specific domains, LLMs cannot directly substitute Task-Oriented Dialogue (TOD) systems. TOD systems aim to master a specific domain or company, enabling communication in natural language. Thus, this research project focuses on building personalized TOD systems with the help of artificial intelligence, using LLMs grounded with Temporal Knowledge Graphs (KGs). We assess the temporal validity of facts in the KG through temporal timestamps. To capture the dynamics of a company or domain, business processes are modeled with BPMN, offering the possibility of converting them to KGs. Finally, the TOD system will be able to grow a domain-specific KG and reason over it, leveraging LLMs' capabilities of solving KG-related tasks.

Keywords
Large Language Model, Knowledge Graph, Task-Oriented Dialogue System, Business Process Modeling.

1. Introduction
Nowadays, dialogue systems aim to possess the ability to converse freely in natural language. Research has implemented them in two main ways: task-oriented dialogue systems and chatbots. A task-oriented dialogue (TOD) system aims to assist the user in fulfilling certain tasks in a specific domain, such as restaurant booking, weather queries, or flight booking, which makes it valuable for real-world business [1]. Chatbots are mainly for entertainment, mimicking chit-chat and making dialogues more natural.
TOD systems can be built in various ways, ranging from classical rule-based approaches to neural-network-based ones. The latter involve either a processing pipeline connecting four dialogue processing tasks, namely natural language understanding, dialogue state tracking, dialogue policy, and natural language generation, or a single end-to-end model that solves all the above-mentioned tasks at once, which is the state-of-the-art (SOTA) approach [1]. One use case for TOD systems is solving diverse tasks of a specific company, by leveraging internal features while mapping its knowledge.

CAiSE 2024 Doctoral Consortium
* Corresponding author. vasile.iga@ubbcluj.ro (V.I.R. Iga), ORCID 0009-0001-4568-929X
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073.

In our prior research [2], we created a TOD system leveraging an ontology and a static Knowledge Graph (KG) to contextualize conversations and store pertinent information. This advancement offers significant benefits, including the ability to have multiple discussion threads within the same conversation and the KG serving as a proxy for data validation. The system ultimately aids the company in constructing and managing its specific knowledge base, facilitating Create-Retrieve-Update-Delete (CRUD) operations. The maintenance of the KG is done through our TOD system by solving the Knowledge Graph Completion (KGC) and Knowledge Graph Reasoning (KGR) tasks to address CRUD operations. However, our system relied on text template-matching rules, limiting the natural flow of dialogues and the adaptation to new concepts beyond the provided ontology. Consequently, in a subsequent study [3], we employed neural networks to discern user intent and identify relevant entities from the input text.
While promising, this approach did not completely overcome the previously mentioned drawbacks. Another constraint is the use of static KGs, which, unlike temporal KGs, cannot capture the time-related validity of facts. Regardless of their architecture, such systems are usually limited by their coverage of the domain's knowledge. For example, a neural-network-based TOD system is bounded by its training data, leaving a gap for newer use cases. Moreover, the quality and size of the datasets used heavily impact its performance. Therefore, a solution is the synergy between two important and appealing technologies, Large Language Models (LLMs) and Temporal Knowledge Graphs (TKGs), assisted by the versatility of Business Process Model and Notation (BPMN).

LLMs took the world by storm with their emergent capabilities of solving a wide range of tasks, from mathematical to text-processing ones, formulated in natural language (i.e., prompts). At their core, they are complex neural-network-based architectures with billions of parameters, trained on terabytes of data using computationally expensive supercomputers, that model the generative likelihood of word sequences in order to predict the probabilities of the next tokens [4]. ChatGPT† is the best-known one, with others being Llama‡, Grok§, etc. LLMs cannot be used instead of TOD systems on their own [5], due to their limited domain-specific knowledge and their tendency to hallucinate facts [6]. Therefore, aligning them with KGs may alleviate these shortcomings. Knowledge Graphs (KGs) can be defined as graphs of data intended to accumulate and convey knowledge of the real world, whose nodes represent entities of interest and whose edges represent potentially different relations between these entities [7]. Their capacity to store the time-related context of facts divides them into Static and Temporal KGs.
Meanwhile, KGs are difficult to construct and evolving by nature [8]; an LLM-enhanced TOD system can reduce these drawbacks by constructing them from natural language text and leveraging pre-trained knowledge. Building upon our previous work, while leveraging emergent technologies synergized with well-established ones, we formulate the following research objective: Does the integration of LLMs, Temporal KGs, and auxiliary tools such as BPMN mitigate the shortcomings of limited-knowledge and robotic-like TOD systems? To answer it, we propose a system that will be able to (but is not limited to) (i) grow a specific company's or domain's knowledge graph, and (ii) execute certain related tasks in a traceable manner, both while engaging in natural language dialogues. To the best of our knowledge, we are among the first to study the aforementioned research objective using the specified technologies.

† https://openai.com/blog/chatgpt
‡ https://llama.meta.com/
§ https://x.ai/model-card/

The rest of this research article unfolds as follows: Section 2 presents the related work, Section 3 provides an in-depth analysis of the previous work we start from and further details the desired system's architecture, while Section 4 concludes the proposal.

2. Related Work
The possibility of grounding TOD systems with knowledge graphs is a well-studied domain. By mapping the dialogue history into graphs to capture its semantics, Yang et al. [9] further exploit the connections between entities in the KG and the mapped graph to enhance reasoning. Tuan et al. [10] propose an end-to-end, model-agnostic method that performs symbolic reasoning on KGs of any scale and can be incorporated into dialogue systems to enhance response generation. A somewhat similar idea is presented in [11], where a smaller, task-relevant KG is extracted and fed to a BERT model to guide the answers with elements from the KG, in an end-to-end manner. Andreas et al.
[12] map each user utterance into a local graph of actions that is further evaluated and executed, similarly to a program, to decide on the next action to take. Our previous work [2] builds on the aforementioned research. However, none of these approaches take the use of LLMs into consideration.

Both LLMs and TOD systems share the same underlying trait: they are conversational agents. Current research aims at enhancing a TOD system with an LLM, rather than substituting it. For example, Hu et al. [5] use ChatGPT to offer user-simulated satisfaction feedback on the TOD output in order to further optimize it. Chiu et al. [13] transform the user's input text into executable code using an LLM. Next, the code is executed, each result updates the dialogue state, and an action is selected based on it. Other works shift to using LLMs for TOD-related tasks, such as Dialogue State Tracking (DST) or Natural Language Generation (NLG). For example, Gao et al. [14] design an adaptive prompt generation framework to create DST- or NLG-specific prompts for a black-box LLM. Somewhat similarly to our proposed research, Braunschweiler et al. [15] and Shen et al. [16] ground LLM responses using external knowledge, leading to a more specialized, factually grounded TOD system. However, none of them use KGs to store the additional information.

Although the interconnection between LLMs and TOD systems has not been studied extensively through the lens of KGs, large language models can solve KG-related tasks, such as KGC or KGR. Pan et al. [8] propose a unified roadmap that focuses on the synergy between LLMs and KGs. Zhu et al. [17] experimented with GPT-4 and ChatGPT for KGC and KGR, concluding that they perform below SOTA models for KGC in zero- and one-shot settings, while for KGR they can achieve close or better performance. Han et al. [18] train a smaller Pre-Trained Language Model (PLM) to check and complete the output of an LLM for the KGC task, in an iterative manner. Wei et al.
[19] guide ChatGPT to extract relevant information from the input text in a multi-stage dialogue approach, given a certain schema. Despite being relevant work, none of the mentioned studies tackle tasks regarding Temporal KGs. The use of LLMs for TKGs is rather limited, and early experiments suggest that ChatGPT performs poorly, having trouble keeping consistency during temporal inference and failing at long-dependency temporal inference [20].

In our work, we plan on using auxiliary tools to help the system become more interpretable; one such example is modeling business processes with BPMN. The connection between BPMN and LLMs is at an incipient stage, and not many studies are available yet. One notable research work analyses ChatGPT's capacity to model different phases of the Business Process Management (BPM) lifecycle [21]. It concludes that ChatGPT is fit to solve preliminary tasks, such as "gathering information" or "process modeling" in as-is scenarios, but is not yet prepared for "selecting and applying redesign methods", "comparing and assessing models", or "querying and refactoring models". Through their work, we find the opportunity to further test and adapt LLMs for BPM.

The research investigated above mainly focuses on the interplay between LLMs and KGs/TOD systems/BPMN, or between TOD systems and KGs, but does not take into account the possibility of combining them all into a system that benefits from their advantages, with each complementing the others' weaknesses. Therefore, our work aims to develop a powerful TOD system that can easily communicate via natural language, while building a company's or domain's specific knowledge and leveraging it in a straightforward, interpretable, and traceable manner.

3. Research Methodology
In line with the Design Science approach [25], this section presents the past knowledge that we build upon, the relevance of our work, and the design of the proposed system.

3.1.
Preliminary Results
The state of the art in the application domain of our work was extensively presented in the Related Work section, so here we emphasize our past work relevant to the current research. In our previous work, we focused on developing a rule-based TOD system** that maps its knowledge on a provided ontology, uses a local KG to capture the conversation's context, and a global KG to store relevant information [2]. Figure 1 presents the overall architecture of the TOD system. It comprises four modules, each focusing on a certain task.

** https://github.com/IonutIga/TOD-System

Figure 1: The pipeline architecture of the TOD system [2].

We continued with the integration of neural network components, mainly substituting the rule-based NLU component with a fine-tuned BERT instance†† [3]. Its overall architecture was kept simple, adding two classifier layers on top of BERT, for intent and slot detection. The results were satisfactory, but not good enough for use in practice. Moreover, training such models requires adequate datasets, which are not available off the shelf. To overcome this problem, we designed a machine-to-machine dialogue simulator‡‡ [22]. Figure 2 presents a simplified version of the overall architecture, where the TOD system is the main actor. It starts by requesting a prompt from the generator, based on which it engages in a conversation with a probabilistic rule-based user simulator, while annotating each conversation for the specified task (here, NLU).

Figure 2: Summarized diagram of the M2M system architecture [22].

All of the above systems were based on a simple ontology comprising three classes: Project, Employee, and Status, and six relationships, either between the classes themselves (hasManager, hasStatus) or with literal values (hasName, hasClass, hasCode, hasRole). Figure 3 depicts the ontology. More details can be found in [22].
†† https://github.com/IonutIga/Domain-Specific-NLU-BERT
‡‡ https://github.com/IonutIga/Dialogue-Simulator

Figure 3: Domain ontology example used throughout experiments.

The system was developed in Python 3.9; the ontology and KGs were described in RDF using the Turtle syntax and managed with the RDFLib library, while the KG queries were written in SPARQL. As mentioned in the introduction, our approach had several limitations, including the use of template-matching rules for the dialogue flow and static KGs that, unlike temporal variants, suffer from time-validity issues of the stored facts. Thus, we aim to extend the knowledge base (specific to the Rigor Cycle in Design Science) with a new artifact (the proposed system) and with all the conclusions resulting from testing the software against automatic and human-centric metrics.

3.2. Application Context
New software must target a specific group of people or an organization, such that its development contributes to their welfare. Our main goal is to target the shortcomings of limited-knowledge, robotic-like TOD systems in any domain. However, we are aware that testing such a hypothesis is difficult, so we narrowed our application context to the usage of such assistants by university students for their bureaucratic duties (submitting contract/scholarship forms, choosing exam dates, sending feedback, requesting documents, etc.), replacing the need to use a variety of platforms or interact with secretaries. From personal and other students' experiences, these tasks can quickly become tedious, requiring searching through different regulations, calling the secretary, or discussing with fellow students how they solved them. These shortcomings should not be part of a modern university; they can be alleviated by a central, well-instructed, and human-like TOD system, as proposed by our research.
Ultimately, acceptance criteria for the final evaluation of the research results should be defined, able to answer the question: "Does the design artifact improve the environment, and how can this improvement be measured?". Therefore, the final system will be evaluated directly by students, who will interact with it to solve the desired tasks. After an interaction, each student will be asked to complete a form regarding truthfulness, coherency, fluency, understanding, success rate, and other relevant metrics. Their responses will be centralized, providing a global view of the TOD system's efficiency and of whether we improved the environment or not. Alongside human metrics, automatic ones will be employed at each stage of the system, to ensure that only performant components are added to it. Examples of such metrics are accuracy, recall, precision, F1 score, joint goal accuracy, inform/success rate, hits@k, etc., measuring how well the system detects the intent, chooses the right action, builds the correct graph, executes a valid BPMN process, generates a valid response, and so on.

3.3. Proposed System
Constructing the artifact is the heart of any design science research project, known as the Design Cycle. The application context guides the development and evaluation of the system, while the preliminary results and related work ground our approach. Based on our previous findings, culminating in the current research directions, we decided to shift the neural network and rule-based components of the TOD system to LLMs, while keeping the KG-related features and adding the time dimension. On top of that, we plan on adding BPMN to trace the execution of tasks. The expected system should be able to converse with ease in natural language with users, understand their needs, extract relevant information and store it in a KG, and provide KG-grounded responses, while executing their specific tasks in a traceable manner. LLMs are the main engine of the proposed TOD system.
Their task is to execute the four natural language modules presented in Figure 1. As we are aware of LLMs' NLG capabilities, the focus shifts to the other three. To achieve them, the LLMs should solve the KGC and KGR tasks. Therefore, we aim to start by using prompting techniques to guide LLMs in solving the mentioned tasks. Prompting is the most direct and least demanding manner of obtaining desirable results from these models. Different techniques are considered, such as direct prompting (DP), in-context learning (ICL), chain of thought (CoT), or planning (P). Pitfalls may be encountered, one of them being the inability of LLMs to perform such tasks using prompting techniques alone. To overcome this, we would consider fine-tuning the model. Two main techniques are to be used: classical supervised fine-tuning (SFT), and reinforcement learning with human or AI feedback (RLHF/RLAIF). Both require training and testing datasets. The first is straightforward, while the second requires either a human or another model to guide the reward model into ranking positive answers higher than negative ones; this reward model is then used to fine-tune the LLM. If the LLM is still underperforming, we move to training a smaller specialized PLM that either corrects the LLM or is corrected by it. Finally, one last solution is to consider dividing the main tasks into smaller ones, iterating through all the mentioned approaches. Figure 4 presents the alternative routes for specializing LLMs on KGC and KGR. Each route is depicted with a certain color. We start with prompting (black route), then try fine-tuning (red route), and finally test using specialized PLMs (green route). When a PLM corrects the LLM, prompting happens first. If the LLM corrects the output of the PLM, prompting happens after. The two stages are numbered accordingly, to highlight the possible scenarios. Finally, we can consider the same approach when dividing each of the main tasks into sub-tasks.
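As a concrete illustration of the ICL route above, the sketch below assembles a few-shot prompt that asks an LLM to extract KG triples from an utterance. The schema string and the demonstration pairs are illustrative assumptions, not the actual prompts used in this research; the function `build_kgc_prompt` is a hypothetical helper.

```python
# Illustrative schema and demonstrations for in-context learning (ICL);
# the real schema and examples would come from the domain ontology.
SCHEMA = ("Classes: Project, Employee, Status. "
          "Relations: hasManager, hasStatus, hasName.")

DEMONSTRATIONS = [
    ("Alice manages Project X.", "(Project X, hasManager, Alice)"),
    ("Project X is active.", "(Project X, hasStatus, Active)"),
]

def build_kgc_prompt(utterance: str) -> str:
    """Assemble a few-shot KGC prompt: schema, demonstrations, then the query."""
    lines = [f"Extract triples using this schema: {SCHEMA}", ""]
    for text, triples in DEMONSTRATIONS:
        lines.append(f"Text: {text}")
        lines.append(f"Triples: {triples}")
        lines.append("")
    lines.append(f"Text: {utterance}")
    lines.append("Triples:")  # the model completes from here
    return "\n".join(lines)

prompt = build_kgc_prompt("Bob manages Project Y.")
print(prompt)
```

The same skeleton extends to CoT (append reasoning steps to each demonstration) or DP (drop the demonstrations entirely).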
Figure 4: Alternative routes for specializing LLMs on KGC, KGR, or sub-tasks of each.

Regarding the temporal dimension of the KG, we plan on starting with a simple yet powerful method following Nguyen et al. [23]: instantiating each relationship and adding two new temporal relationships to it, namely startDate and endDate. The start date is either today's date (when the input text is encountered) or a provided date. The end date is either a provided date or marked with a label to highlight the fact's continuous validity. It can also be omitted, thus signaling to the model that the fact still holds. Regardless of the choice of time labeling, scalability and understanding issues may be encountered. Scalability refers to the number of new relationships added with each new fact, while understanding concerns the LLMs' capability of processing the labels' meaning. If such issues occur with the proposed labeling method, we may consider changing it, following the guidelines of Zhang et al. [24].

Finally, to trace and guide the execution of certain tasks by the TOD system, we emphasize the use of BPMN to model business processes. As BPMN models can be converted into RDF graphs, the connection to our proposed system is evident. Therefore, we plan on having such models designed by human experts, to be later interpreted by the TOD system, mainly by the LLM component. Thus, we need to ensure that LLMs can understand the models. To overcome this possible pitfall, the system will either be directly fed with guidelines about the necessary information to be obtained from the user (by providing human-designed questions to be asked and the target parameters to be collected) or be instructed to reason over the graph version of the process, where the ontology also describes the nature of encountered facts. Figure 5 depicts the above-mentioned details of the overall expected TOD system. For the BPMN process, we provide both possible solutions mentioned above.
Figure 5: The expected TOD system behaviour.

In line with the application context, the addition of LLMs should increase the naturalness of the dialogue flows and add new information to the system's knowledge. Temporal KGs store the university-specific bureaucratic information to ensure the TOD system knows its domain. The temporal dimension ensures the validity of the processes stored inside it. For example, if a task is no longer supported by the university, the system will know to disable its usage. Describing the processes with BPMN increases the interpretability of our system, so it will only execute what we expect from it. We can view the obtained system as an automation tool with a natural language interface and self-managing knowledge.

4. Conclusions
Task-Oriented Dialogue Systems focus on solving domain-specific tasks. Such systems can either assist humans in solving their duties or replace repetitive work that can be automated. However, TOD systems usually lack naturalness in dialogue flows, while also being resistant to out-of-domain (OOD) information. Therefore, other technologies may be used to overcome these shortcomings. For example, LLMs are fit for conversational paradigms, while also possessing strong background knowledge that can be leveraged for OOD situations. KGs are a great way to store information in a more human-interpretable way, enabling reasoning over it. Also, adding temporal context alleviates the problem of the time-validity of facts. To this day, the LLM-KG, LLM-TOD, and TOD-KG synergies have been studied, leading to promising results. However, the connection between all three at once is understudied and should receive more attention.
Building on top of previous work, following the Design Science approach, we focus on developing a TOD system enhanced by LLMs, KGs, and auxiliary tools such as BPMN, to overcome their specific limitations: lack of naturalness and OOD adaptation (TOD systems); hallucinations, lack of interpretability, and limited domain-specific knowledge (LLMs); and the difficulty of keeping KGs up to date. Each component's advantages can be used to compensate for the others' shortcomings. Our main plan of execution is backed by solutions for each component, reducing the effects of possible pitfalls. Therefore, we aim to obtain a TOD system capable of growing a domain's or company's knowledge and executing different tasks in a traceable manner, grounded with valid facts stored in a continuously growing, specialized Temporal KG.

Acknowledgements
I want to thank Prof. Dr. Gheorghe-Cosmin Silaghi (gheorghe.silaghi@ubbcluj.ro) from Babes-Bolyai University for the supervision of our previous work and the current thesis.

References
[1] Z. Zhang, R. Takanobu, Q. Zhu, M. Huang, X. Zhu, Recent Advances and Challenges in Task-Oriented Dialog Systems, Science China Technological Sciences 63 (10): 2011-2027 (2020), https://doi.org/10.1007/s11432-016-0037-0
[2] V.I. Iga, G.C. Silaghi, Ontology-based dialogue system for domain-specific knowledge acquisition, in: da Silva, A.R., da Silva, M.M., Estima, J., Barry, C., Lang, M., Linger, H., Schneider, C. (eds.) Information Systems Development: Organizational Aspects and Societal Trends (ISD2023 Proceedings), Lisboa, Portugal. Association for Information Systems (2023), https://doi.org/10.62036/ISD.2023.46
[3] V.I. Iga, G.C. Silaghi, Leveraging BERT for natural language understanding of domain-specific knowledge, in: 25th Intl. Symp. on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2023, Nancy, France. IEEE (2023), to appear
[4] W.X. Zhao, K. Zhou, et al., A survey of large language models.
arXiv preprint arXiv:2303.18223 (2023), https://doi.org/10.48550/arXiv.2303.18223
[5] Z. Hu, Y. Feng, A.T. Luu, B. Hooi, A. Lipani, Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulators to Enhance Dialogue System, in: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23). Association for Computing Machinery, New York, NY, USA, 3953–3957, (2023), https://doi.org/10.1145/3583780
[6] Y. Zhang, Y. Li, L. Cui, D. Cai, L. Liu, T. Fu, X. Huang, E. Zhao, Y. Zhang, Y. Chen, L. Wang, A.T. Luu, W. Bi, F. Shi, S. Shi, Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. ArXiv, abs/2309.01219 (2023), https://doi.org/10.48550/arXiv.2309.01219
[7] A. Hogan, E. Blomqvist, M. Cochez, C. d'Amato, G. de Melo, C. Gutierrez, S. Kirrane, J.E.L. Gayo, R. Navigli, S. Neumaier, A.N. Ngomo, A. Polleres, S.M. Rashid, A. Rula, L. Schmelzeisen, J. Sequeda, S. Staab, A. Zimmermann, Knowledge Graphs. Synthesis Lectures on Data, Semantics, and Knowledge, Morgan & Claypool Publishers, 2021. https://doi.org/10.2200/S01125ED1V01Y202109DSK022
[8] S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, X. Wu, Unifying large language models and knowledge graphs: A roadmap, CoRR abs/2306.08302 (2023), https://doi.org/10.48550/ARXIV.2306.08302
[9] S. Yang, R. Zhang, S.M. Erfani, GraphDialog: Integrating graph knowledge into end-to-end task-oriented dialogue systems, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pages 1878–1888, (2020), https://doi.org/10.18653/v1/2020.emnlp-main.147
[10] Y. Tuan, S. Beygi, M. Fazel-Zarandi, et al., Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems, in: Findings of the Association for Computational Linguistics: ACL 2022, (2022) pp. 383–395, https://doi.org/10.18653/v1/2022.findings-acl.33
[11] D. Chaudhuri, M.R.A.H. Rony, J.
Lehmann, Grounding Dialogue Systems via Knowledge Graph Aware Decoding with Pre-trained Transformers, in: Verborgh, R., et al. (eds.) The Semantic Web. ESWC 2021. Lecture Notes in Computer Science, vol 12731. Springer, Cham, (2021), https://doi.org/10.1007/978-3-030-77385-4_19
[12] J. Andreas, et al., Task-Oriented Dialogue as Dataflow Synthesis, in: Transactions of the Association for Computational Linguistics, vol. 8, pp. 556–571, (2020), https://doi.org/10.1162/tacl_a_00333
[13] J. Chiu, W. Zhao, D. Chen, S. Vaduguru, A. Rush, D. Fried, Symbolic Planning and Code Generation for Grounded Dialogue, in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 7426–7436, Singapore. Association for Computational Linguistics, (2023), https://doi.org/10.18653/v1/2023.emnlp-main.460
[14] J. Gao, L. Xiang, H. Wu, H. Zhao, Y. Tong, Z. He, An Adaptive Prompt Generation Framework for Task-oriented Dialogue System, in: Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1078–1089, Singapore. Association for Computational Linguistics, (2023), https://doi.org/10.18653/v1/2023.findings-emnlp.76
[15] N. Braunschweiler, R. Doddipatla, S. Keizer, S. Stoyanchev, Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues, in: Proceedings of the 1st Workshop on Taming Large Language Models: Controllability in the era of Interactive Assistants!, pages 46–55, Prague, Czech Republic. Association for Computational Linguistics, (2023), https://aclanthology.org/2023.tllm-1.5
[16] W. Shen, Y. Gao, C. Huang, F. Wan, X. Quan, W. Bi, Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System, in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 8261–8275, Singapore. Association for Computational Linguistics, (2023), https://doi.org/10.18653/v1/2023.emnlp-main.514
[17] Y. Zhu, X. Wang, J. Chen, S. Qiao, Y. Ou, Y. Yao, S. Deng, H.
Chen, N. Zhang, LLMs for knowledge graph construction and reasoning: Recent capabilities and future opportunities, CoRR abs/2305.13168 (2023), https://doi.org/10.48550/ARXIV.2305.13168
[18] J. Han, N. Collier, W.L. Buntine, E. Shareghi, PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs, CoRR abs/2305.12392 (2023), https://doi.org/10.48550/ARXIV.2305.12392
[19] X. Wei, X. Cui, N. Cheng, X. Wang, X. Zhang, S. Huang, P. Xie, J. Xu, Y. Chen, M. Zhang, Y. Jiang, W. Han, Zero-shot information extraction via chatting with ChatGPT, CoRR abs/2302.10205 (2023), https://doi.org/10.48550/ARXIV.2302.10205
[20] C. Yuan, Q. Xie, S. Ananiadou, Zero-shot Temporal Relation Extraction with ChatGPT, in: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 92–102, Toronto, Canada. Association for Computational Linguistics, (2023), https://doi.org/10.18653/v1/2023.bionlp-1.7
[21] N. Klievtsova, J.V. Benzin, T. Kampik, J. Mangler, S. Rinderle-Ma, Conversational Process Modelling: Can Generative AI Empower Domain Experts in Creating and Redesigning Process Models?, arXiv preprint arXiv:2304.11065, (2023), https://doi.org/10.48550/arXiv.2304.11065
[22] V.I.R. Iga, Ontology-driven dialogue simulator for generating task-oriented dialogue datasets, Master's thesis, Babes-Bolyai University, Cluj-Napoca, Romania, 2023, https://github.com/IonutIga/Dialogue-Simulator
[23] V. Nguyen, O. Bodenreider, A.P. Sheth, Don't like RDF reification? Making statements about statements using singleton property, in: Chung, C., Broder, A.Z., Shim, K., Suel, T. (eds.) 23rd International World Wide Web Conference, WWW '14, Seoul, Republic of Korea, April 7-11, 2014, pp. 759–770. ACM (2014), https://doi.org/10.1145/2566486.2567973
[24] F. Zhang, Z. Li, D. Peng, et al., RDF for temporal data management – a survey. Earth Sci Inform 14, 563–599 (2021), https://doi.org/10.1007/s12145-021-00574-w
[25] A. R.
Hevner, A Three Cycle View of Design Science Research, in: Scandinavian Journal of Information Systems: Vol. 19: Iss. 2, Article 4 (2007), Available at: https://aisel.aisnet.org/sjis/vol19/iss2/4