<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Framework for Graph Database Query Generation Leveraging Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Volker Tresp</string-name>
          <email>tresp@dbs.ifi.lmu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bailan He</string-name>
          <email>bailan.he@siemens.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yushan Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcel Hildebrandt</string-name>
          <email>marcel.hildebrandt@siemens.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zifeng Ding</string-name>
          <email>zifeng.ding@siemens.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaomengxi Han</string-name>
          <email>yaomengxi.han@siemens.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ludwig Maximilian University of Munich</institution>
          ,
          <addr-line>Geschwister-Scholl-Platz 1, 80539 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Siemens AG</institution>
          ,
          <addr-line>Otto-Hahn-Ring 6, 81739 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technical University of Munich</institution>
          ,
          <addr-line>Arcisstraße 21, 80333 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Large language models (LLMs) have attracted considerable attention in academia and industry due to their superior performance compared to classical machine learning models across various applications. In particular, prompt engineering and in-context learning enable LLMs to operate effectively in scenarios with minimal training data, where they demonstrate proficiency given only precise instructions or a few examples. The advanced reasoning abilities of LLMs have been instrumental in the development of intelligent assistants. These assistants often rely on accessing information from comprehensive databases such as knowledge graphs (KGs) through natural language. The process of converting natural language requests into query language to retrieve information from databases is known as query generation (QG). One challenge in QG is the evaluation of the LLMs' performance due to the absence of standardized evaluation frameworks and datasets. To tackle this challenge, we introduce an automated evaluation framework tailored for QG, featuring three key metrics: Gold Query Accuracy, Execution Accuracy, and Execution Rate. We focus on the exemplary use case of accessing a graph database in a supply chain management (SCM) setting via a natural language interface. Our results demonstrate the efficacy of our framework and metrics in accurately evaluating model performance for QG tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Query generation</kwd>
        <kwd>Knowledge graph</kwd>
        <kwd>Large language model</kwd>
        <kwd>Supply chain management</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Large Language Models (LLMs) have recently gained prominence in natural language
processing, demonstrating exceptional proficiency across various tasks [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ]. Unlike conventional
machine learning models that rely heavily on labeled data, LLMs exhibit reasoning capabilities,
allowing them to perform diverse tasks without extensive labeled datasets. For instance,
Perez-Beltrachini et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] developed a semantic parser using LLMs with an RDF dataset. This system
effectively understands natural language questions within a user's dialogue and generates
corresponding SPARQL queries to address them. Given the capacity of LLMs to process and generate
vast amounts of text data, manual evaluation of LLMs becomes impractical, while automatic
frameworks ensure a swift and efficient assessment. Automated evaluation is also preferred due
to its ability to mitigate biases and inconsistencies introduced by human evaluators, thereby
enhancing reliability and reproducibility. As the utilization of LLMs expands across various
domains, scalable evaluation methods are imperative. Objective metrics such as perplexity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
BLEU score [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and ROUGE score [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] offer quantifiable performance measures in general
scenarios. However, accurately evaluating the performance of LLMs in domain-specific settings
remains challenging. In many cases, anecdotal evidence and manual inspection have been used
as substitutes for more rigorous evaluation methods. Furthermore, in many industrial contexts,
the developers of artificial intelligence tools may not have the necessary domain expertise to
evaluate the performance of LLMs, leading to a need for an automatic and systematic approach
to evaluating LLMs in the absence of in-depth domain knowledge. In this paper, we focus on
an application in the context of supply chain management (SCM), where the lack of standard
evaluation datasets poses a significant challenge, complicating the comparison of different
LLMs.
      </p>
      <p>
        Supply chain management (SCM) involves continuously monitoring supply chains to ensure
their operability and proactively making them sufficiently resilient to withstand disruptive
events such as pandemics, natural disasters, or political and economic crises [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. With the
growing volume and complexity of available data, it is essential for supply chain managers to
efficiently manage, store, and retrieve the relevant information. This ensures that relevant data
can be accessed and analyzed in a timely manner to make well-informed strategic decisions,
identify critical risks, and accurately react to current events [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The management across the
whole supply chain necessitates extensive databases for comprehensive analysis. Knowledge
graphs (KGs) represent networks of real-world entities, encompassing objects, events, situations,
and concepts, elucidate the connections between them, and have emerged as invaluable tools
for offering an interconnected perspective on SCM data. For instance, within SCM, entities like
suppliers, smelters, and components can be represented as nodes in a KG, while relationships
between different types of nodes, such as "located in", serve as the edges within the KG. Several
studies have developed SCM-related KGs and applied reasoning techniques to improve supply
chain management [11, 12, 13]. For example, the CoyPu KG (https://coypu.org/ergebnisse/knowledge-graph) integrates macroeconomic data
and global crisis events to enhance transparency in SCM operations. These KGs are typically
processed using graph database management systems like Neo4j (https://neo4j.com/). Querying these systems,
e.g., with Cypher queries for Neo4j, assists in extracting relevant information for further
analysis. However, mastering the optimal querying of graph databases requires significant time
and effort from SCM professionals. Therefore, developing a natural language interface that uses
LLM-based approaches to interpret database schemas and translate requests into graph database
queries (e.g., SPARQL or Cypher) can significantly aid SCM professionals in understanding the
data and making informed decisions. This translation task is referred to as query generation
(QG).
      </p>
      <p>Despite the emergence of QG solutions [14, 15], their real-world performance on
proprietary domain data and schemas remains uncertain because no readily available
datasets exist to evaluate model performance. To address this challenge, we present a novel automated
evaluation framework (code available at https://github.com/4hebailanc/Automated-Evaluation-Framework-for-GraphDatabase-Query-Generation) tailored for practical applications, exemplified in the industrial use
case of SCM QG. Grounded on KGs, our framework first harnesses the power of GPT-3.5 [16]
to generate an evaluation dataset. This dataset is then used to objectively measure model
performance, enabling users to modify prompts according to their specific needs and data
requirements.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Definitions</title>
      <p>
        Definition 1 (Knowledge Graph). A KG is defined as a collection of triples 𝒦 ⊆ ℰ × ℛ × ℰ,
where ℰ denotes the set of entities and ℛ the set of relation types. In our use case, elements in
ℰ correspond to supply chain-related entities, e.g., suppliers, smelters, and components, and are
represented as nodes in the graph. Every entity has a unique entity type, which is defined by
the mapping 𝜏 ∶ ℰ → 𝒯, where 𝒯 stands for the set of entity types. The entities are connected
via relations specified in ℛ, represented as typed directed edges in the graph.
Definition 2 (Query Generation). Given a user's natural language request X = (x₁, x₂, ..., xₙ)
and a corresponding query Y = (y₁, y₂, ..., yₘ), where each xᵢ and yⱼ represents a token in the
request and query, respectively, the aim is to accurately discern the underlying intent of the
request X and generate a corresponding query Y [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The execution of the query Y on the
KG database yields retrieval results denoted as R, where R corresponds to the user's desired
information from the KG.
      </p>
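      <p>The two definitions can be instantiated directly; the entity and relation names below are invented for illustration and do not come from the paper's actual schema:</p>

```python
# Toy model of Definition 1; all names are illustrative.
E = {"SupplierA", "Smelter1", "Part42", "Munich"}       # entity set E
R = {"supplies", "located_in"}                          # relation types R
tau = {"SupplierA": "Supplier", "Smelter1": "Smelter",  # type mapping tau: E -> T
       "Part42": "ManufacturerPart", "Munich": "City"}
KG = {                                                  # triples, a subset of E x R x E
    ("SupplierA", "supplies", "Part42"),
    ("Smelter1", "located_in", "Munich"),
}
# Every triple must respect the definition: head and tail in E, relation in R.
assert all(h in E and r in R and t in E for (h, r, t) in KG)
```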
    </sec>
    <sec id="sec-4">
      <title>3. Framework Overview</title>
      <p>Our proposed framework consists of two main processes: (1) a query dataset creation process
that generates an evaluation dataset containing diverse natural language requests and queries,
and (2) a QG process that evaluates the model performance of different prompts based on the
generated evaluation dataset.</p>
      <sec id="sec-4-1">
        <title>3.1. Query dataset creation</title>
        <p>For the scope of our study, the query dataset consists of structured queries alongside their
corresponding natural language requests. Creating such datasets presents notable challenges,
particularly due to the potential lack of existing datasets containing relevant queries tailored to
specific scenarios or domains. The task of annotating data for specific domains, such as SCM,
requires the involvement of domain experts, which incurs significant costs and manual efforts.</p>
        <p>To optimize efficiency, we present an innovative framework harnessing the generative
potential of language models. This framework aims to automate the creation of query datasets,
thereby facilitating the quantitative evaluation of models. As depicted in Figure 1, the query
generation process encompasses four sequential steps:
a. Initial Query Template: The process begins by creating a general query template T, using
placeholders for specific information elements like nodes or relations. An example of a query
template T in the query language Cypher is MATCH (n) -[r]-&gt; (m:Label1) RETURN n, r, where
m:Label1 acts as a placeholder for a node labeled with 'Label1'. In the context of this query
template, 'Label1' represents a specific label assigned to nodes within the graph database. The
placeholder allows for dynamic substitution with actual node labels during query generation.
b. Placeholder Substitution: Once the template is crafted, specific information replaces the
designated placeholders, resulting in a complete gold query G. A gold query G is the final query
created by substituting specific information into a template, and it acts as the standard against
which we evaluate the accuracy of our model's query generation capabilities. In this context,
the placeholder m:Label1 is replaced with m:ManufacturerPart. Consequently, the placeholder
substitution produces the following gold query G: MATCH (n) -[r]-&gt; (m:ManufacturerPart)
RETURN n, r. This query, G, is tailored to retrieve nodes (identified as 'n') and their associated
relationships (referred to as 'r') connected to nodes labeled as 'ManufacturerPart' within the
database's graph structure. After substitution, each generated gold query is executed once to
ensure its functionality and accuracy in retrieving pertinent information from the database.
c. Request Generation: The LLM employs diverse prompting formats to generate a range of
query requests in natural language. As shown in Figure 2, three distinct prompt templates have
been devised: the simple prompt, the schema prompt, and the in-context prompt. The simple
prompt directly instructs the model to generate natural language requests. The schema prompt
incorporates the KG schema to guide the model. The in-context prompt employs in-context
learning [17], integrating multiple query and natural language request pairs as demonstrations
within the template. Each query results in the generation of three distinct natural language
requests.
d. Human Evaluation: The concluding stage necessitates human intervention to validate the
quality and precision of the generated queries (see Section 4.2 for details). This step acts as
an important quality control measure, confirming that the queries effectively represent the
intended information retrieval process.</p>
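        <p>Steps (a)-(c) can be sketched in a few lines of Python; the function names and prompt wording below are our own illustrative choices, not the released implementation:</p>

```python
from string import Template

# Step (a): a query template with a $label placeholder (illustrative).
QUERY_TEMPLATE = "MATCH (n) -[r]-> (m:$label) RETURN n, r"

def substitute(template, **placeholders):
    """Step (b): fill placeholders to obtain a gold query."""
    return Template(template).substitute(placeholders)

def request_prompt(query, style="simple", schema="", demos=None):
    """Step (c): build one of the three prompt styles that ask an LLM to
    phrase the query as a natural language request (LLM call omitted)."""
    prompt = "Rewrite this Cypher query as a natural language request:\n" + query
    if style == "schema":
        prompt = "KG schema:\n" + schema + "\n" + prompt   # schema prompt
    elif style == "in-context":
        examples = "\n".join("Query: %s\nRequest: %s" % (q, r)
                             for q, r in (demos or []))    # in-context prompt
        prompt = examples + "\n" + prompt
    return prompt

gold = substitute(QUERY_TEMPLATE, label="ManufacturerPart")
# gold == "MATCH (n) -[r]-> (m:ManufacturerPart) RETURN n, r"
```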
        <p>It is noteworthy that only steps (a) and (d) necessitate human intervention within the proposed
methodology. Step (a) consists of creating a query template, a process that is simplified by using
the reference base query syntax (https://neo4j.com/docs/cypher-manual/current/queries/basic/). The human validation undertaken in step (d) serves to ensure
the quality and accuracy of queries, thereby augmenting the overall reliability of the system.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Query generation with LLMs</title>
        <p>
          Our study centers on the task of query generation within the framework of Neo4j databases.
Specifically, our objective is to develop a model capable of producing queries based on natural
language requests, enabling the retrieval of pertinent information from the database. To assess
the model’s performance quantitatively, in line with prior research [
          <xref ref-type="bibr" rid="ref5">18, 5</xref>
          ], we employ
execution-based automatic metrics. Three primary metrics drive our evaluation: Execution Rate
(ER), which evaluates the executability of the output queries within the database; Gold Query
Accuracy (GQA), which assesses the similarity between the output query and the gold query;
and Execution Accuracy (EA), which measures the correspondence of the retrieval results to those
of the gold query. Both GQA and EA are calculated by averaging the scores of all gold and
output query pairs using BERTScore [19], which measures the similarity between two text
sequences by computing the cosine similarity between the contextual embeddings of their
tokens, as produced by the language model BERT [20]. The key idea behind BERTScore is to
take not only exact word matches into account but also semantic similarity and contextual
appropriateness (in our case, identical Cypher queries return the requested information in a
fixed order), which makes it more robust than traditional evaluation metrics.
Higher values are preferred for all three metrics: a high ER signifies correct execution of output
queries in the database, while higher GQA and EA scores indicate strong resemblance between
the output and gold queries, along with accurate retrieval results aligning with those of the
gold query.
        </p>
        <p>
1. Execution Rate (ER): This metric evaluates the executability of the output queries
within the database. It is defined as the ratio of executable queries to the total number of
output queries:
ER = (Number of executable queries) / (Total number of output queries). (1)
2. Gold Query Accuracy (GQA): This metric evaluates the similarity between the output
query and the gold query. It is calculated using the BERTScore metric [19] as follows:
GQA = (1/N) ∑ᵢ₌₁ᴺ BERTScore(Yᵢ, Gᵢ), (2)
where N is the total number of output and gold query pairs, and Yᵢ and Gᵢ represent the
output query and gold query of the i-th pair, respectively.
3. Execution Accuracy (EA): This metric measures the similarity between the retrieval
results of the output query and those of the gold query:
EA = (1/N) ∑ᵢ₌₁ᴺ BERTScore(Rᵢ, R′ᵢ), (3)
where Rᵢ and R′ᵢ represent the retrieval results for the output query and gold query of the
i-th pair, respectively.
        </p>
        <p>Figure caption: Example model output under each type of prompt. Wrong or undesired query
outputs are marked in red. Without the schema information, the model with the simple prompt
uses "associated with" as a relation type, which is not defined in the KG schema. The prompt
with schema solves this problem but does not use the desired "collect" function. With in-context
demonstrations, the model shows the best performance.</p>
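        <p>The aggregation behind the three metrics can be sketched as follows; the <code>similarity</code> callback stands in for BERTScore (in practice supplied by the bert-score package), and the exact-match stand-in here is purely illustrative:</p>

```python
def execution_rate(executable):
    """ER: fraction of output queries that execute without error."""
    return sum(executable) / len(executable)

def mean_pair_score(outputs, golds, similarity):
    """GQA when applied to query strings; EA when applied to retrieval results."""
    assert len(outputs) == len(golds)
    return sum(similarity(o, g) for o, g in zip(outputs, golds)) / len(outputs)

# Stand-in for BERTScore: exact match. BERTScore would give graded semantic scores.
exact = lambda a, b: 1.0 if a == b else 0.0

er = execution_rate([True, True, False, True])   # 3 of 4 queries executable -> 0.75
gqa = mean_pair_score(["MATCH (n) RETURN n", "MATCH (m) RETURN m"],
                      ["MATCH (n) RETURN n", "MATCH (x) RETURN x"], exact)  # 0.5
```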
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experiments</title>
      <p>
        We illustrate the practical application of our framework by utilizing real SCM data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
emphasizing its effectiveness in assessing the performance of LLMs for QG in real-world scenarios.
      </p>
      <sec id="sec-5-1">
        <title>4.1. Supply chain knowledge graph</title>
        <p>The supply chain knowledge graph consolidates information from internal sources of the
company Siemens. This comprehensive dataset includes insights such as tier-1 suppliers,
business scopes, and specific parts associated with a company. Additionally, it incorporates
external data, such as publicly available information on smelters and substances. Specifics
regarding tier-2 and tier-3 suppliers are primarily derived from customs data. This information
is further supplemented by a smaller portion obtained from both private customs records and
public media sources. In total, there are 16,910 tier-1 suppliers, 43,759 tier-2 suppliers, and
49,775 tier-3 suppliers of Siemens. All entity and relation types and corresponding numbers of
nodes and edges are listed in Table 1. It is important to note that suppliers at different tier levels
are not mutually exclusive. To facilitate access to this complex network of information, the
graph is structured using the graph database platform Neo4j.</p>
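        <p>Queries against such a Neo4j-hosted graph can be issued with the official Python driver; the connection details and the node label below are placeholders, not the paper's actual deployment or schema:</p>

```python
def build_supplier_query(label, limit=25):
    """Compose a Cypher query for a given node label (label is illustrative;
    the actual schema in Table 1 may use different names)."""
    return "MATCH (s:%s)-[r]->(m) RETURN s, r, m LIMIT %d" % (label, limit)

def run_query(uri, user, password, cypher):
    """Execute a Cypher query and return the rows as dictionaries."""
    from neo4j import GraphDatabase  # pip install neo4j
    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return [record.data() for record in session.run(cypher)]
    finally:
        driver.close()

# Usage (requires a running Neo4j instance; credentials are placeholders):
# rows = run_query("bolt://localhost:7687", "neo4j", "password",
#                  build_supplier_query("Supplier"))
```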
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Supply chain query dataset</title>
        <p>We compiled a set of 60 query templates, categorized into six main groups as detailed in
Table 2. Each template underwent placeholder substitution, as outlined in Section 3.1, yielding
five distinct queries per template. This process resulted in the creation of 300 gold queries.
Subsequently, we generated three natural language requests for each gold query, using three
different prompts. These query-request pairs were then evaluated by three reviewers
within our organization to determine their feasibility. A query-request pair is labeled as
reasonable if the reviewers agree that the query is effective in retrieving information from the
database to answer the corresponding request; otherwise, it is marked as unreasonable. Any
pair receiving unreasonable annotations from the reviewers was filtered out. Following this
filtering process, we retained 825 query-request pairs to constitute our query dataset.
Inter-annotator agreement was measured using Fleiss' kappa [21], which yielded a value of 0.72 in
our annotations, indicating substantial agreement among the reviewers regarding the alignment
of the generated query and request pairs. The high number of retained pairs underscores
the effectiveness of our dataset generation methodology. The time spent annotating is estimated
at 3 hours per reviewer.</p>
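        <p>Fleiss' kappa [21] can be computed directly from an items-by-categories count matrix; a minimal implementation (our sketch, not the paper's code) looks like this:</p>

```python
def fleiss_kappa(counts):
    """counts[i][j]: number of raters assigning item i to category j.
    Each row must sum to the same number of raters n."""
    N = len(counts)               # number of items
    n = sum(counts[0])            # number of raters per item
    # Mean per-item agreement P_bar.
    P_bar = sum((sum(c * c for c in row) - n) / (n * (n - 1))
                for row in counts) / N
    # Chance agreement P_e from overall category proportions.
    totals = [sum(row[j] for row in counts) for j in range(len(counts[0]))]
    P_e = sum((t / (N * n)) ** 2 for t in totals)
    return (P_bar - P_e) / (1 - P_e)

# Two items, three raters, perfect agreement on different categories -> 1.0
print(fleiss_kappa([[3, 0], [0, 3]]))
```

A kappa of 0.72, as reported above, falls in the range commonly interpreted as substantial agreement.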
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Query generation performance</title>
        <p>In this section, we show how the generated dataset can be used to evaluate the QG capabilities
of a model and to help refine the model's corresponding task prompts. As shown in Figure 3,
we evaluate the performance of three different types of QG prompts, using two state-of-the-art
language models, GPT-3.5 and GPT-4: simple prompts, prompts with schema, and in-context
prompts. In a simple prompt, the model is directed to generate a corresponding query without
prior knowledge of the schema information within the KG. This may result in the use of
relation types not present in the KG; for instance, the relation type "associated with" is not
defined in the KG schema. However, when additional schema data is integrated, the model
shows improved effectiveness by employing precise schema properties in queries. Nevertheless,
without specific user directives, such as indicating the desired output format using the "collect"
function, the generated query may not perfectly match the gold query. The in-context prompt
goes a step further by providing multiple request and query pairs, along with the schema, to
guide the model toward producing correct results. When presented with in-context demonstrations,
the model shows the best performance in QG tasks.</p>
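        <p>The three QG prompt variants evaluated above can be assembled along these lines; the wording is our paraphrase, not the paper's exact templates:</p>

```python
def qg_prompt(request, schema="", demos=None):
    """Build a simple / schema / in-context prompt for request-to-Cypher
    generation, depending on which optional parts are supplied."""
    parts = []
    if schema:                       # schema prompt: expose labels and relation types
        parts.append("Graph schema:\n" + schema)
    for req, query in demos or []:   # in-context prompt: request/query demonstrations
        parts.append("Request: %s\nCypher: %s" % (req, query))
    parts.append("Request: %s\nCypher:" % request)
    return "\n\n".join(parts)

simple = qg_prompt("List all parts of supplier X.")
with_schema = qg_prompt("List all parts of supplier X.",
                        schema="(:Supplier)-[:SUPPLIES]->(:ManufacturerPart)")
```

The model's completion after the final "Cypher:" would then be executed against the database and scored with ER, GQA, and EA.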
        <p>The results detailed in Table 3 consistently demonstrate GPT-4's superior performance over
GPT-3.5 across all three prompt types. Notably, employing schema and in-context
demonstrations consistently results in higher GQA, ER, and EA compared to direct instructions to the
model. The provision of a schema significantly enhances model performance, with in-context
demonstrations exhibiting the highest efficacy for both GPT-3.5 and GPT-4. These results provide
valuable insights into the efficacy of various prompting methods and highlight the advancements
from GPT-3.5 to GPT-4 in QG tasks. Furthermore, they validate the utility of query datasets
generated through the pipeline for prompt tuning in real-world applications.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>In conclusion, our paper proposed a comprehensive automated evaluation framework for
QG, addressing the challenge of assessing the performance of LLMs in specific domains due
to the lack of standardized evaluation criteria and datasets. Our framework comprises two
main steps: first, the creation of an evaluation dataset leveraging the reasoning abilities
of LLMs through query templates, and second, the evaluation of QG model performance
based on the generated evaluation dataset. To illustrate the efficacy of our framework, we
apply it to a concrete industry use case: QG in SCM KGs. The creation of a diverse supply
chain query dataset, as delineated in Table 2, establishes the groundwork for assessing the
model performance of different prompts for QG. Subsequent analysis, summarized in Table 3,
consistently demonstrates the superiority of GPT-4 over GPT-3.5 across various conditions.
Overall, our framework enhances the comprehension of QG within the realm of SCM graphs,
offering practical implications for improving query efficiency and accuracy in real-world SCM
applications.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Future Directions</title>
      <p>
        In future research, expanding the experimental configuration to include other prominent LLMs
like Gemini [22] from Google and Llama [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] from Meta would be beneficial. This expansion
would broaden the scope of the study, allowing for a more comprehensive evaluation of the
proposed framework's effectiveness across a wider range of LLMs. Additionally, the current
single-round question answering format is constraining; hence, exploring multi-round dialogues
represents a promising avenue for future work. By incorporating these future directions, the
research can continue to advance our understanding of LLMs and their applications in
real-world scenarios.
      </p>
      <p>Acknowledgments. This work has been supported by the German Federal Ministry for
Economic Affairs and Climate Action (BMWK) as part of the project CoyPu under grant number
01MK21007K and has also been supported by the DAAD programme Konrad Zuse Schools
of Excellence in Artificial Intelligence, sponsored by the Federal Ministry of Education and
Research.</p>
      <p>[10] …ternational Workshop on Linked Data-driven Resilience Research 2023 co-located with
Extended Semantic Web Conference 2023 (ESWC 2023), Hersonissos, Greece, May 28, 2023,
volume 3401 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3401/paper3.pdf.
[11] J. Deng, C. Chen, X. Huang, W. Chen, L. Cheng, Research on the construction of event
logic knowledge graph of supply chain management, Adv. Eng. Informatics 56 (2023)
101921. URL: https://doi.org/10.1016/j.aei.2023.101921. doi:10.1016/j.aei.2023.101921.
[12] X. Huang, L. Cheng, J. Deng, T. Wang, Binocular attention-based stacked BiLSTM NER
model for supply chain management event knowledge graph construction, in: Proceedings
of the 15th International Conference on Machine Learning and Computing, ICMLC 2023,
Zhuhai, China, February 17-20, 2023, ACM, 2023, pp. 40-46. URL: https://doi.org/10.1145/3587716.3587723. doi:10.1145/3587716.3587723.
[13] W. Chen, L. Cheng, T. Wang, J. Deng, Knowledge graph construction for supply chain
management in manufacturing industry, in: D. Huang, P. Premaratne, B. Jin, B. Qu, K. Jo,
A. Hussain (Eds.), Advanced Intelligent Computing Technology and Applications - 19th
International Conference, ICIC 2023, Zhengzhou, China, August 10-13, 2023, Proceedings,
Part IV, volume 14089 of Lecture Notes in Computer Science, Springer, 2023, pp. 682-693. URL:
https://doi.org/10.1007/978-981-99-4752-2_56. doi:10.1007/978-981-99-4752-2_56.
[14] T. Bratanic, Generating Cypher queries with ChatGPT-4 on any graph schema, 2023.
URL: https://neo4j.com/developer-blog/generating-cypher-queries-with-chatgpt-4-on-any-graph-schema/, accessed: 15.10.2023.
[15] X. Li, R. Zhao, Y. K. Chia, B. Ding, L. Bing, S. R. Joty, S. Poria, Chain of knowledge:
A framework for grounding large language models with structured knowledge bases,
CoRR abs/2305.13269 (2023). URL: https://doi.org/10.48550/arXiv.2305.13269. doi:10.48550/arXiv.2305.13269. arXiv:2305.13269.
[16] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan,
P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in
Neural Information Processing Systems 33 (2020) 1877-1901.
[17] Q. Dong, L. Li, D. Dai, C. Zheng, Z. Wu, B. Chang, X. Sun, J. Xu, Z. Sui, A survey on
in-context learning, arXiv preprint arXiv:2301.00234 (2022).
[18] A. Saha, V. Pahuja, M. Khapra, K. Sankaranarayanan, S. Chandar, Complex sequential
question answering: Towards learning to converse over linked question answer pairs with
a knowledge graph, in: Proceedings of the AAAI Conference on Artificial Intelligence,
volume 32, 2018.
[19] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, Y. Artzi, BERTScore: Evaluating text
generation with BERT, in: 8th International Conference on Learning Representations,
ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020, OpenReview.net, 2020. URL:
https://openreview.net/forum?id=SkeHuCVFDr.
[20] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional
transformers for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.),
Proceedings of the 2019 Conference of the North American Chapter of the Association
for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019,
Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Association for
Computational Linguistics, 2019, pp. 4171-4186. URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/v1/n19-1423.
[21] J. L. Fleiss, Measuring nominal scale agreement among many raters, Psychological Bulletin
76 (1971) 378.
[22] G. Team, R. Anil, S. Borgeaud, Y. Wu, J.-B. Alayrac, J. Yu, R. Soricut, J. Schalkwyk,
A. M. Dai, A. Hauth, et al., Gemini: A family of highly capable multimodal models, arXiv
preprint arXiv:2312.11805 (2023).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Izacard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Martinet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lacroix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Rozière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hambro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Azhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          , E. Grave, G. Lample,
          <article-title>Llama: Open and efficient foundation language models</article-title>
          ,
          <source>ArXiv abs/2302.13971</source>
          (
          <year>2023</year>
          ). URL: https://api.semanticscholar.org/CorpusID:257219404.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          , E. Buchatskaya,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Rutherford</surname>
          </string-name>
          , D. de las Casas,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Hendricks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Welbl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hennigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Noland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Millican</surname>
          </string-name>
          , G. van den Driessche,
          <string-name>
            <given-names>B.</given-names>
            <surname>Damoc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Osindero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Simonyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Elsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Rae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sifre</surname>
          </string-name>
          ,
          <article-title>An empirical analysis of compute-optimal large language model training</article-title>
          , in:
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Oh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Belgrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          ,
          <year>2022</year>
          . URL: https://openreview.net/forum?id=iBBcRUlOAPR.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chowdhery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. W.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schuh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tsvyashchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Maynez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Prabhakaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Reif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hutchinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pope</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bradbury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Austin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gur-Ari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Duke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Levskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Michalewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Robinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fedus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ippolito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zoph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Spiridonov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sepassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Omernick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Pillai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pellat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lewkowycz</surname>
          </string-name>
          , E. Moreira,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Polozov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Saeta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Diaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Firat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Catasta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Meier-Hellstern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Petrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fiedel</surname>
          </string-name>
          ,
          <article-title>PaLM: Scaling language modeling with pathways</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          <volume>24</volume>
          (
          <year>2023</year>
          )
          <fpage>240:1</fpage>
          -
          <lpage>240:113</lpage>
          . URL: http://jmlr.org/papers/v24/22-1144.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bommasani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsipras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Soylu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yasunaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cosgrove</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ré</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Acosta-Navas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Hudson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zelikman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Durmus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ladhak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Santhanam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Orr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yüksekgönül</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suzgun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. S.</given-names>
            <surname>Chatterji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Khattab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Santurkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ganguli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Icard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Y. Koreeda,
          <article-title>Holistic evaluation of language models</article-title>
          ,
          <source>CoRR abs/2211.09110</source>
          (
          <year>2022</year>
          ). URL: https://doi.org/10.48550/arXiv.2211.09110. doi:10.48550/ARXIV.2211.09110. arXiv:2211.09110.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Perez-Beltrachini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jain</surname>
          </string-name>
          , E. Monti,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lapata</surname>
          </string-name>
          ,
          <article-title>Semantic parsing for conversational question answering over knowledge graphs</article-title>
          (
          <year>2023</year>
          )
          <fpage>2499</fpage>
          -
          <lpage>2514</lpage>
          . URL: https://doi.org/10.18653/v1/2023.eacl-main.184. doi:10.18653/V1/2023.EACL-MAIN.184.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <article-title>Perplexity - a measure of the difficulty of speech recognition tasks</article-title>
          ,
          <source>The Journal of the Acoustical Society of America</source>
          <volume>62</volume>
          (
          <year>1977</year>
          )
          <fpage>S63</fpage>
          -
          <lpage>S63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Papineni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Roukos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ward</surname>
          </string-name>
          , W. Zhu,
          <article-title>Bleu: a method for automatic evaluation of machine translation</article-title>
          , in:
          <source>Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, July 6-12, 2002, Philadelphia, PA, USA</source>
          , ACL,
          <year>2002</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>318</lpage>
          . URL: https://aclanthology.org/P02-1040/. doi:10.3115/1073083.1073135.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>ROUGE: A package for automatic evaluation of summaries</article-title>
          , in:
          <source>Text Summarization Branches Out</source>
          , Association for Computational Linguistics, Barcelona, Spain,
          <year>2004</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>81</lpage>
          . URL: https://aclanthology.org/W04-1013.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <article-title>Power, value and supply chain management</article-title>
          ,
          <source>Supply Chain Management: An International Journal</source>
          <volume>4</volume>
          (
          <year>1999</year>
          )
          <fpage>167</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hildebrandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Buchner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Inzko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wernert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Weigel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Berbalk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tresp</surname>
          </string-name>
          ,
          <article-title>A knowledge graph perspective on supply chain resilience</article-title>
          , in: S. Tramp,
          <string-name>
            <given-names>R.</given-names>
            <surname>Usbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Arndt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Holze</surname>
          </string-name>
          , S. Auer (Eds.), Proceedings of the Second In-
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>