<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Human-in-the-loop Entity Set Expansion using Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Péter Kardos</string-name>
          <email>kardos@inf.u-szeged.hu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>András London</string-name>
          <email>london@inf.u-szeged.hu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richárd Farkas</string-name>
          <email>rfarkas@inf.u-szeged.hu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Szeged</institution>
          ,
          <addr-line>Szeged</addr-line>
          ,
          <country country="HU">Hungary</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>Knowledge Graphs have the ability to represent our world in intricate detail within a structure that can be interpreted by both humans and machines. They not only provide descriptions of individual entities, but also indicate their interrelationships, thus serving as an effective resource for discovering semantic similarities. In this paper, we address the task of Entity Set Expansion, where the goal is to retrieve additional entities from a Knowledge Graph, given a semantically connected input seed set of entities. We developed a graph walk-based algorithm capable of recognizing and explaining the connections found between the seed elements; it then carries out the set expansion along these connections. As a set of entities usually has multiple plausible semantic connections, the user wants to choose among them. We simulated a human-in-the-loop system that allows users to influence the output of the system. In addition, we explored two baseline approaches: a purely text-based approach using sentence transformers, and popular Large Language Models. We also proposed human-in-the-loop approaches for both baselines. We evaluated and compared all three solutions on a task derived from the KGQA LC-QuAD dataset, pointing out a huge difference in performance in favor of our graph walk-based solution. Additional experiments showed intriguing results regarding human involvement in the selection of connections within the graph walk approach and highlight the problem of the overly generalized perspective of LLMs.</p>
        <p>XAI-KG'25: 1st International Workshop on Explainable AI and Knowledge Graphs (XAI+KG), June 01-05, 2025, Portoroz, Slovenia. Workshop Proceedings.</p>
      </abstract>
      <kwd-group>
        <kwd>Entity Set Expansion</kwd>
        <kwd>XAI</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>Large Language Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Let us consider the problem where the user has a small entity set {BMW, Volkswagen, Audi} and
wants to list more elements that are semantically related. This task is called Entity Set Expansion (ESE)
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The extended set can be beneficial for various downstream tasks such as Taxonomy Construction
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or Semantic Search [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Considering the ESE example of {BMW, Volkswagen, Audi}, it is anticipated that there are
connections to car manufacturers, and to Germany as well. At this point, we would need more information
on whether the extended list should be German car brands, car brands, or German brands, which can only be
provided by the user. To ease the challenge of this ambiguity, we propose three
human-in-the-loop assisted approaches and simulate the models' ability to interact with users. Each of our
solutions can provide options for the user to choose from that help disambiguate the intended
extended set. Human feedback can be leveraged in various ways. For example, reinforcement learning
from human feedback (RLHF) has recently gained fame in the training of large language models (LLMs),
as demonstrated by systems such as ChatGPT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Knowledge Graphs (KGs) have the ability to represent and organize entities and their connections in
a structured way that is readable by both humans and machines, making them appropriate building
blocks for human-in-the-loop ESE methods. It has been shown that KGs can boost the performance
of NLP applications in tasks such as Reasoning [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] or Text Generation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. KGs such as DBPedia [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
YAGO or FreeBase reach substantial size as they contain fine details about their entities that can help
AI systems to better understand the semantic relations between concepts. ESE from KG (KG-ESE) is
the task of expanding a given set of entities by identifying and incorporating additional, contextually
relevant entities contained within a Knowledge Graph.
      </p>
      <p>The main contribution of this paper is to propose a KG-based ESE algorithm that outputs
possible human-interpretable semantic connections among entities besides the expanded entity set, thus
providing a natural way of interaction with the human user.</p>
      <p>
        Since the available ESE benchmark datasets are all corpus-based, we had to create a KG-linked ESE
task to eliminate the KG entity-linking errors. We used a KGQA dataset, namely LC-QuAD [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and ran
its SPARQL queries over DBPedia to acquire the extended sets and created the seed sets by sampling.
      </p>
      <p>We propose two other approaches for the KG-ESE task. First, we leverage sentence transformers to
generate dense vector embeddings for each entity within the KG and use vector similarity search to
expand the initial set of entities. Furthermore, we introduce both heuristic-based and human-involved
approaches for dimension weighting, aiming to enhance the refinement of the similarity metric. In our
second approach, we utilize the widely adopted GPT-4o generative AI model to address the KG-ESE task,
while also conducting experiments to evaluate the model’s capacity to explain the semantic relationships
between the provided entities. We comparatively evaluate the three approaches on the evaluation
dataset derived from LC-QuAD. In all three cases, we also simulate the human-in-the-loop use
case and present empirical results on these simulations.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Problem definition</title>
        <p>We define the Entity Set Expansion (ESE) problem as the continuation of an initial seed of entities,
by listing additional entities. For instance, given the seed {Budapest, Bucharest, Belgrade}, one
can continue with Bratislava, Zagreb, Prague if the presumed association is Eastern European
capitals. However, multiple continuations can be correct, such as capitals starting with the
letter B, leading to ambiguous results.</p>
        <p>Formally, given a seed set S = {s1, s2, ..., sn}, where each si is an entity of the knowledge graph
G = {(h, r, t) ∈ E × R × E}, where E is the set of entities and R is the set of relations, the goal is to
expand the list with additional elements, arriving at the target set T = {t1, t2, ..., tm} where S ⊆ T and ti ∈ E.</p>
        <p>We introduce a graph walk-based algorithm and evaluate its performance against traditional text
embedding-based methods as well as against a widely used large language model, GPT-4o, through
prompting.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Truncated Graph Walk</title>
        <p>Our hypothesis is that the initial seed elements have a common ancestor that we can reach by traversing
the same type of edges.</p>
        <p>We propose an algorithm that not only extends the list of seed nodes, but also provides an explanation
of how they are connected by listing paths that lead to common ancestors.</p>
        <p>This solution was developed utilizing graph walks as the central methodology. We predefined path
templates that the algorithm uses to search for common connectivity paths in the graph. We restricted
each path to take a route over the same type of edges and to arrive at the same node. We call this node
the common ancestor, or endpoint e for short. An ancestor does not necessarily have to be higher in
the hierarchy, as there may be no hierarchy in the KG.</p>
        <p>The predefined templates of length one and length two are the following:
1. Forward (F): (s →r1→ e)
2. Backward (B): (e →r1→ s)
3. Forward-Backward (F-B): (s →r1→ Y) ∧ (e →r2→ Y)
4. Forward-Forward (F-F): (s →r1→ Y) ∧ (Y →r2→ e)
5. Backward-Forward (B-F): (Y →r1→ s) ∧ (Y →r2→ e)
6. Backward-Backward (B-B): (Y →r1→ s) ∧ (e →r2→ Y)
where s ∈ S, r1, r2 ∈ R, e ∈ E, and Y = {y1, y2, ..., yℓ : yi ∈ E}. A path that matches a given template is deemed
valid if the endpoint e is reachable from all seed elements using the specified relations (r1, r2), with the
only allowed variation being among the elements of set Y. An example of the pattern F-F is shown in
Figure 1. All seed nodes can reach the endpoint by taking the same relations, differing only in the
nodes Y, which are the different Grand Prix races.</p>
        <sec id="sec-2-2-1">
          <title>2.2.1. Algorithm for matching patterns of length one</title>
          <p>For all s ∈ S we query which relation takes us to which endpoint, for example (Niki Lauda →type→ Person).
We set a counter for how many times we have seen this relation-endpoint pair. If we
iterate over all seed elements and select all pairs with a counter value of |S| (the seed size), we obtain
all length-one paths connecting the seed nodes to a common endpoint.</p>
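<p>The length-one matching can be sketched as follows (an illustrative sketch, not the authors' implementation; get_outgoing stands in for a forward triple lookup in the KG, and the Backward template would use an analogous incoming-edge lookup):</p>

```python
from collections import Counter

def match_length_one(seeds, get_outgoing):
    """Collect (relation, endpoint) pairs reachable from every seed node
    via the Forward (F) template; get_outgoing(node) yields
    (relation, endpoint) tuples for that node."""
    counter = Counter()
    for seed in seeds:
        # Each seed may only count once per (relation, endpoint) pair.
        for pair in set(get_outgoing(seed)):
            counter[pair] += 1
    # Keep the pairs seen from all seed nodes (counter value equals
    # the seed size).
    return [pair for pair, count in counter.items() if count == len(seeds)]
```

<p>For a toy seed pair sharing only a type edge, the single surviving pair is (type, Person).</p>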
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Algorithm for matching patterns of length two</title>
          <p>Finding a common endpoint without exploding the search field is one of the biggest challenges. For
example (? →birthPlace→ United States) would result in a set of millions of entries. Our goal
is to keep the number of items in memory to a minimum, and to discard as many items as possible.
Figure 2 demonstrates how the algorithm works. We start by building the path from the seed nodes,
since the endpoints are unknown at this point. As a first step, we select all the relations that have
connections to all the seed nodes. We call these common relations. Next, we iterate through the
common relations one by one and in a nested loop, the seed nodes to query the possible Y sets while
limiting the size of the results (for each seed node-relation pair) to 2000 to avoid million-sized sets. The
term ’Truncated’ refers to this limitation in the name of the algorithm. We call this set the inside points.
In the figure this can be any of the colored ellipses when a common relation is selected. From the inside
points, we do the same as in the length-one patterns, building the counter object, but the same seed
node can increment the same counter record only once. This is an indicator that the endpoint can be
reached from the seed node. The key of the counter record is the relation-endpoint pair. If we iterate
through all the seed nodes for a single common relation, we can select all the records in the counter
object that have a value of |S| (the seed size). In Figure 2 the intersection of the colored ellipses shows a path
that connects all seed nodes to the same endpoint e1 in a green box via different inside points. Since we
know which common relation was chosen, we can construct the valid paths, e.g. F-F: s →firstDriver→
Y →location→ Autodromo Enzo. The counter object is emptied for the next common relation.
For the other patterns to work, only the relation directions had to be reversed, so instead of querying
the tails, we query the heads. Due to the truncation step we might discard nodes that would provide
additional elements’ endpoints that all the seed nodes connect to, but in practice the algorithm already
finds plenty of possibilities to choose from. The pseudocode of the algorithm is given in Algorithm 1.</p>
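<p>The search for the F-F template can be sketched as follows (a simplified illustration of the truncation idea; kg is assumed to be an in-memory adjacency map rather than a SPARQL endpoint, and only the forward-forward case is shown; the other templates reverse the edge directions):</p>

```python
from collections import Counter

def truncated_ff(seeds, kg, limit=2000):
    """Sketch of the Truncated Graph Walk for the F-F template; kg maps a
    node to its list of (relation, tail) edges, and `limit` mimics the
    2000-result truncation per seed node-relation pair."""
    # Step 1: relations leaving every seed node ("common relations").
    rel_sets = [{r for r, _ in kg.get(s, [])} for s in seeds]
    common_relations = set.intersection(*rel_sets)

    paths = []
    for r1 in common_relations:
        counter = Counter()
        for s in seeds:
            # The truncated set of inside points Y for this seed.
            inside = [t for r, t in kg.get(s, []) if r == r1][:limit]
            seen = set()  # a seed increments each record at most once
            for y in inside:
                for r2, e in kg.get(y, []):
                    seen.add((r2, e))
            for key in seen:
                counter[key] += 1
        # Keep endpoints reachable from every seed via r1 then r2.
        for (r2, e), count in counter.items():
            if count == len(seeds):
                paths.append((r1, r2, e))
    return paths
```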
          <p>It is possible to define templates of length three, but the size of DBPedia would most definitely explode
the search space making it infeasible.</p>
          <p>By running this algorithm, the output will contain multiple paths connecting the seed nodes to an
endpoint. To list additional entities, the seed nodes must be masked out from the paths, and a SPARQL
query run that responds with all the nodes that fit the paths. It is a matter of preference whether all
found paths should be unified along an ∧ operator, making the prediction set very strict. In most cases,
this option will result in only the seed nodes matching the query. The other option is to sample from
the queries based on a heuristic to get the final answer.</p>
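<p>Masking the seed position and querying the KG can be sketched as follows (the prefixed names and the query shape are illustrative assumptions, not the exact queries used in the paper; a found F-F path is represented as an (r1, r2, endpoint) triple):</p>

```python
def path_to_sparql(path):
    """Build a SPARQL query for a found F-F path (r1, r2, endpoint):
    the masked seed position becomes the variable ?x, and the returned
    bindings are the expansion candidates."""
    r1, r2, endpoint = path
    return ("SELECT DISTINCT ?x WHERE { "
            f"?x {r1} ?y . ?y {r2} {endpoint} . }}")
```

<p>Unifying several found paths along the ∧ operator corresponds to adding their triple patterns to the same WHERE clause.</p>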
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Simulating the human use case</title>
          <p>To address the ambiguity problem and enhance interpretability, we require a well-formulated connection
representing all entities in the expanded set, reflecting the user’s vision. While our method can be
applied to any task with human involvement, we simulate the user in this study by using the gold query
from the evaluation dataset.</p>
          <p>We evaluate the Truncated Graph Walk method in a human-in-the-loop setup. In this system, three
alternative queries are presented to the user, who selects one based on personal preference. To simplify
decision-making, an ideal system should limit the number of alternatives while ensuring maximum
coverage of the query space. We propose a heuristic for selecting a small but diverse set of alternatives
by ranking predicted query paths based on the size of their result sets. From this ranking, we select
three queries and allow the user to choose the one that best fits their reasoning. In our simulations, we
assume the user selects the query with the highest F1 score relative to the gold standard.</p>
          <p>The heuristic on which we base our query selection picks the three alternatives as follows:
1. Select the query with the smallest result set
2. Discard any query with more than 100 results and select the one whose size is closest to the average
3. Select the query closest to 100 results, but not more than 100
Although selecting queries based on their graph structure is more intuitive for users, we chose to
present options based on result sets for simplicity, as both approaches are interchangeable.</p>
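<p>The three rules can be sketched as follows (a simplified sketch; ties and inputs with fewer than three candidate queries are not handled):</p>

```python
def select_three(queries, limit=100):
    """Pick three alternative queries by result-set size, following the
    three rules from the text; queries is a list of
    (path, result_set) pairs."""
    ranked = sorted(queries, key=lambda q: len(q[1]))
    # Rule 1: the query with the smallest result set.
    smallest = ranked[0]
    # Rules 2 and 3 only consider queries with at most `limit` results.
    capped = [q for q in ranked if limit >= len(q[1])]
    avg = sum(len(q[1]) for q in capped) / len(capped)
    # Rule 2: the capped query whose size is closest to the average.
    middle = min(capped, key=lambda q: abs(len(q[1]) - avg))
    # Rule 3: the query closest to `limit` results without exceeding it.
    largest = max(capped, key=lambda q: len(q[1]))
    return smallest, middle, largest
```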
        </sec>
        <sec id="sec-2-2-4">
          <title>2.2.4. Flaws of the method</title>
          <p>In cases where the user wants to list items that do not have a connection in the knowledge graph (KG),
such as all cities starting with ’B’ in DBPedia, the algorithm cannot reconstruct the full set. While the
nodes exist in the KG, the required connections are missing. However, since all our seed and gold sets
are based on DBPedia queries, this issue is not a factor in our dataset.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Textual semantics-based ESE</title>
        <p>
          A baseline solution, against which we compare the proposed truncated graph walk method, is to expand the seed set
based on textual semantic similarities. We transform the texts into a semantic vector space using an
embedder; then all operations are performed in the vector space. We utilize the SentenceBERT model
all-MiniLM-L6-v2 [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to obtain the embeddings for each entity of DBPedia using the string representation
of the nodes, which adds up to 7 million embedding vectors.
        </p>
        <sec id="sec-2-3-1">
          <title>2.3.1. Embedding</title>
          <p>Given a seed set, we assume that the correct expansion elements should be close to the seed elements
in the vector space. We take the mean of the seed vectors (we call it the center vector). Imagine a sphere
with its center at the center vector and its radius equal to the Euclidean distance between the center vector
and the most distant seed vector. The ESE prediction of this algorithm is the set of all DBPedia nodes
whose vector falls within this sphere.</p>
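<p>The sphere-based expansion can be sketched in a few lines of pure Python (a minimal illustration; over the 7 million DBPedia vectors an approximate nearest-neighbour index would be used in practice):</p>

```python
from math import dist  # Euclidean distance (Python 3.8+)

def sphere_expand(seed_vectors, all_entities):
    """Expand the seed set to every entity whose vector lies inside the
    sphere centred on the seed mean, with radius equal to the distance
    to the farthest seed vector; all_entities maps name to vector."""
    n = len(seed_vectors)
    d = len(seed_vectors[0])
    center = [sum(v[i] for v in seed_vectors) / n for i in range(d)]
    radius = max(dist(center, v) for v in seed_vectors)
    return {name for name, vec in all_entities.items()
            if radius >= dist(center, vec)}
```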
        </sec>
        <sec id="sec-2-3-2">
          <title>2.3.2. STDev-Embedding</title>
          <p>We also consider the assumption that the dimensions of the vector space should have different importance
values when calculating the distances for different input seed sets. For example, in the case of car
manufacturers versus German brands, the importance of various dimensions in the vector space shall
differ, since a seed set can fit into both groups, but the whole expanded sets have slightly different
semantic meanings. To assign weights to the dimensions, we use the standard deviation of each dimension
over the seed vectors. Sorting by this value, we select the k least spread-out dimensions and discard all
others. Based on our assumption, the seed elements shall already be close in the vector space; therefore
other expansion elements shall align along these dimensions as well. We apply the Embedding approach
to ESE as described above over this new embedding space.</p>
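<p>The dimension selection can be sketched as follows (illustrative; k is the number of retained dimensions, e.g. 100 for the TOP100 variant):</p>

```python
from statistics import pstdev

def stdev_select(seed_vectors, k):
    """Indices of the k dimensions with the smallest standard deviation
    across the seed vectors; the remaining dimensions are discarded
    before applying the sphere-based expansion."""
    d = len(seed_vectors[0])
    spreads = [(pstdev(v[i] for v in seed_vectors), i) for i in range(d)]
    return sorted(i for _, i in sorted(spreads)[:k])
```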
        </sec>
        <sec id="sec-2-3-3">
          <title>2.3.3. Simulating human use case</title>
          <p>We propose a human-in-the-loop approach for the embedding-based ESE method, where the user labels
a small set of entities as correct/incorrect expansions to modify the dimensional weights.</p>
          <p>
            First, we compute the center vector from the seed entities. The user then labels the 5 closest unlabeled
entities to the center vector (from all DBPedia entities) using weighted Euclidean distance, with initial
weights set to 1. We set up a machine learning loop to adjust the Euclidean distance weights, where
seed and positively labeled entities are considered correct, and negatively labeled entities are incorrect.
We generate triples in the form of [center, correct, incorrect] from the labeled set. For each triple, the
model computes the weighted Euclidean distance between the center-correct and center-incorrect pairs,
then calculates the triplet loss. We use the Adam optimizer [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] in PyTorch with a learning rate of 0.75.
The weights are constrained within [0, 1]. The labeling and weight training loop is repeated three
times. Finally, we use the final weights to select the most distant correct entity from the center vector
and expand the correct set with all DBPedia entities closer than this entity. A high learning rate ensures
weights can reach zero if a dimension is deemed unnecessary. For experimentation, we simulate human
labeling with an oracle indicating if the entity is part of the gold standard set.
          </p>
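<p>The weight update can be illustrated with a single simplified step (the paper uses triplet loss with PyTorch's Adam optimizer; this pure-Python sketch assumes a squared weighted Euclidean distance and a plain gradient step, keeping only the [0, 1] clipping and the 0.75 learning rate from the text):</p>

```python
def weighted_sq_dist(w, a, b):
    """Squared Euclidean distance with per-dimension weights."""
    return sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))

def triplet_step(w, center, pos, neg, lr=0.75, margin=1.0):
    """One gradient step on the triplet loss
    max(0, d(center, pos) - d(center, neg) + margin)
    over the weights w, clipped to [0, 1]."""
    loss = (weighted_sq_dist(w, center, pos)
            - weighted_sq_dist(w, center, neg) + margin)
    if loss > 0:
        # dLoss/dw_i = (c_i - p_i)^2 - (c_i - n_i)^2
        grad = [(c - p) ** 2 - (c - n) ** 2
                for c, p, n in zip(center, pos, neg)]
        w = [min(1.0, max(0.0, wi - lr * g)) for wi, g in zip(w, grad)]
    return w
```

<p>With a high learning rate, a dimension that separates the incorrect entity is quickly driven to zero weight, as described above.</p>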
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Large Language Models for the ESE task</title>
        <p>Generative Large Language Models (LLMs) like GPT-4 have recently gained popularity. We compare our
truncated graph walk method to ChatGPT by prompting GPT-4o (2024-08-06) to evaluate its performance
on the ESE task. We conducted two few-shot learning experiments under the assumption that DBpedia
was part of GPT-4o’s training data, without providing any additional information about the knowledge
graph.</p>
        <p>In the first experiment, we asked GPT-4 to generate possible explanations for connections between
seed elements using DBpedia paths. During testing, GPT-4 occasionally refused to provide answers,
citing a lack of database access. Its responses included only 5-6 triples, but the accuracy of gold-standard
connections improved as more triples were generated. To improve the model’s performance, we refined
the prompt iteratively. The final prompt is available in Appendix A. In the second experiment, our
goal was to expand the seed set by generating related entities. We included a clarifying question in the
prompt to help GPT-4 identify the relationships between the seed elements. The prompt is available in
Appendix B.</p>
        <p>While the model’s structured JSON responses are parsable, there were inconsistencies in formatting
(e.g., variations/abbreviations in entity names). To automate evaluation, we discarded the URL
component of each entity and used the Levenshtein distance to find the closest match in the gold standard.
Entities were considered equivalent if the Levenshtein similarity exceeded 80%. This process was also
applied to matching the connection triples. However, this method may lead to false positives or false
negatives; we set the threshold to 80% based on manual investigation and to ensure a roughly equal
number of false positives and false negatives.</p>
        <p>The implementation for dataset creation and methods is available on GitHub: https://github.com/kiscsonti/ContinueTheList</p>
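<p>The matching can be sketched as follows (a standard Levenshtein implementation; the 80% threshold follows the text, while the helper names are illustrative):</p>

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def similar(a, b, threshold=0.8):
    """Two names are equivalent if Levenshtein similarity reaches 80%."""
    if not a or not b:
        return a == b
    return 1 - levenshtein(a, b) / max(len(a), len(b)) >= threshold

def match_to_gold(entity, gold_entities):
    """Closest gold entity, or None when below the similarity threshold."""
    best = min(gold_entities, key=lambda g: levenshtein(entity, g))
    return best if similar(entity, best) else None
```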
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and Discussion</title>
      <sec id="sec-3-1">
        <title>3.1. Experimental setup</title>
        <p>
          We utilize the LC-QuAD [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] dataset which contains 5000 records of natural language questions with
the corresponding SPARQL queries over the DBPedia [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] 2016-04 version as a target KG. By running
the SPARQL queries we got the gold set of responses (entities) for each question. From these gold sets
we kept all the records that had a multi-element response with more than 8 and fewer than 100 elements,
simulating ESE use cases. While querying DBPedia, we discarded all the non-URL-like nodes,
since DBpedia can be noisy and queries can return random literals like '181', or both 'Memoir' and
'http://dbpedia.org/resource/Memoir'. The final dataset with the filters applied yields 310 {English
question, SPARQL query, response set of entities} tuples. We randomly sampled {4, 6, 8}-sized initial
seed sets from each response set and considered a solution perfect if all the gold responses are predicted
by the method starting from a seed set. The main evaluation metrics are Precision-Recall-F1 (PRF)
scores between the predicted and gold entity sets.
        </p>
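<p>The set-based PRF computation is the standard one; as a minimal sketch:</p>

```python
def prf(predicted, gold):
    """Precision, recall and F1 between predicted and gold entity sets."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted.intersection(gold))
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```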
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Results</title>
        <p>Table 1 presents a comparison of the performance of the different algorithms. Among the
embedding-based approaches, STDev demonstrated the highest performance; therefore we only put those numbers in
the table. A comparison of embedding-based approaches is provided in Table 3. The Truncated Graph
Walk algorithm outperforms all the other methods. Increasing the seed sample size results in a small
gain in the F1 score. The embedding-based methods finished second, even losing performance when
the seed size is larger.</p>
        <p>The Truncated Graph Walk is on par in speed with the STDev method at around 14 hours, even
when using a KG as large as DBPedia. GPT-4o performed the worst on the task while being the fastest
method, only taking 1 hour to query the responses.</p>
        <p>[Table 1: Precision-Recall-F1 comparison of TruncatedGraphWalk, STDev-Embedding and GPT-4o for seed sample sizes 4, 6 and 8.]</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Discussion on Truncated Graph Walk</title>
        <p>The Truncated Graph Walk method offers the advantage of providing human-readable interpretations
of expansions by visualizing or verbalizing the graph paths. This allows us to translate graph paths
into queries. We evaluated the algorithm’s ability to recognize user queries from the original LC-QuAD
dataset. We assessed how many gold queries our method can identify without the human-in-the-loop
setup, considering all paths connecting the seed set to an endpoint. The percentage of exact matches,
where the method successfully identifies the gold SPARQL query, is shown in the ExactMatch column.
We observed that if two queries have the same result set size, their elements completely overlap. For
example, the queries {Skull Gang →currentMembers→ ?} and {Skull Gang →bandMember→ ?} produce the same result set.
This holds for longer paths as well:
{Y →property/layout→ ?, Y →ontology/designer→ Pininfarina} and {Y →property/layout→ ?, Y
→ontology/class→ Light commercial vehicle}.
(We ran GPT-4o only with a seed size of 4 because of cost issues.)</p>
        <p>Since multiple queries can yield the same response entity set, we compared queries based on their
response entity sets, not the graph paths. A query was considered an exact match if its node set
completely overlapped with the gold set. Using this approach, we found that approximately 1 in 4
(27.9% for a seed size of 4) gold queries were missing from the pool of the found query options. Since
the user is only presented with 3 options in the human-in-the-loop setup, the realistic match rate for
exact gold queries is even lower. To increase exact matches, combining the identified paths using an
∧ operator is necessary, as increasing the seed size has a minimal effect.</p>
        <p>Table 2 summarizes the ranking metrics for exactly correct queries. We excluded rows without an
exact gold query match and sorted by result size. The ranking metrics used were Mean Rank (MR),
Mean Reciprocal Rank (MRR), and Hits at 1, 5 and 10. Results show that half of the gold queries are ranked
first, with this improving to 63% with a seed size of 8. However, increasing the hit size revealed that
only 75.5% of gold queries appear in the top 10.</p>
        <p>We also compared how similar the queries selected in the human-in-the-loop setup are. By matching
their triples to the gold set, we found overlaps of 42%, 46.2%, and 47% for seed sizes 4, 6, and 8, respectively.
Despite achieving a higher F1 score, less than half of the gold triples are present in the selected queries,
suggesting that very similar node sets can be generated from different queries.</p>
        <sec id="sec-3-3-1">
          <title>Ranking metrics</title>
          <p>[Table 2: MR, MRR and Hits@1/5/10 ranking metrics for seed sizes 4, 6 and 8.]</p>
          <p>To conclude, our solution successfully reproduced the gold set in 72% of all examples, meaning that
among the paths connecting the seed elements, there was one that generated the gold set. Each of
these paths can be expressed as a natural language sentence, three of which were presented to the user.
Thanks to the readability of these sentences, the user can easily interpret the common factor connecting
the seed elements and select the one that best aligns with their own vision (gold set). For example, the
path mentioned at the beginning of this section is ”Current members of Skull Gang”. Since the number
of examples presented is limited, it is possible that the path that generates the gold set is missing from
the connections shown. However, the users are still able to choose the path that most closely matches
their view. This explains the low (40-50%) values observed with the matched triples.</p>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Discussion on the Embedding-based methods</title>
        <p>We report on three embedding-based variants. First, we have No Annotation, where all the vector space
dimensions are used as is with plain Euclidean distance, without any trimming. Second, we have
TOP100, where the vectors are truncated using the STDev embedding method. Finally, we used the
Iterative process as explained in Section 2.3.3.</p>
        <p>The results are summarized in Table 3. The No Annotation solution produces on average 100
times more nodes than the gold set, resulting in high recall but poor precision. When we truncated to
the top 100 dimensions, it performed the best of all three. It is surprising that the dimension selection
heuristic could outperform the iterative variant, where even additional positive examples become
available during the annotation steps; however, more than 95% of the annotated nodes were negatives. In
addition, the metrics vary greatly as the seed size increases. With more samples, we noticed that the
results dropped sharply. This can be explained by the vectors becoming less distinguishable (more
noisy) by their individual dimensions (e.g.: standard deviation). We can see the biggest drop with the
iterative approach.</p>
        <p>As for the running times, the iterative process took the longest. This is due to the constant search for
the nearest neighbors during the annotation loops. Since the vector space is constantly changing due to
the learned weights, the use of VectorDBs is not an option, as their structures have to be rebuilt in each
iteration, which has a huge overhead.</p>
        <p>[Table 3: comparison of the Iterative, No Annotation and STDev-Top100 embedding variants for seed sizes 4, 6 and 8.]</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Discussion on LLM for ESE</title>
        <p>For prompt evaluations, we utilized only the seed set consisting of four elements. In the explanation
generation task, we were able to match 25.1% of the gold triples to at least one response triple, which is
directly comparable to the 42% match rate of the Truncated Graph Walk (see section 3.3). This does
not necessarily imply that GPT-4o lacks knowledge about the gold connections, but rather that it is
challenging to extract the desired information. To gain deeper insight into the model’s responses, we
manually examined some outputs and found that the model tends to prioritize more general or common
connections over specific ones. Since the gold connections in the dataset are relatively specific, this
preference likely explains the observed performance.</p>
        <p>In the element generation task, the model’s performance was the lowest of the three methods tested,
even though the input prompts explicitly specified the exact connection between elements. A major
factor contributing to this performance was the model’s output limitation in terms of the number of
entities generated. The model typically produced between 10 and 30 entities, a limitation that we were
unable to overcome through prompt engineering. Furthermore, the entities listed were not always
accurate. For example, when asked to list players from the 2016 NBA team, the Phoenix Suns, the
model occasionally included current players who were not on the team at the time, or players who had
previously been on the team. In most cases the response entities matched the desired group in some
aspects, but not the correct one. In Appendix C, we provide an example across the three methods.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Related work</title>
      <sec id="sec-4-1">
        <title>4.1. Knowledge Graph-based ESE</title>
        <p>The first KG-based ESE solution, SEED [ 11], uses an earlier version of DBpedia (half the size of ours).
SEED addresses KG incompleteness by identifying relaxed paths where not all seed elements need to
be connected. It employs a probabilistic approach for path discovery and ranks entities for expansion.
In contrast, our method directly ranks common aspects and involves a human-in-the-loop annotation
process, requiring all seed elements to be present in identified paths. MetaPath [ 12] also builds trees
from seed nodes and expands them along KG edges, ranking elements based on their frequency along
these paths, similar to SEED. CoMeSe++ [13] combines KG random walks with textual embeddings
from word2vec to expand entity sets. It uses the YAGO and DBpedia datasets, but we were unable to obtain
its implementation for comparison.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Corpus-based ESE</title>
        <p>
          Corpus-based ESE has traditionally relied on web data, with early approaches like Google Sets [14]
and SEAL [15] using textual sources. More recent methods use subsets of Wikipedia and the APR
dataset. SetExpan [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] introduces 40 queries and 41,000 entities, while our dataset includes 310 queries
and over 7 million entities. SetExpander [16, 17] uses classifiers with static textual embeddings to
predict entities belonging to the seed set. However, it struggles with multi-word expressions, unlike
our sentence-transformers, which handle such cases better. Modern approaches like FUSE [18] address
ambiguity within seed sets by using BERT’s Masked Language Modeling task and skip-gram models for
predicting new elements. FGExpan [19] leverages a three-layered taxonomy structure as an auxiliary
resource to identify the types of seed entities. By employing multiple scoring techniques, including an
MLM and an NLI task, it expands the input set by associating entity types with corresponding elements
in the taxonomy. This approach is quite promising for our purposes, but its effectiveness is constrained by the
taxonomy's limited size of only 28 classes.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Other Related Works</title>
        <p>Corpus-independent methods leverage pre-trained large language models, bypassing the need for
external corpora. For example, CGExpan [20] uses a fill-in-the-blank task to identify and rank potential
connections between seed elements. GenExpan [21] uses GPT-2 for ESE, expanding sets through
multiple fill-in-the-blank tasks, with an approach different from our GPT-4o workflow, which offers
token probability distributions for expansions.</p>
        <p>The primary focus of the most popular ESE datasets is on named entities. Gajbhiye et al. [22] created
a dataset based on ConceptNet and ChatGPT, focusing on more general concepts like advertisements
or accommodation. These approaches address domain variety, unlike previous benchmarks based on
Wiki/News data. Our LC-QuAD-based dataset also emphasizes named entities, and expanding the
Question to SPARQL dataset could address domain variety in KG-ESE.</p>
        <p>MArBLE [23] highlights the variety of available models and notes that their performance varies
depending on the seed set. It employs a human-in-the-loop setup to assess which model best
represents the current expansion set and incorporates suggestions from that model. Since sentence
transformers are likewise numerous, their method could be fused with our embedding-based
solution; however, this would result in much higher running times.</p>
        <p>Finally, similar tasks like MetaQA [24], ComplexWebQuestions [25], and QALD [26] could serve as
alternatives for datasets similar to ours.</p>
        <p>No prior work was identified that compares generative models with KG-based ESE solutions.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we proposed a Truncated Graph Walk algorithm for the Knowledge Graph-based Entity
Set Expansion problem. To evaluate the methods for this task, we modified LC-QuAD, a Knowledge
Graph Question Answering (KGQA) dataset, and comparatively evaluated three distinct approaches
based on different underlying principles. The proposed graph walk-based algorithm uses templates to
identify common paths in the graph that connect all seed nodes to a shared endpoint. A second approach
involved embedding the KG entities in a semantic vector space using sentence transformers, followed
by experiments with dimension selection and iterative weighting techniques. The third approach
leveraged few-shot learning through GPT-4o prompting. All approaches were designed to facilitate the
integration of human input into the decision-making process, ensuring a common basis for evaluation
by simulating human behavior.</p>
      <p>The experimental results show that the graph walk-based method outperformed the other approaches,
while additionally providing explanations in the form of graph paths that are interpretable by humans. In
contrast, GPT-4o struggled to identify the correct connections (i.e., quasi-explanations) between the
seed nodes, as shown by response elements that were semantically close but not entirely accurate.</p>
      <p>For future directions, we plan to apply these methods to other datasets and to evaluate a KG-RAG-based
GenAI solution.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research has been supported by the European Union project RRF-2.3.1-21-2022-00004 within the
framework of the Artificial Intelligence National Laboratory.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT for grammar and spelling
checking. After using this tool/service, the authors reviewed and edited the content as needed and take
full responsibility for the publication's content.</p>
      <p>[11] J. Chen, Y. Chen, X. Zhang, X. Du, K. Wang, J.-R. Wen, Entity set expansion with semantic features
of knowledge graphs, Journal of Web Semantics 52-53 (2018) 33–44. doi:10.1016/j.websem.2018.09.001.
[12] Y. Zheng, C. Shi, X. Cao, X. Li, B. Wu, Entity set expansion with meta path in knowledge graph,
in: Advances in Knowledge Discovery and Data Mining, Springer International Publishing, 2017,
pp. 317–329.
[13] C. Shi, J. Ding, X. Cao, L. Hu, B. Wu, X. Li, Entity set expansion in knowledge graph: a
heterogeneous information network perspective, Frontiers of Computer Science 15 (2020) 151307.
doi:10.1007/s11704-020-9240-8.
[14] S. Tong, J. Dean, System and methods for automatically creating lists, 2008.
[15] R. C. Wang, W. W. Cohen, Language-independent set expansion of named entities using the
web, in: Seventh IEEE International Conference on Data Mining (ICDM 2007), 2007, pp. 342–350.
doi:10.1109/ICDM.2007.104.
[16] J. Mamou, O. Pereg, M. Wasserblat, I. Dagan, Y. Goldberg, A. Eirew, Y. Green, S. Guskin,
P. Izsak, D. Korat, SetExpander: End-to-end term set expansion based on multi-context term
embeddings, in: Proceedings of the 27th International Conference on Computational Linguistics:
System Demonstrations, Association for Computational Linguistics, 2018, pp. 58–62. URL:
https://aclanthology.org/C18-2013.
[17] J. Mamou, O. Pereg, M. Wasserblat, A. Eirew, Y. Green, S. Guskin, P. Izsak, D. Korat, Term set
expansion based NLP architect by Intel AI lab, in: Proceedings of the 2018 Conference on Empirical
Methods in Natural Language Processing: System Demonstrations, Association for Computational
Linguistics, 2018, pp. 19–24. URL: https://aclanthology.org/D18-2004. doi:10.18653/v1/D18-2004.
[18] W. Zhu, H. Gong, J. Shen, C. Zhang, J. Shang, S. Bhat, J. Han, FUSE: multi-faceted set expansion
by coherent clustering of skip-grams, CoRR abs/1910.04345 (2019). URL: http://arxiv.org/abs/1910.04345.
[19] J. Xiao, M. Elkaref, N. Herr, G. D. Mel, J. Han, Taxonomy-Guided Fine-Grained Entity Set
Expansion, ????, pp. 631–639. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611977653.ch71.
doi:10.1137/1.9781611977653.ch71.
[20] Y. Zhang, J. Shen, J. Shang, J. Han, Empower entity set expansion via language model probing,
in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics,
Association for Computational Linguistics, Online, 2020, pp. 8151–8160. doi:10.18653/v1/2020.acl-main.725.
[21] S. Huang, S. Ma, Y. Li, Y. Li, Y. Jiang, H.-T. Zheng, Y. Shen, From retrieval to generation: Efficient
and effective entity set expansion, 2023. arXiv:2304.03531.
[22] A. Gajbhiye, Z. Bouraoui, N. Li, U. Chatterjee, L. Espinosa-Anke, S. Schockaert, What do deck
chairs and sun hats have in common? uncovering shared properties in large concept vocabularies,
in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing,
Association for Computational Linguistics, Singapore, 2023, pp. 10587–10596. URL:
https://aclanthology.org/2023.emnlp-main.654. doi:10.18653/v1/2023.emnlp-main.654.
[23] M. Wahed, D. Gruhl, I. Lourentzou, Marble: Hierarchical multi-armed bandits for human-in-the-loop
set expansion, in: Proceedings of the 32nd ACM International Conference on Information and
Knowledge Management, CIKM '23, Association for Computing Machinery, 2023, pp. 4857–4863.
URL: https://doi.org/10.1145/3583780.3615485. doi:10.1145/3583780.3615485.
[24] T. Gao, P. Fodor, M. Kifer, Querying knowledge via multi-hop english questions, CoRR
abs/1907.08176 (2019). URL: http://arxiv.org/abs/1907.08176.
[25] A. Talmor, J. Berant, The web as a knowledge-base for answering complex questions, in:
North American Chapter of the Association for Computational Linguistics, 2018. URL:
https://api.semanticscholar.org/CorpusID:3986974.
[26] A. Perevalov, D. Diefenbach, R. Usbeck, A. Both, QALD-9-plus: A multilingual dataset for question
answering over DBpedia and Wikidata translated by native speakers, in: 2022 IEEE 16th
International Conference on Semantic Computing (ICSC), 2022, pp. 229–234. doi:10.1109/ICSC52841.2022.00045.</p>
    </sec>
    <sec id="sec-8">
      <title>A. Connection listing prompt</title>
      <p>Given a seed set of 4 nodes from a DBpedia 2016 dump, list all the
connections that connect the seed elements so that it can be used
to list additional similar nodes. Even though you do not have
access to the database, try to answer with a formatted JSON
listing the triplets and nothing else! We list some examples with
only a few connections but list as many connections as you can.</p>
      <p>Make the list at least 20 long!
&lt;EXAMPLE 1&gt;
Seed elements: &lt;&lt;ELEMENTS&gt;&gt;
JSON: &lt;&lt;TRIPLES&gt;&gt;
&lt;/EXAMPLE 1&gt;
&lt;EXAMPLE 2&gt;
Seed elements: &lt;&lt;ELEMENTS&gt;&gt;
JSON: &lt;&lt;TRIPLES&gt;&gt;
&lt;/EXAMPLE 2&gt;
Seed elements:</p>
    </sec>
    <sec id="sec-9">
      <title>B. Entity listing prompt</title>
      <p>TASK: List all the entities that fulfill the stated question as if
the year was 2016! To help you in the task we provide 4 example
entities that are correct answers from DBPedia. We can also
guarantee that the number of correct entities are between 8 and
100 and all of them are present in DBPedia 2016 version. Answer
in a JSON list format with all the entities that are correct
answers for the question!
QUESTION: &lt;&lt;QUESTION&gt;&gt;
EXAMPLES: &lt;&lt;SEED_SET&gt;&gt;
JSON:</p>
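Both prompt templates above rely on the same placeholder convention (&lt;&lt;QUESTION&gt;&gt;, &lt;&lt;SEED_SET&gt;&gt;, &lt;&lt;ELEMENTS&gt;&gt;, &lt;&lt;TRIPLES&gt;&gt;). A minimal sketch of how such slots might be filled before sending the prompt to the model (the helper and its names are illustrative, not the authors' pipeline):

```python
# Abbreviated stand-in for the entity-listing template from Appendix B.
ENTITY_PROMPT = (
    "TASK: List all the entities that fulfill the stated question as if "
    "the year was 2016! Answer in a JSON list format!\n"
    "QUESTION: <<QUESTION>>\n"
    "EXAMPLES: <<SEED_SET>>\n"
    "JSON:"
)

def fill_prompt(template, **slots):
    """Replace each <<NAME>> placeholder with its provided value."""
    for name, value in slots.items():
        template = template.replace(f"<<{name}>>", value)
    return template

prompt = fill_prompt(
    ENTITY_PROMPT,
    QUESTION="Name some comic characters created by Bruce Timm?",
    SEED_SET='["Livewire_(DC_Comics)", "Roland_Daggett"]',
)
print("<<" in prompt)  # False: all placeholders were filled
```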
    </sec>
    <sec id="sec-10">
      <title>C. Examples</title>
      <p>Seed:
Livewire_(DC_Comics)
Roland_Daggett
Roxy_Rocket
The_New_Batman_Adventures</p>
      <p>Example of the three methods for the question "Name some comic characters created by Bruce
Timm?". We use ⊕ to indicate positive elements for GPT-4o.</p>
      <p>?uri creator Bruce_Timm
?x type Person
?x appearance The_New_Batman_Adventures
?x affiliation Roxy_Rocket
?x villainIn Livewire_(DC_Comics)
?x rival ?seed
?x voiceActor Mark_Hamill
Batman enemyOf ?x
Bruce_Wayne hasSecretIdentity ?x
TruncatedGraphWalk:
Roland_Daggett enemyOf ?seed</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ren</surname>
          </string-name>
          , J. Han,
          <article-title>Setexpan: Corpus-based set expansion via context feature selection and rank ensemble</article-title>
          ,
          <source>in: Machine Learning and Knowledge Discovery in Databases</source>
          , Springer International Publishing,
          <year>2017</year>
          , pp.
          <fpage>288</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Vanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Sadler</surname>
          </string-name>
          , J. Han,
          <article-title>Hiexpan: Task-guided taxonomy construction by hierarchical tree expansion</article-title>
          ,
          <source>in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD '18</source>
          ,
          Association for Computing Machinery,
          <year>2018</year>
          , pp.
          <fpage>2180</fpage>
          -
          <lpage>2189</lpage>
          . doi:10.1145/3219819.3220115.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sinha</surname>
          </string-name>
          , J. Han,
          <article-title>Entity set search of scientific literature: An unsupervised ranking approach</article-title>
          ,
          <source>in: The 41st International ACM SIGIR Conference on Research &amp; Development in Information Retrieval</source>
          , SIGIR '18,
          Association for Computing Machinery,
          <year>2018</year>
          , pp.
          <fpage>565</fpage>
          -
          <lpage>574</lpage>
          . doi:10.1145/3209978.3210055.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <article-title>Chatgpt: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope</article-title>
          ,
          <source>Internet of Things and Cyber-Physical Systems 3</source>
          (
          <year>2023</year>
          )
          <fpage>121</fpage>
          -
          <lpage>154</lpage>
          . doi:10.1016/j.iotcps.2023.04.003.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Yoran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolfson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bogin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Katz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Deutch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Berant</surname>
          </string-name>
          ,
          <article-title>Answering questions by metareasoning over multiple chains of thought</article-title>
          ,
          <source>in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>5942</fpage>
          -
          <lpage>5966</lpage>
          . doi:10.18653/v1/2023.emnlp-main.364.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Logan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Barack's wife hillary: Using knowledge graphs for fact-aware language modeling</article-title>
          ,
          <source>in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>5962</fpage>
          -
          <lpage>5971</lpage>
          . doi:10.18653/v1/P19-1598.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          , G. Kobilarov,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ives</surname>
          </string-name>
          ,
          <article-title>Dbpedia: a nucleus for a web of open data</article-title>
          ,
          <source>in: Proceedings of the 6th International The Semantic Web and 2nd Asian Conference on Asian Semantic Web Conference</source>
          , Springer-Verlag,
          <year>2007</year>
          , p.
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Trivedi</surname>
          </string-name>
          , G. Maheshwari,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dubey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <article-title>Lc-quad: A corpus for complex question answering over knowledge graphs</article-title>
          ,
          <source>in: The Semantic Web - ISWC 2017</source>
          , Springer International Publishing,
          <year>2017</year>
          , pp.
          <fpage>210</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-bert: Sentence embeddings using siamese bert-networks</article-title>
          ,
          <source>CoRR abs/1908.10084</source>
          (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1908.10084.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Adam: A method for stochastic optimization</article-title>
          ,
          <source>CoRR abs/1412</source>
          .6980 (
          <year>2014</year>
          ). URL: https://api.semanticscholar.org/CorpusID:6628106.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>