<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Knowledge Graph Entity Linking via Interactive Reasoning and Exploration with GRASP</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sebastian Walter</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hannah Bast</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Freiburg</institution>
          ,
          <addr-line>Georges-Köhler-Allee 51, 79110 Freiburg im Breisgau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Entity linking is the task of linking mentions of entities from a given knowledge graph in a given text. We present a new approach for entity linking built on GRASP, a zero-shot method originally developed for translating natural-language questions to SPARQL queries for a given knowledge graph. We evaluate our approach on the cell entity annotation task of the MammoTab track of the SemTab 2025 challenge and achieve an F1-score of 75.8%, ranking first among all participants.</p>
      </abstract>
      <kwd-group>
        <kwd>Entity Linking</kwd>
        <kwd>Large Language Model</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Cell Entity Annotation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>Contributions</title>
        <p>[Figure 1: Exemplary annotation trace for the CEA task. The agent is instructed to annotate all columns of row 2 of the given table with entities from the available knowledge graphs, with existing annotations shown in parentheses after the cell values. It alternates between reasoning and function calls: it annotates cell (2, 0) with wd:Q79859 ("2014 FIFA World Cup") via the annotate function, and then searches for the entity for "Group E" of the 2014 FIFA World Cup by listing the triples of wd:Q79859 (724 rows, e.g. instance of (wdt:P31), country (wdt:P17) Brazil (wd:Q155), has part(s) (wdt:P527) 2014 FIFA World Cup knockout stage).]</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Approach</title>
      <p>
        Our approach is based on GRASP, which equips an LLM with a set of functions to search and query
knowledge graphs in an interactive fashion. Originally, GRASP was developed for SPARQL QA
(translating natural-language questions to SPARQL queries on a knowledge graph); see [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for the details of
the method, and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for an implementation. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the method was extended to support general QA and
follow-up questions. In this work, we extend it to entity linking.
      </p>
      <p>GRASP provides the following core functions, which allow the LLM to interactively explore knowledge
graphs: EXE (execute an arbitrary SPARQL query), LST (list triples with given constraints), SEN (search
for entities matching a given query string), SPR (search for properties matching a given query string),
SPE (search for properties of a given entity), SOP (search for objects of a given property), SCN (search
for items given triple constraints), and SAC (search for items given a constraining SPARQL query).</p>
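      <p>To make the interaction model concrete, the following toy Python sketch mimics two of these functions over an in-memory triple list. The class name, signatures, and return types are our own assumptions for illustration; GRASP's actual implementation works against SPARQL endpoints and full-text search indices.</p>

```python
# Toy sketch of a few GRASP core functions over an in-memory triple list.
# Class name, signatures, and return types are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraphTools:
    # (subject, property, object) triples standing in for a SPARQL endpoint
    triples: list = field(default_factory=list)

    def LST(self, subject=None, prop=None, obj=None):
        # list triples matching the given constraints (None = wildcard)
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (prop is None or t[1] == prop)
                and (obj is None or t[2] == obj)]

    def SEN(self, query: str):
        # search for entities whose label contains the query string
        subjects = {t[0] for t in self.triples}
        return [e for e in subjects if query.lower() in e.lower()]

tools = KnowledgeGraphTools(triples=[
    ("2014 FIFA World Cup", "country", "Brazil"),
    ("2014 FIFA World Cup", "instance of", "FIFA World Cup"),
])
print(tools.SEN("world cup"))   # ['2014 FIFA World Cup']
print(len(tools.LST(subject="2014 FIFA World Cup")))  # 2
```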
      <p>For the CEA task, we remove the SPARQL-QA-specific functions ANS (answer and stop) and CAN
(cancel and stop), we keep the other functions, and we introduce the following four new functions:</p>
      <sec id="sec-2-1">
        <title>ANN: annotate(kg: str, entity: str, row: int, col: int)</title>
        <p>Add the given entity from the given knowledge graph as annotation for the table cell at the given row
and column. If there already exists an annotation, it is overwritten.</p>
      </sec>
      <sec id="sec-2-2">
        <title>DAN: delete_annotation(row: int, col: int)</title>
        <p>Delete the annotation for the table cell at the given row and column.</p>
      </sec>
      <sec id="sec-2-3">
        <title>SAN: show_annotations()</title>
        <p>Show the current annotations.</p>
      </sec>
      <sec id="sec-2-4">
        <title>STP: stop()</title>
        <p>Stop the annotation process. The current annotations are returned as result.</p>
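        <p>The four annotation functions above can be sketched as a small amount of state plus four methods. This is a minimal illustration under our own assumptions about the state handling, not GRASP's actual code.</p>

```python
# Minimal sketch of ANN, DAN, SAN, and STP; state handling is an
# illustrative assumption, not GRASP's actual implementation.

class AnnotationState:
    def __init__(self):
        self.annotations = {}  # (row, col) -> (kg, entity)

    def annotate(self, kg: str, entity: str, row: int, col: int):
        # ANN: an existing annotation for the cell is overwritten
        self.annotations[(row, col)] = (kg, entity)

    def delete_annotation(self, row: int, col: int):
        # DAN: remove the annotation for the given cell, if any
        self.annotations.pop((row, col), None)

    def show_annotations(self):
        # SAN: return the current annotations
        return dict(self.annotations)

    def stop(self):
        # STP: end the process; current annotations are the result
        return dict(self.annotations)

state = AnnotationState()
state.annotate("wikidata", "wd:Q79859", 2, 0)  # entity from Fig. 1
print(state.show_annotations())  # {(2, 0): ('wikidata', 'wd:Q79859')}
state.delete_annotation(2, 0)
print(state.stop())  # {}
```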
        <p>Intuitively, GRASP tackles entity linking using the above functions in the following way: It identifies entity candidates using one of the various search functions, typically SEN. If multiple entities match well, it can disambiguate them by inspecting knowledge graph triples for each, typically using LST, or by performing more restrictive searches with constraints from already known entities or other patterns observed in the input, typically using SCN or SAC. If no entity matches well, it can retry different search queries or explore the knowledge graph starting from related entities in the hope of coming across a match. If multiple attempts to find an appropriate entity fail, no annotation is made. In case there is an obvious relationship between multiple entities in the input, it can execute SPARQL queries with EXE to find candidates for multiple entities at once. Multilingual inputs and synonyms can be handled by adding the corresponding data to the search indices underlying GRASP’s search functions. For maximum flexibility, annotations can be made individually via ANN, checked via SAN, and removed via DAN at any time during the interaction.</p>
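        <p>The disambiguation step can be illustrated with a toy ranking function: candidates returned by a search are scored by how many of their knowledge graph triples mention other cell values from the same table row. The scoring rule and the data (except the running-example entity wd:Q79859) are hypothetical.</p>

```python
# Toy ranking for the disambiguation step: candidates are scored by how
# many of their triples mention other cell values from the same row.
# "toy:other" and the triple data are hypothetical illustrations.

def rank_candidates(candidates, triples_of, row_values):
    # triples_of: entity id -> list of (property, object) pairs
    def score(entity):
        objects = {o for _, o in triples_of(entity)}
        return sum(1 for v in row_values if v in objects)
    return sorted(candidates, key=score, reverse=True)

triples = {
    "wd:Q79859": [("country", "Brazil"), ("sport", "association football")],
    "toy:other": [("country", "Russia"), ("sport", "association football")],
}
best = rank_candidates(["toy:other", "wd:Q79859"],
                       lambda e: triples[e], ["Brazil", "Group E"])
print(best[0])  # wd:Q79859
```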
        <p>The annotation functions above are tailored to table inputs, because we evaluate our approach on
the CEA task (see Section 3). For other input formats, like regular text, the function signatures and
implementations would need to be adapted slightly, but the overall idea stays the same. Let us denote
the set of all entity linking functions for CEA as CEA = {ANN, DAN, SAN, STP} for later reference.</p>
        <p>
          Importantly, we always set know_before_use: true. This is a configuration option of GRASP
that restricts the agent to SPARQL queries on items seen during the interaction. See [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] for more details
about this option. For entity linking, we extend the scope of this option to also apply to the ANN function,
meaning that only knowledge graph items seen during the interaction can be used for annotation. This
ensures that all annotations are grounded in knowledge graphs and not hallucinated.
        </p>
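        <p>A minimal sketch of how such a grounding check could work (our assumption of the mechanics, not GRASP's actual implementation): every item returned by a search or list function is recorded, and ANN rejects entities that were never observed.</p>

```python
# Sketch of the grounding check: ANN only accepts entities that were
# observed in earlier function results. Mechanics are assumptions.

class GroundedAnnotator:
    def __init__(self):
        self.seen = set()      # kg items observed during the interaction
        self.annotations = {}

    def observe(self, items):
        # record every item returned by a search or list function
        self.seen.update(items)

    def annotate(self, kg, entity, row, col):
        if entity not in self.seen:
            # reject ids the agent never saw, i.e. possible hallucinations
            return f"error: {entity} was not seen during the interaction"
        self.annotations[(row, col)] = (kg, entity)
        return f"annotated ({row}, {col}) with {entity}"

agent = GroundedAnnotator()
print(agent.annotate("wikidata", "wd:Q79859", 2, 0))  # rejected with error
agent.observe(["wd:Q79859"])  # e.g. the entity appeared in a SEN result
print(agent.annotate("wikidata", "wd:Q79859", 2, 0))  # accepted
```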
        <p>Our implementation supports annotating a subset of rows and/or columns of a given table, varying the number of context rows, as well as providing already known annotations (e.g. useful for incremental row-wise annotation). See Fig. 1 for an exemplary trace produced by our approach for the CEA task.</p>
        <p><bold>3. SemTab 2025 challenge - MammoTab track</bold></p>
        <p>
          We evaluate our approach on the CEA task of the MammoTab track of the SemTab 2025 challenge. CEA is the task of assigning to each cell of a given input table the corresponding entity from a given knowledge graph, or NIL if there is no such entity. In this challenge, the table inputs come from a subset of an updated version of the MammoTab dataset [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and have to be linked to the Wikidata knowledge graph [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. In total, 84,907 table cells from 870 tables have to be annotated with entities. See Table 1 for an exemplary MammoTab table, with annotations. There is no separate training or validation dataset with ground-truth annotations, forcing participants to either use data from previous challenges in the series for tuning or, as we do, perform CEA in a zero-shot setting. The CEA task is scored using an F1-score based on the number of correct cell annotations. Since all cells that need to be annotated are known a priori in this task, the F1-score is equal to a simple accuracy measure if one produces an annotation for each cell. Multiple submissions are allowed, but for each participant only the score of their best solution so far is visible on the challenge website.
        </p>
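        <p>The claim that the F1-score reduces to accuracy when every cell receives an annotation can be checked with a short computation: if the predictions cover exactly the known set of cells, precision and recall are both equal to the fraction of correct annotations.</p>

```python
# Toy check: when predictions cover exactly the known set of cells,
# precision = recall = accuracy, so the F1-score equals accuracy.

def f1_and_accuracy(gold, pred):
    correct = sum(1 for cell, e in pred.items() if gold.get(cell) == e)
    precision = correct / len(pred)
    recall = correct / len(gold)
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    accuracy = correct / len(gold)
    return f1, accuracy

gold = {(0, 0): "wd:Q30", (0, 1): "wd:Q9626", (1, 0): "wd:Q9630"}
pred = {(0, 0): "wd:Q30", (0, 1): "wd:Q9626", (1, 0): "wd:Q64"}  # one wrong
f1, accuracy = f1_and_accuracy(gold, pred)
print(abs(f1 - accuracy) < 1e-9)  # True
```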
        <p>Our approach annotates each row of each table individually. We do this because the tables from MammoTab can be quite large, containing up to 256 cells, which is problematic in two ways: first, the annotation process for the whole table might not fit into the context window of the underlying LLM; second, longer annotation processes are more time- and memory-consuming. However, for each row to be annotated, we provide the 10 rows before and after that row as context, without taking the annotations from already annotated rows of the same table into account. In a side experiment, we found that ignoring existing annotations from other rows did not decrease annotation quality significantly. In fact, note that giving the LLM access to its own annotations for other rows also has the potential to deteriorate the overall result quality by propagating errors.</p>
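        <p>The context construction can be sketched as a simple slice over the table rows; the exact boundary handling below is our assumption for illustration.</p>

```python
# Sketch of the context construction: for the row to annotate, take up to
# 10 rows before and after it. Boundary handling is our assumption.

def context_window(table, row_index, context=10):
    start = max(0, row_index - context)
    end = min(len(table), row_index + context + 1)
    return table[start:end]

table = [[f"cell-{r}-{c}" for c in range(3)] for r in range(40)]
print(len(context_window(table, 25)))  # 21: target row plus 10 on each side
print(len(context_window(table, 0)))   # 11: truncated at the table start
```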
        <p>If the agent tries to annotate a row different from the one to annotate, an error message is returned for the corresponding function call and no annotation is made. We use two different sets of functions for the GRASP agent in our evaluations: CEA1 = CEA ∖ {SAN} ∪ {EXE, LST, SPE, SOP, SPR, SEN}, and CEA2 = CEA1 ∪ {SAN, SCN, SAC}. For our first submission, where we used CEA1, we did not have a dedicated SAN function yet. For the later submissions we added this function as well as GRASP’s two advanced functions for constrained search. To support multilingual searches, we build multilingual search indices for GRASP from English, German, Spanish, French, Italian, Dutch, Portuguese, and Russian Wikidata labels and aliases. We do this both for a Wikidata dump from October 2024, which we already had set up on our machines, and for the official challenge Wikidata dump linked on the SemTab 2025 website. See Table 2 for an overview.</p>
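        <p>A multilingual search index of this kind can be sketched as a mapping from lowercased labels and aliases in all configured languages to entity ids; the data layout below is a simplification of GRASP's actual indices.</p>

```python
# Sketch of a multilingual search index: labels and aliases from several
# languages all map to the same entity id. Layout is a simplification.
from collections import defaultdict

def build_index(entities):
    # entities: entity id -> {language: [label, alias, ...]}
    index = defaultdict(set)
    for entity_id, per_language in entities.items():
        for names in per_language.values():
            for name in names:
                index[name.lower()].add(entity_id)
    return index

index = build_index({
    "wd:Q183": {"en": ["Germany"], "de": ["Deutschland"], "fr": ["Allemagne"]},
})
print(index["deutschland"])  # {'wd:Q183'}
```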
        <sec id="sec-2-4-1">
          <title>Results</title>
          <p>
            We made three major submissions<sup>1</sup> for the MammoTab track, the results of which are shown in Table 3. All submissions used open-source LLMs from the Qwen3 [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] family. The first one used Qwen3 30B A3B Thinking, a small reasoning model providing a good balance between quality and inference time, together with the CEA1 function set and GRASP’s default top-k value of 10 for LST and the search functions. For the second submission, we used the larger and more recent Qwen3 Next 80B A3B model in its non-reasoning variant, together with the extended function set CEA2 and a more generous top-k value of 20. As expected, the second submission improved the score compared to the first, though only by a rather small margin of 2.8 p.p.
          </p>
          <p>Interestingly, our second submission takes more steps on average to annotate each row, produces significantly fewer NIL annotations, and shows a very different call distribution compared to our first submission, performing more searches as well as more LST and EXE calls. See Table 4 for full details about the last aspect. Considering that each row has on average 3.5 columns to annotate, the model of the first submission only takes 1.6 steps on average to annotate a column, whereas the model of the second submission takes about 2.7. Between the second and third submission, we only changed the Wikidata version from our own to the one recommended by the challenge. However, this only leads to a tiny improvement of 0.2 p.p. and no significantly different annotation behavior of the agent. In Fig. 2 we show some statistics about the annotations made by our third submission. The curve in this figure shows a Zipf-like distribution, with few frequent entities and a long tail of infrequent ones. Overall, 29,743 distinct entities were used to annotate 84,907 table cells, which shows a good diversity in the dataset.</p>
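          <p>The reported statistics can be reproduced in miniature with a frequency count over the annotations; the data below is toy data, not our actual submission.</p>

```python
# Miniature version of the annotation statistics: count entity usage and
# inspect the rank-frequency shape. Toy data, not the real submission.
from collections import Counter

annotations = (["wd:Q30"] * 5 + ["wd:Q9626"] * 3 + ["wd:Q9630"] * 2
               + ["wd:Q64", "wd:Q42"])
counts = Counter(annotations)
ranked = counts.most_common()  # entities sorted by frequency (rank order)
print(len(counts))   # 5 distinct entities
print(ranked[0])     # ('wd:Q30', 5)
```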
          <p>To serve the LLMs we use vLLM [10] and one (Qwen3 30B A3B Thinking) or two (Qwen3 Next 80B A3B) NVIDIA H100 GPUs. All submissions took 3-5 days to finish, with 3-6 workers running in parallel. The average time to annotate a single row ranged from 47.2 to 51.6 seconds across all submissions.</p>
          <p><sup>1</sup>There was one other experimental submission, and one accidental duplicate submission.</p>
          <p>[Figure 2: Number of annotations per entity, plotted against the rank of the entity (up to rank 1000). The most frequent entities include United States (Q30), Conservative Party (Q9626), Labour Party (Q9630), Liberal Party of Canada (Q138345), and 2010 United Kingdom general election (Q215622).]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Conclusion</title>
      <p>We present an approach for knowledge graph entity linking that is based on an LLM agent interacting with knowledge graphs by calling functions to search and query them. We evaluate the approach on the SemTab 2025 MammoTab track, where the task is to annotate 84,907 cells from 870 tables with Wikidata entities, and reach first place with a final F1-score of 75.8%, clearly outperforming the other participants.</p>
      <p>Currently, our functions for adding, removing, and showing entity links (ANN, DAN, and SAN) are
adapted to the tabular inputs of the cell entity annotation task. For future work, we aim to generalize
the function interface to support general-purpose entity linking on arbitrary natural-language inputs.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID
499552394 – SFB 1597.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <source>Entity-Oriented Search</source>
          , volume
          <volume>39</volume>
          of
          <source>The Information Retrieval Series</source>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Ö.</given-names>
            <surname>Sevgili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Arkhipov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Biemann</surname>
          </string-name>
          ,
          <article-title>Neural entity linking: A survey of models based on deep learning</article-title>
          ,
          <source>Semantic Web</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>527</fpage>
          -
          <lpage>570</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hertel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Prange</surname>
          </string-name>
          ,
          <article-title>A fair and in-depth evaluation of existing end-to-end entity linking systems</article-title>
          , in: EMNLP, Association for Computational Linguistics,
          <year>2023</year>
          , pp.
          <fpage>6659</fpage>
          -
          <lpage>6672</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Walter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          ,
          <article-title>GRASP: Generic Reasoning And SPARQL Generation across Knowledge Graphs</article-title>
          , in: ISWC,
          <year>2025</year>
          . Accepted for publication. Preprint available at https://arxiv.org/abs/2507.08107.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          SemTab 2025 Organizers,
          <article-title>SemTab 2025 - Semantic Web Challenge on Tabular Data to Knowledge Graph Matching - LLMs &amp; Tabular Data Matching</article-title>
          , https://sem-tab-challenge.github.io/2025/,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Walter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          ,
          <article-title>GRASP: Generic Reasoning And SPARQL Generation across Knowledge Graphs - Demo System</article-title>
          , in: ISWC (Posters, Demos &amp; Industry Tracks),
          <year>2025</year>
          . Accepted for publication.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Marzocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cremaschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pozzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Avogadro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          ,
          <article-title>MammoTab: A giant and comprehensive dataset for semantic table interpretation</article-title>
          , in: SemTab@ISWC, volume
          <volume>3320</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS.org,
          <year>2022</year>
          , pp.
          <fpage>28</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <article-title>Wikidata: A free collaborative knowledgebase</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>57</volume>
          (
          <year>2014</year>
          )
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Qwen</given-names>
            <surname>Team</surname>
          </string-name>
          ,
          <source>Qwen3 Technical Report</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Kwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          , S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, I. Stoica,
          <article-title>Efficient Memory Management for Large Language Model Serving with PagedAttention</article-title>
          , in: Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>