<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Vienna, Austria</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Characterizing Knowledge Graph Tasks in LLM Benchmarks Using Cognitive Complexity Frameworks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <string-name>Sara Todorovikj</string-name>
          <email>sara.todorovikj@informatik.tu-chemnitz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lars-Peter Meyer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Martin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chemnitz University of Technology</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>InfAI</institution>
          ,
          <addr-line>Leipzig</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Large Language Models (LLMs) are increasingly used for tasks involving Knowledge Graphs (KGs), where evaluation typically focuses on accuracy and output correctness. We propose a complementary task characterization approach using three complexity frameworks from cognitive psychology. Applying it to the LLM-KG-Bench framework, we highlight value distributions, identify underrepresented demands and motivate richer interpretation and greater diversity in benchmark evaluation tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Task characterization</kwd>
        <kwd>Benchmark evaluation</kwd>
        <kwd>LLM</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>RDF</kwd>
        <kwd>SPARQL</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <p>
        Understanding the difficulty and structure of tasks often requires going beyond surface-level features.
In cognitive science and educational research, several frameworks have been developed to describe
the complexity of tasks based on the type of knowledge involved and the mental operations required.
One of the most well-known is Bloom’s Taxonomy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which was originally developed for classification
of educational goals based on the required cognitive complexity level. The taxonomy is grounded on
behavioral observations of learning processes and classifies cognitive processes from simple recall to
higher-level reasoning and creative generation. A revision was made in order to better fit modern views
of cognitive psychology [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which also introduced a complementary dimension. The new Knowledge
Dimension distinguishes between the types of knowledge required for completing different tasks. In parallel,
Relational Complexity Theory [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] originates from developmental and comparative psychology and draws
the notion of relational arity from formal systems in logic and computer science, including relational
database theory. It formalizes task difficulty in terms of the number of entities and relations that must be
simultaneously processed.
      </p>
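      <p>
        As a hypothetical illustration (ours, not drawn from the cited works): a binary relation can be interpreted one pair at a time, while a higher-arity relation binds several roles that must be processed jointly, which is what raises relational complexity:
      </p>
      <p>
        <preformat>worksFor(alice, acme)            arity 2: two entities, one relation
transfer(alice, bob, payment)    arity 3: three roles considered at once</preformat>
      </p>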
      <p>
        These frameworks form the basis for our task characterization approach, which we apply to
LLM-KG-Bench as an illustration. The LLM-KG-Bench framework was developed to address the lack of scalable
evaluation tools for LLMs targeting KG tasks such as RDF serialization, SPARQL query generation and
structured extraction. The framework supports a wide range of tasks with built-in correction cycles and
output validation, emphasizing automated, reproducible evaluation across a broad selection of models.
Here, we provide a short overview and description of the task groups used in LLM-KG-Bench; for more
details, see [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref9 ref10 ref11">3, 4, 5, 9, 10, 11</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>RDF-related Tasks</title>
        <sec id="sec-2-1-1">
          <title>FactExtractStatic</title>
          <p>
            Extract facts from a textual fact sheet and create a KG [
            <xref ref-type="bibr" rid="ref3 ref9">3, 9</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>RdfFriendCount</title>
          <p>
            Identify the node with the most incoming edges [
            <xref ref-type="bibr" rid="ref4 ref9">4, 9</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-1-3">
          <title>RdfSyntaxFixList</title>
          <p>
            Correct a syntactically invalid RDF graph [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-1-4">
          <title>RdfConnectionExplainStatic</title>
          <p>
            Find the shortest connection between two nodes in an RDF graph [
            <xref ref-type="bibr" rid="ref4 ref9">4, 9</xref>
            ].
          </p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>SPARQL-related Tasks</title>
        <sec id="sec-2-2-1">
          <title>Sparql2AnswerList</title>
          <p>
            Given a small KG and a SPARQL SELECT query, return the respective
result set for the query [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-2-2">
          <title>Text2AnswerList</title>
          <p>
            Return the result set answering a given textual question on a given KG
(without a SPARQL SELECT query) [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-2-3">
          <title>Text2SparqlList</title>
          <p>
            Given a KG and its description, construct a SPARQL SELECT query
corresponding to a given natural language query [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-2-4">
          <title>SparqlSyntaxFixingList</title>
          <p>
            Given a SPARQL SELECT query with syntax errors, return a corrected
query [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ].
          </p>
        </sec>
        <sec id="sec-2-2-5">
          <title>TurtleSampleGeneration</title>
          <p>
            Generate small Turtle KGs satisfying given requirements [
            <xref ref-type="bibr" rid="ref3 ref9">3, 9</xref>
            ].
          </p>
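          <p>
            As a hypothetical illustration (the concrete requirements vary per task instance): a requirement such as “a FOAF graph in which one person knows two others” could be satisfied by a minimal Turtle document like the following:
          </p>
          <p>
            <preformat>@prefix foaf: &lt;http://xmlns.com/foaf/0.1/&gt; .
@prefix ex:   &lt;http://example.org/&gt; .

ex:alice a foaf:Person ;
    foaf:name "Alice" ;
    foaf:knows ex:bob, ex:carol .</preformat>
          </p>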
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Task Characterization</title>
      <p>To understand what kinds of abilities and operations are required by benchmarking tasks, we apply
structured characterization criteria drawn from the three established frameworks, as introduced above.
While we adopt terminology from cognitive psychology, we do not claim that LLMs engage in these
processes in a human sense. Rather, we assess the extent to which their outputs reflect behavior
consistent with such operations. Table 1 provides an overview of all possible values across the three
frameworks. In the following, we describe the interpretation and assignment criteria for each value.</p>
      <sec id="sec-3-0">
        <title>Cognitive Processes</title>
        <p>Remember The task depends primarily on “mechanically” recalling facts or definitions without
further processing.</p>
        <p>Understand The task requires interpreting given information, structures or queries without
fundamentally transforming or generating new representations.</p>
        <p>Apply A known procedure or pattern must be correctly executed, such as retrieval or following
syntactic rules.</p>
        <p>Analyze The task demands recognizing or decomposing relationships between data, especially
when multiple elements or steps must be coordinated.</p>
        <p>Evaluate The task involves judging the correctness, relevance or quality of a result.</p>
        <p>Create The task involves generating new content, such as queries or data structures.</p>
      </sec>
      <sec id="sec-3-1">
        <title>Knowledge Dimensions</title>
        <p>Factual Task execution success depends on recalling or recognizing specific terminology, syntax
elements or concrete information.</p>
        <p>Conceptual Structural or relational understanding is necessary, such as schema structure, data
models or logical organization.</p>
        <p>Procedural The task requires a correct application of known methods, routines or transformation
steps.</p>
        <p>Metacognitive Awareness and control over one’s strategies and thinking processes, such as selecting
appropriate approaches, planning task execution or monitoring correctness, which
might be relevant for more complex or interactive settings.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Relational Complexity</title>
        <p>Low The task involves interpreting or manipulating individual binary relations or isolated,
simple structures with minimal dependencies.</p>
        <p>Medium Multiple relations and entities must be processed simultaneously, such as coordinating
several triples or variables in a query.</p>
        <p>High The task involves multiple interrelated entities or nested dependencies that must be
simultaneously considered, often requiring more abstract or hierarchical reasoning.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.1. Application to Benchmark Tasks</title>
        <p>The assigned values for each task are displayed in Table 2. Note that not all values across the frameworks
are represented, as the current set of tasks does not span the full theoretical space. The assigned values
represent the minimal operational and structural requirements. Some variability in the relational
complexity dimension is certainly possible given a prompt that requires more complex operations.</p>
        <p>We can observe several recurring value combinations. Most tasks fall into a characterization
combining Understand and Apply as cognitive processes with Conceptual and Procedural knowledge dimensions
and a Low level of relational complexity. This reflects the prevalence of tasks requiring interpretation
and rule application without substantial structural coordination. Tasks involving generation
(FactExtractStatic, TurtleSampleGeneration and Text2SparqlList) are naturally the only ones annotated with the
Create process. Among them, only the RDF-based generation tasks are assigned Medium relational
complexity, reflecting the need to coordinate multiple entities and their relationships when constructing
a graph. In contrast, SPARQL generation tasks tend to result in a single triple pattern and are thereby
assigned Low relational complexity.</p>
        <p>A consistent, expected dependency can be observed between some processes and knowledge types.
Factual and Conceptual knowledge always coincide with Understand, as interpreting a meaning
inherently involves factual or structural knowledge. On the other hand, Procedural knowledge always
coincides with Apply, Analyze or Create, since carrying out a certain procedure by definition requires
knowing the necessary steps. In the current task set, Factual and Conceptual knowledge do not co-occur,
distinguishing between surface-level terminology and deeper structural comprehension. Similarly,
Apply, Analyze and Create do not co-occur, as they describe mutually exclusive operations that either
follow a procedure, decompose a structure, or generate new ones.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and Outlook</title>
      <p>In this paper we proposed a task characterization that provides a complementary perspective on
benchmark design and evaluation beyond accuracy metrics, inspired by theories of cognitive complexity.
This can guide the creation of more balanced and targeted benchmarks by ensuring diversity across the
different dimensions. Moreover, it enables the identification of potential blind spots in model behavior for
tasks that require similar processing.</p>
      <p>We demonstrate how to assign the characterization values on a set of evaluation tasks from the
LLM-KG-Bench framework. Several values do not appear in the task set due to current design preferences,
but that does not imply that such dimensions are irrelevant or unassignable. In cognitive processes, we
note Remember which would describe a task that asks for reproduction of terminology or exact syntax,
e.g., listing reserved SPARQL keywords from memory, while Evaluate would require making judgments
between alternative options, e.g., selecting the most efficient query. One knowledge dimension was not
assigned, Metacognitive knowledge, which might be tackled by tasks that require justification, such as
explaining the reasoning behind a generated query. Finally, High relational complexity would emerge
in tasks requiring coordination of more than two entity roles simultaneously, like multi-dimensional
event data or nested dependencies. This suggests a direction for extending task design to capture a
broader range of structural demands.</p>
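      <p>
        A hypothetical sketch of such a structure (not part of the current task set): an n-ary event modeled in RDF forces several role fillers to be coordinated simultaneously, e.g.:
      </p>
      <p>
        <preformat>@prefix ex: &lt;http://example.org/&gt; .

ex:transfer1 a ex:TransferEvent ;   # one event node ties together
    ex:sender    ex:alice ;         # more than two entity roles
    ex:recipient ex:bob ;
    ex:amount    42 .</preformat>
      </p>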
      <p>The proposed framework could be applied to other benchmarks in the semantic web and beyond,
allowing for cross-benchmark comparisons of task complexity profiles. It could also be integrated into
such evaluation pipelines, helping understand the types of processes the models succeed or struggle
with. In turn, this could support more systematic error analysis, design, and task selection.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was partially supported by grants from the German Federal Ministry of Education and
Research (BMBF) to the projects ScaleTrust (16DTM312D) and KupferDigital2 (13XP5230L).</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, X. Wu, Unifying large language models and knowledge graphs: A roadmap, IEEE Transactions on Knowledge and Data Engineering 36 (2024) 3580-3599. doi:10.1109/tkde.2024.3352100.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] L.-P. Meyer, C. Stadler, J. Frey, N. Radtke, K. Junghanns, R. Meissner, G. Dziwis, K. Bulert, M. Martin, LLM-assisted knowledge graph engineering: Experiments with ChatGPT, in: C. Zinke-Wehlmann, J. Friedrich (Eds.), First Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow (AITomorrow) 2023, Informatik aktuell, 2024, pp. 103-115. doi:10.1007/978-3-658-43705-3_8.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] L.-P. Meyer, J. Frey, K. Junghanns, F. Brei, K. Bulert, S. Gründer-Fahrer, M. Martin, Developing a scalable benchmark for assessing large language models in knowledge graph engineering, in: N. Keshan, S. Neumaier, A. L. Gentile, S. Vahdati (Eds.), Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems (SEMANTICS 2023), volume 3526 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3526/paper-04.pdf.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] L.-P. Meyer, J. Frey, D. Heim, F. Brei, C. Stadler, K. Junghanns, M. Martin, LLM-KG-Bench 3.0: A compass for semantic technology capabilities in the ocean of LLMs, in: E. Curry, M. Acosta, M. Poveda-Villalón, M. van Erp, A. Ojo, K. Hose, C. Shimizu, P. Lisena (Eds.), The Semantic Web. ESWC 2025. Lecture Notes in Computer Science, volume 15719, Springer Nature Switzerland, 2025, pp. 280-296. doi:10.1007/978-3-031-94578-6_16.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] L.-P. Meyer, J. Frey, F. Brei, D. Heim, S. Gründer-Fahrer, S. Todorovikj, C. Stadler, M. Schröder, N. Arndt, M. Martin, Evaluating large language models for RDF knowledge graph related tasks - the LLM-KG-Bench-Framework 3, Semantic Web (2025). doi:10.5281/zenodo.16779481, submitted for review 05/2025.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] B. S. Bloom, Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook 1: Cognitive Domain, New York: Longman, 1956.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] L. W. Anderson, D. R. Krathwohl, A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives, Addison Wesley Longman, Inc., 2001.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] G. S. Halford, W. H. Wilson, S. Phillips, Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology, Behavioral and Brain Sciences 21 (1998) 803-831.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] J. Frey, L.-P. Meyer, N. Arndt, F. Brei, K. Bulert, Benchmarking the abilities of large language models for RDF knowledge graph creation and comprehension: How well do LLMs speak turtle?, in: M. Alam, M. Cochez (Eds.), Proceedings of the Workshop on Deep Learning for Knowledge Graphs (DL4KG 2023) co-located with the 21st International Semantic Web Conference (ISWC 2023), Athens, November 6-10, 2023, volume 3559 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3559/paper-3.pdf.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] J. Frey, L.-P. Meyer, F. Brei, S. Gruender, M. Martin, Assessing the evolution of LLM capabilities for knowledge graph engineering in 2023, in: The Semantic Web: ESWC 2024 Satellite Events, Springer Nature Switzerland, 2025, pp. 51-60. doi:10.1007/978-3-031-78952-6_5.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] L.-P. Meyer, J. Frey, F. Brei, N. Arndt, Assessing SPARQL capabilities of large language models, in: E. Vakaj, S. Iranmanesh, R. Stamartina, N. Mihindukulasooriya, S. Tiwari, F. Ortiz-Rodríguez, R. Mcgranaghan (Eds.), Proceedings of the 3rd International Workshop on Natural Language Processing for Knowledge Graph Creation co-located with the 20th International Conference on Semantic Systems (SEMANTiCS 2024), volume 3874 of CEUR Workshop Proceedings, 2024, pp. 35-53. URL: https://ceur-ws.org/Vol-3874/paper3.pdf.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>