<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Felix</forename><surname>Brei</surname></persName>
							<email>brei@infai.org</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">ETi Competence Center</orgName>
								<orgName type="department" key="dep2">Institute for Applied Informatics</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Johannes</forename><surname>Frey</surname></persName>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">KMI Competence Center</orgName>
								<orgName type="department" key="dep2">Institute for Applied Informatics</orgName>
								<address>
									<settlement>Leipzig</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department" key="dep1">Institute of Computer Science</orgName>
								<orgName type="institution">Leipzig University</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lars-Peter</forename><surname>Meyer</surname></persName>
							<email>lpmeyer@infai.org</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">ETi Competence Center</orgName>
								<orgName type="department" key="dep2">Institute for Applied Informatics</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department" key="dep1">Institute of Computer Science</orgName>
								<orgName type="institution">Leipzig University</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">25D6B5B70E69982605CA30D363343C28</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:30+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Language models</term>
					<term>SPARQL generation</term>
					<term>Question Answering</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this work we show that language models with fewer than one billion parameters can be used to translate natural language into SPARQL queries after fine-tuning. Using three different datasets, ranging from academic to real-world, we identify prerequisites that the training data must fulfill for the training to be successful. The goal is to empower users of semantic web technology to use AI assistance on affordable commodity hardware, making them more resilient against external factors.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The usage of Large Language Models (LLMs) has increased rapidly since the advent of ChatGPT. According to Similarweb, the website of OpenAI alone was visited more than 1.6 billion times in February 2024. In addition, Microsoft has launched several LLM-based AI assistants called 'Copilots', and Google has released its AI called Bard, which is now known as Gemini. This suggests that the big tech companies believe in the potential of LLMs to become part of our daily lives, just like smartphones or computers in general. But do they live up to the expectations?</p><p>Several test suites have been devised to assess the generative capabilities of LLMs, for example TruthfulQA <ref type="bibr" target="#b0">[1]</ref>, HellaSwag <ref type="bibr" target="#b1">[2]</ref> or the Abstraction and Reasoning Corpus (ARC) <ref type="bibr" target="#b2">[3]</ref>. These test suites, among others, are run regularly on the latest LLM releases, and the results for open LLMs are presented publicly on the Huggingface OpenLLM leaderboard <ref type="bibr" target="#b3">[4]</ref>. We can see that performance increases drastically over time, with Bloom <ref type="bibr" target="#b4">[5]</ref> scoring an average of 46.06 in August 2023 and Smaug-72B <ref type="foot" target="#foot_0">6</ref> holding the record in February 2024 with a score of 80.48, only half a year later.</p><p>These test suites, however, mostly cover natural language domains, such as continuing a sentence, answering questions or extracting information from a paragraph of text. Based on the experience from early experiments <ref type="bibr" target="#b5">[6]</ref>, a test suite <ref type="bibr" target="#b6">[7]</ref> was developed that evaluates the capabilities of LLMs to interface with knowledge graphs and assist in knowledge engineering tasks. 
While the smaller open-source GPT4All models severely struggled, the state-of-the-art commercial LLMs GPT4 and Claude showed promising results <ref type="bibr" target="#b7">[8]</ref> and a trend of performance improvements over the course of 2023 <ref type="bibr" target="#b8">[9]</ref> in dealing with KGs in Turtle format.</p><p>Alas, these results come with several caveats:</p><p>1. The commercial LLMs that were tested are all hosted externally. This can be problematic regarding data protection, because a user has to send information to a third party. 2. Because of their sheer size (GPT4 has one trillion parameters <ref type="foot" target="#foot_1">7</ref> ), running these models locally is prohibitively costly and therefore not an option for many research institutes and other parties. On top of that, training a model of this size is also extremely expensive<ref type="foot" target="#foot_2">8</ref> . 3. Even these commercial models were, at the time of writing, still significantly challenged by SPARQL query generation or RML mapping generation <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b9">10]</ref>, indicating a need for specific training or fine-tuning of all models w.r.t. handling those tasks in a reliable and efficient way. 4. Since all these larger models are hosted on third-party platforms, users are at the mercy of the vendors to keep the services running and affordable. However, vendors have suddenly changed their licensing and cost models in the past <ref type="foot" target="#foot_3">9</ref> , and deep sea cables have been damaged <ref type="foot" target="#foot_4">10</ref> , separating certain areas of the world from the internet and leaving local companies only with the computational resources they have on site.</p><p>So we ask ourselves the following question: Given a single task that we want to solve using LLMs, is it possible to achieve a similar performance to these large models with a much smaller one? 
This would enable small businesses to use AI assistance with affordable hardware they can host on site, increasing their resilience against outages, vendors changing their pricing models, disruption due to trade embargoes, and other external factors.</p><p>As a first step in this direction, we study the task of translating a natural language question into a SPARQL query, because we think that this task enables people who are not familiar with SPARQL to extract knowledge and insights from a knowledge graph, which would otherwise not be possible for them. The paper is organized as follows: First, we look at related research in this field and explain where we fit into the big picture. Then we explain the setup of our experiments, namely which model families were chosen and why, and which datasets we trained them on. After that, we present and explain the results of our work, and finally, we draw conclusions and give an outlook on the directions our research will head next.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Current approaches focus on fine-tuning large language models. For example, the authors of <ref type="bibr" target="#b10">[11]</ref> propose a methodology for fine-tuning OpenLLaMA to generate SPARQL queries over life science knowledge graphs using data augmentation techniques, such as providing meaningful variable names and inline comments, improving the performance of the model in generating accurate SPARQL queries. The authors of <ref type="bibr" target="#b11">[12]</ref> use Llama as their basis for fine-tuning to generate SPARQL queries over Wikidata.</p><p>These two papers have shown that translating natural language to SPARQL queries is possible, but they use models with at least three (OpenLLaMA) or seven (LLaMA) billion parameters, respectively. The hardware required to train these models can be expensive, which is why we want to explore models that are even smaller.</p><p>Smaller models fine-tuned for one specific task are also able to beat the performance of LLMs: for example, SQLCoder-7B <ref type="foot" target="#foot_5">11</ref> performs better on SQL than the state-of-the-art GPT4. Our research is comparable to that, but with far fewer parameters and SPARQL instead of SQL.</p><p>[13] manages to fine-tune T5 on SPARQL queries for Wikidata, but to achieve these results, the data had to be preprocessed in a way that is specific to T5. Furthermore, while that paper explores other ways to tackle this task in general, it only looks at T5 instead of several model families as we do.</p><p>[14] gives a comprehensive overview and performs a comparison of pre-trained LMs (PLMs), non-pre-trained LMs (NPLMs), and LLMs, testing various fine-tuning methods using LLMs. 
<ref type="bibr" target="#b14">[15]</ref> fine-tunes a lightweight model for SPARQL generation using synthetic training data generated by the FlexKBQA framework on a target knowledge graph (sampling structured query templates that are converted into SPARQL query instances and translated into natural language questions using LLMs). The lightweight model can then perform further self-guided training on real queries to address the distribution shift between synthetic and real queries. <ref type="bibr" target="#b15">[16]</ref> uses a GPT model to investigate which parts of the Text2SPARQL task are the hardest for the model to solve, so that appropriate countermeasures can be taken.</p><p>[17] proposes an entirely new architecture specific to SPARQL generation based on GPT. We consider this direction promising for the future, but here we focus first on more foundational research to understand which model families work best on a given dataset and why.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Model families</head><p>As mentioned in the introduction, the focus of our work is to fine-tune language models that can be considered small by modern standards. We chose one billion parameters as an arbitrary limit on the number of parameters, but as a general guideline we consulted the Steam Hardware and Software Survey <ref type="foot" target="#foot_6">12</ref> and found that 57.22% of its users use a GPU with 8GB of VRAM or more (January 2024). A model with fewer than one billion parameters should fit into this amount of VRAM comfortably, showing again that these models can be trained and run locally.</p><p>Another consideration is the public availability of the models. We believe that research should be available to anyone who is interested, and this should be reflected in the choice of models. Therefore, we only select models that are openly available on Huggingface.</p><p>Following these criteria, we quickly observe that only three large model families fit the bill, which we introduce here briefly. A full list of the models evaluated is given in table 1.</p></div>
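As a rough illustration of why this limit is comfortable, the memory needed just to hold a model's weights can be estimated from the parameter count. The function below is our own back-of-the-envelope sketch (the function name and byte widths are illustrative assumptions, not part of the paper's setup):

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """Estimate the memory (in GiB) needed to hold the model weights alone.

    bytes_per_param is 4 for fp32 and 2 for fp16. Fine-tuning additionally
    needs room for gradients, optimizer state and activations, so the true
    requirement is higher than this lower bound.
    """
    return n_params * bytes_per_param / 1024**3

# The largest model considered, FLAN-T5-Large with 783M parameters:
print(round(weight_memory_gb(783e6), 2))     # fp32: ~2.92 GB
print(round(weight_memory_gb(783e6, 2), 2))  # fp16: ~1.46 GB
```

Both values fit well within an 8 GB consumer GPU, leaving headroom for the fine-tuning overhead.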
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">T5 and Flan-T5</head><p>In June 2020, Google released an LLM called Text-To-Text Transfer Transformer, or T5 for short <ref type="bibr" target="#b17">[18]</ref>. The base version consists of roughly 220 million parameters, with smaller and larger versions available. With T5, Google wanted to provide a single LLM that can solve any NLP task, such as text classification, sentiment analysis and so on. A user must provide a prefix like 'Translate the following sentence to French:' and the LLM then infers how to process the rest of the prompt. In 2022, researchers at Google released new versions of T5 called FLAN-T5 <ref type="bibr" target="#b18">[19]</ref> (FLAN stands for fine-tuning language models <ref type="bibr" target="#b19">[20]</ref>) which, according to the authors, should outperform T5 on any given task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">BART</head><p>BART was developed by Facebook and released in October 2019 <ref type="bibr" target="#b20">[21]</ref>. It consists of 139 million parameters and combines a BERT-like encoder <ref type="bibr" target="#b21">[22]</ref> with a GPT-like autoregressive decoder <ref type="bibr" target="#b22">[23]</ref>. In August 2020, a multilingual version called mBART was released <ref type="bibr" target="#b23">[24]</ref>. The authors put special emphasis on the fact that BART is just a pretrained model and needs to be fine-tuned for a specific task. We also included the mREBEL models, a specialized version of BART for multilingual relation extraction <ref type="bibr" target="#b24">[25]</ref>, since they were fine-tuned with knowledge graphs in mind.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.3.">M2M100 and NLLB-200</head><p>The M2M100 model was introduced in 2020 <ref type="bibr" target="#b25">[26]</ref> as a many-to-many translation tool for 100 languages. The original version consists of 1.3 billion parameters, which exceeds the upper bound we imposed. However, there is a distilled version called M2M100-418M <ref type="foot" target="#foot_7">13</ref>, available directly from the Facebook research team on Huggingface, which we use in our experiments.</p><p>Its successor, the NLLB-200 model, was introduced in 2022 <ref type="bibr" target="#b26">[27]</ref>; the name stands for 'no language left behind'. Again we use the distilled version, NLLB-200-Distilled-600M<ref type="foot" target="#foot_8">14</ref>, instead of the full 3.3-billion-parameter version of the model. As the authors state, the model is 'primarily intended for research in machine translation', which fits our bill perfectly.</p><p>This leaves us with the selection of models to be assessed in our experiments, which can be seen in table <ref type="table" target="#tab_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Datasets used for Fine Tuning and Evaluation</head><p>In order to study how well the models can be fine-tuned towards a target KG, we use three evaluation datasets from different domains and with varying complexity. These datasets consist of a number of natural language questions, each mapped to a SPARQL query w.r.t. the target KG. For the first two datasets (organizational graph and CoyPu graph), we generate questions and queries by sending the graph via the OpenAI API to GPT4 and prompting it to generate tuples of natural language question, matching SPARQL query, and the expected result of the query. These tuples are filtered by checking whether the results that the SPARQL query returns match the expected results. Both datasets are then augmented by sending each remaining question to GPT again and asking it to paraphrase the question, giving us a total of two natural language questions per SPARQL query.</p></div>
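The filtering step described above can be sketched generically. The helper below is an illustrative reconstruction, not the authors' code: `execute` stands for whatever runs a query against the target knowledge graph (for example an rdflib graph or an HTTP SPARQL endpoint), stubbed here with a dictionary for demonstration, and result sets are compared order-insensitively since SPARQL SELECT results carry no guaranteed row order.

```python
from typing import Callable, Iterable

# One result row, represented as a set of (variable, value) pairs.
Binding = frozenset

def filter_candidates(
    candidates: Iterable[tuple],
    execute: Callable[[str], list],
) -> list[tuple[str, str]]:
    """Keep only (question, query) pairs whose query returns exactly the
    expected result set when executed against the target KG."""
    kept = []
    for question, query, expected in candidates:
        try:
            actual = execute(query)
        except Exception:  # unparsable or failing query: drop the tuple
            continue
        # Order-insensitive comparison of the two result sets.
        if sorted(sorted(row) for row in actual) == sorted(sorted(row) for row in expected):
            kept.append((question, query))
    return kept

# Toy stub standing in for a real SPARQL endpoint (queries are illustrative):
toy_results = {"SELECT ?n WHERE { :anne foaf:name ?n }": [Binding({("n", "Anne Miller")})]}
kept = filter_candidates(
    [("What is Anne's name?", "SELECT ?n WHERE { :anne foaf:name ?n }",
      [Binding({("n", "Anne Miller")})]),
     ("Bogus question", "SELECT ?x WHERE { ?x ?p ?o }", [Binding({("x", ":bob")})])],
    lambda q: toy_results.get(q, []),
)
print(kept)  # only the first (question, query) pair survives the filter
```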
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Organizational Graph</head><p>Introduced in <ref type="bibr" target="#b5">[6]</ref>, this small knowledge graph uses established vocabularies to describe an organization with departments and employees. There is a clear schema that maps person and department names to their corresponding RDF resource, for example "Anne Miller" maps to :anne while "Bob Tanner" maps to :bob. In this dataset and the next we also let the language model omit the prefix definitions for the queries and assume they are already present in the preamble of the executed SPARQL query. Using GPT4 we generated a dataset consisting of 69 datapoints, which were split into 53 tuples for training and 16 for testing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">A subset of the CoyPu graph</head><p>The CoyPu project <ref type="foot" target="#foot_9">15</ref> aims to improve supply chain resilience for corporations by combining different data sources about public infrastructure, trades and trade agreements, events like disasters and conflicts, and many more into a large knowledge graph. Querying this knowledge graph has the potential to help businesses identify risks like single points of failure and mitigate them. This usefulness, combined with the fact that the other two datasets have more of an academic background, made us decide to use a subset of the CoyPu knowledge graph as another dataset for training. Creating a viable subset led to its own difficulties and hurdles, however, which we consider as future work. This dataset contains 131 tuples in total, which were split into 105 for training and 26 for testing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">QALD10</head><p>The Question Answering over Linked Data (QALD) dataset is a standard benchmark <ref type="foot" target="#foot_10">16</ref>, with QALD10 being the latest incarnation <ref type="bibr" target="#b27">[28]</ref>. It consists of SPARQL queries along with matching questions in different natural languages, w.r.t. Wikidata. In this work, we focus on English and filter the dataset accordingly. This dataset is especially difficult for a language model to handle because there is no clear indication of how to link entities from a given question, like "Barack Obama", to their Wikidata entity ID (:Q76), giving rise to a whole field of research called Entity Linking <ref type="bibr" target="#b28">[29]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Fine-tuning</head><p>For each evaluation dataset individually, we fine-tune the selected models using PyTorch (100 epochs). Since a single fine-tuning run does not hold much statistical significance and involves random parameters, we perform isolated training runs a total of ten times. For each run we shuffle the training data with a predetermined random seed to make the results reproducible. Specifically, each run is given an ID from 𝑅01 to 𝑅10 and the seeds are generated by calculating the SHA512 sum of the ID and taking the first eight digits, so 𝑅01 results in the seed 99975818, 𝑅02 in 56899599, and so on.</p></div>
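The seed derivation can be reproduced in a few lines of Python. The sketch below reflects one plausible reading of the scheme, assuming that "the first eight digits" means the first eight decimal digits of the SHA512 hex digest (skipping the hex letters a-f); the authors' exact extraction rule may differ in detail.

```python
import hashlib

def run_seed(run_id: str) -> int:
    """Derive a reproducible data-shuffling seed from a run ID like 'R01'."""
    digest = hashlib.sha512(run_id.encode("utf-8")).hexdigest()
    # Keep the first eight decimal digits of the hex digest.
    first_eight = [c for c in digest if c.isdigit()][:8]
    return int("".join(first_eight))

# One deterministic seed per run, R01 through R10:
seeds = {f"R{i:02d}": run_seed(f"R{i:02d}") for i in range(1, 11)}
```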
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head><p>In the following subsections we only include those language models in the plots that generated at least one correct query. The T5 family did not generate a single correct query on the organizational graph, which is why it is absent from the result tables and figures; in fact, no T5 model produced a single correct result across all runs.</p><p>To generate the datapoints for each plot, we interrupted the training every five epochs and made the language models translate the questions from the evaluation split into SPARQL queries. We then executed the queries and compared the result sets to determine whether the answers were correct.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Organizational Graph</head><p>Figure <ref type="figure" target="#fig_0">1</ref> shows that all models from the BART and M2M100 families manage to learn the structure of the knowledge graph at least to a certain degree. Taking the best results for each model, all models aside from NLLB-200 turn at least eleven of the sixteen questions into correct SPARQL queries. The performance, however, fluctuates strongly during the course of the training, which is indicative of overfitting.</p><p>Repeating the experiment, we can see that performance varies somewhat depending on the order in which the training data is ingested into the network. The statistics are shown in table 3 and the raw data is plotted on the left side of figure <ref type="figure">3</ref>. We can see that for this dataset, BART-L performs best (as do the other sizes of BART), with M2M100 close behind. Another thing we see from the left plot in figure <ref type="figure">3</ref> is that, except for one outlier coming from mREBEL-L, the success of fine-tuning is reliable and reproducible.</p><p>Looking at common errors made during the translation, we found that the best models rarely generated SPARQL that could not be parsed; rather, they mixed up terms and injected parts of the training data into the queries. An example is shown in table <ref type="table">2</ref>: for the question "What is the surname of Bob Tanner?", the gold answer is SELECT ?surname WHERE { :bob foaf:surname ?surname . }, but the generated query SELECT ?surname WHERE { :charles foaf:firstName 'BobTanner' } has no binding variable, a wrong entity ID (:charles), a wrong property (foaf:firstName) and a wrong literal (BobTanner).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">CoyPu</head><p>In figure <ref type="figure">2</ref> we can see that the performance during the first run of the experiment varies less drastically than for the organizational graph. The standard deviations seen in table 5 are similar, though, so we think this is just a coincidence. Again, the (FLAN-)T5 models never generated even a single correct query, so they were excluded from subsequent runs.</p><p>We can also see that for this dataset, M2M100 outperformed the other models and BART-L is in fact one of the worst, which is a complete shift from the results before. This again shows that one should always evaluate more than a single model, since performance seems to be tied to the structure of the underlying fine-tuning data. Again, BART-L did not generate many parsing errors, but instead mixed up terms from the supply chain domain; see for example table <ref type="table" target="#tab_2">4</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Average number of correctly translated questions in the context of the organizational graph. Standard deviation is also shown as a percentage of the average to measure the reliability of the fine-tuning.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>On the other hand, the main errors that M2M100 made were generating SPARQL queries that could not be parsed (i.e. grammatically incorrect).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">QALD10</head><p>The language models have a really hard time with the QALD10 dataset. While the structure of the generated queries comes close to the correct ones, the models cannot handle the translation from entity names like Kobe Bryant to their corresponding Wikidata IDs like Q25369. Another problem here is that the QALD10 dataset requires the inclusion of all necessary prefix definitions as part of the query, which was not a requirement in either the organizational or the CoyPu graph dataset.</p><p>To provide some numbers for clarification: The best performing model was M2M100-418M. The validation dataset contains 394 questions, and only 104 were turned into SPARQL queries that could be parsed. Out of these 104 parsable queries, 51 returned an empty result set. All of the remaining queries except three used COUNT and returned 0 because the result set of the underlying query was empty; the final three returned wrong bindings. BART-L only managed to translate a single question into a valid SPARQL query, but the result set was not correct. Interestingly, mBART-L generated 101 parsable SPARQL queries, which makes it a close second to M2M100-418M. The error distribution is about the same as for M2M100-418M, though, so no question was answered correctly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion and Future Work</head><p>In this paper we have shown that fine-tuning language models for the translation of natural language to SPARQL queries is indeed possible, although there are some limitations, such as the requirement of a clear and concise mapping from entities in a question to entities in the knowledge graph, like Anne Miller to :anne instead of :person1234. If this requirement is met, both the BART family and the M2M100 family are able to fulfill this task.</p><p>There is a large number of avenues that can be explored from here on. First, we should find a better way to define the limit on the number of parameters that a model is allowed to have. Here we have focused on a maximum of one billion, as this is a limit for most consumer GPUs, but there is probably a connection between model size and SPARQL generation capabilities.</p><p>Secondly, we want to explore how these results can be used to deploy a fine-tuned language model next to a RAG agent to improve its question answering capabilities. So far, LLMs used by RAG agents often lack the ability to correctly apply aggregate functions, which could be remedied by offering the RAG agent a SPARQL query as another source of information.</p><p>Since all these models are open source, we can also modify them by manipulating existing layers as well as removing some or inserting new ones. This might be a way to reduce inference time and improve the performance even further. One could also derive completely new models from scratch, since most pre-training datasets are openly available and pre-training is fast due to the small size of the models.</p><p>On top of that, we still have the problem that both the organizational graph dataset and the CoyPu dataset were generated using GPT, which defeats the purpose of being independent from third parties. 
We will therefore also investigate how the training data can be generated with open-source LLMs like Falcon, Bloom and others, so that even this step of the pipeline can be executed locally. GPU memory is less of a constraint here, since the creation of the training and testing datasets is only done once, so it is not an issue if this step takes a bit longer.</p><p>The goal of this paper was to conduct a small survey of the out-of-the-box capabilities of readily available language models. What we have seen so far looks promising, and there is a lot of intriguing research to be done in the near future.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Number of correct SPARQL translations across epochs for the organizational graph for one fine-tuning run per model. 16 questions were presented.</figDesc><graphic coords="7,89.29,176.19,416.69,234.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: Number of correct SPARQL translations across epochs for the CoyPu dataset for one finetuning run per model. 26 questions were presented.</figDesc><graphic coords="9,89.29,84.19,416.69,234.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Model names and their number of parameters, as used in our experiments.</figDesc><table><row><cell>Name T5-Small T5-Base T5-Large FLAN-T5-Small FLAN-T5-Base FLAN-T5-Large</cell><cell>No. parameters 60.5M 223M 738M 77M 248M 783M</cell><cell>Name BART-Base BART-Large mBART-LARGE-50 mREBEL-Base mREBEL-Large M2M100-418M NLLB200-Distilled-600M</cell><cell>No. parameters 139M 406M 611M 484M 611M 418M 600M</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc></figDesc><table><row><cell>Model name</cell><cell>Average</cell><cell>Standard deviation</cell><cell>Std. dev percent</cell></row><row><cell>BART</cell><cell>12.90</cell><cell>1.14</cell><cell>8.80</cell></row><row><cell>BART-L</cell><cell>13.30</cell><cell>0.64</cell><cell>4.81</cell></row><row><cell>mBART-50</cell><cell>12.80</cell><cell>0.75</cell><cell>5.85</cell></row><row><cell>mREBEL-L</cell><cell>11.10</cell><cell>1.92</cell><cell>17.31</cell></row><row><cell>M2M100</cell><cell>12.50</cell><cell>1.02</cell><cell>8.20</cell></row><row><cell>NLLB-200</cell><cell>8.20</cell><cell>0.75</cell><cell>9.13</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 4</head><label>4</label><figDesc>An example of the errors that were made in the context of the CoyPu graph: latitude and longitude got mixed up. The ellipsis in the IRI was inserted by us to keep the line shorter; in fact, the language model generated the correct IRI to use in the query.</figDesc><table><row><cell>Question</cell><cell>What is the latitude of the port with the ID 'AUDKB'?</cell></row><row><cell>Gold answer</cell><cell>SELECT ?latitude WHERE { &lt;https://data.coypu.org/... ns2:hasLatitude ?latitude }</cell></row><row><cell>Generated query</cell><cell>SELECT ?latitude WHERE { &lt;https://data.coypu.org/... ns2:hasLatitude ?longitude }</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 5</head><label>5</label><figDesc>Average number of correctly translated questions in the context of the CoyPu graph. Standard deviation is also shown as a percentage of the average to measure the reliability of the fine-tuning.</figDesc><table><row><cell cols="4">Model name Average Standard deviation Std. dev percent</cell></row><row><cell>BART</cell><cell>11.90</cell><cell>0.83</cell><cell>6.98</cell></row><row><cell>BART-L</cell><cell>14.00</cell><cell>1.18</cell><cell>8.45</cell></row><row><cell>mBART-50</cell><cell>18.50</cell><cell>1.20</cell><cell>6.51</cell></row><row><cell>mREBEL-L</cell><cell>17.50</cell><cell>1.02</cell><cell>5.86</cell></row><row><cell>M2M100</cell><cell>19.30</cell><cell>0.90</cell><cell>4.66</cell></row><row><cell>NLLB-200</cell><cell>17.80</cell><cell>0.87</cell><cell>4.90</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_0">https://huggingface.co/abacusai/Smaug-72B-v0.1</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_1">https://www.semafor.com/article/03/24/2023/the-secret-history-of-elon-musk-sam-altman-and-openai</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_2">https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_3">https://www.theverge.com/2023/9/12/23870547/unit-price-change-game-development</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_4">https://edition.cnn.com/2024/03/04/business/red-sea-cables-cut-internet/index.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_5">https://huggingface.co/defog/sqlcoder-7b-2</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_6">https://store.steampowered.com/hwsurvey/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_7">https://huggingface.co/facebook/m2m100_418M</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_8">https://huggingface.co/facebook/nllb-200-distilled-600M</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_9">https://coypu.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_10">https://www.nliwod.org/challenge</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was partially supported by grants from the German Federal Ministry for Economic Affairs and Climate Action (BMWK) to the CoyPu project (01MK21007A) and KISS project (01MK22001A) as well as from the German Federal Ministry of Education and Research (BMBF) to the project StahlDigital (13XP5116B).</p></div>
			</div>



			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Evans</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2109.07958</idno>
		<title level="m">Truthfulqa: Measuring how models mimic human falsehoods</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Hellaswag: Can a machine really finish your sentence?</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zellers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holtzman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bisk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1905.07830</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Chollet</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1911.01547</idno>
		<title level="m">On the measure of intelligence</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><surname>Beeching</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fourrier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Habib</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lambert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rajani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Sanseviero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Tunstall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<ptr target="https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard" />
		<title level="m">Open llm leaderboard</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><surname>BigScience Workshop</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2211.05100</idno>
		<title level="m">Bloom: A 176b-parameter open-access multilingual language model</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Llm-assisted knowledge graph engineering: Experiments with chatgpt</title>
		<author>
			<persName><forename type="first">L.-P</forename><surname>Meyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Stadler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Frey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Radtke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Junghanns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Meissner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Dziwis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bulert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martin</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-658-43705-3_8</idno>
		<idno type="arXiv">arXiv:2307.06917</idno>
	</analytic>
	<monogr>
		<title level="m">First Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow (AITomorrow) 2023</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Zinke-Wehlmann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Friedrich</surname></persName>
		</editor>
		<imprint>
			<publisher>Informatik aktuell</publisher>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Developing a scalable benchmark for assessing large language models in knowledge graph engineering</title>
		<author>
			<persName><forename type="first">L.-P</forename><surname>Meyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Frey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Junghanns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Brei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bulert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gründer-Fahrer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3526/paper-04.pdf" />
		<idno type="arXiv">arXiv:2308.16622</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems (SEMANTICS 2023)</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Keshan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Neumaier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Gentile</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Vahdati</surname></persName>
		</editor>
		<meeting>the Posters and Demo Track of the 19th International Conference on Semantic Systems (SEMANTICS 2023)</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Benchmarking the abilities of large language models for RDF knowledge graph creation and comprehension: How well do llms speak turtle?</title>
		<author>
			<persName><forename type="first">J</forename><surname>Frey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Meyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Arndt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Brei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bulert</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3559/paper-3.pdf" />
		<idno type="arXiv">arXiv:2309.17122</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Workshop on Deep Learning for Knowledge Graphs (DL4KG 2023) co-located with the 21th International Semantic Web Conference (ISWC 2023)</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Alam</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Cochez</surname></persName>
		</editor>
		<meeting>the Workshop on Deep Learning for Knowledge Graphs (DL4KG 2023) co-located with the 21th International Semantic Web Conference (ISWC 2023)<address><addrLine>Athens</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">November 6-10, 2023</date>
			<biblScope unit="volume">3559</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Assessing the evolution of llm capabilities for knowledge graph engineering in 2023</title>
		<author>
			<persName><forename type="first">J</forename><surname>Frey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L.-P</forename><surname>Meyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Brei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gründer-Fahrer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martin</surname></persName>
		</author>
		<ptr target="https://www.researchgate.net/publication/378804553_Assessing_the_Evolution_of_LLM_capabilities_for_Knowledge_Graph_Engineering_in_2023" />
	</analytic>
	<monogr>
		<title level="m">ESWC 2024 Satellite Events</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Meroño-Peñuela</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Corcho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Groth</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Simperl</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Tamma</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Nuzzolese</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Poveda-Villalón</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Sabou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Presutti</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Celino</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Revenko</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Raad</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Sartini</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Lisena</surname></persName>
		</editor>
		<meeting><address><addrLine>Hersonissos, Crete, Greece</addrLine></address></meeting>
		<imprint>
			<publisher>Proceedings</publisher>
			<date type="published" when="2024-05-26">May 26-30, 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Towards self-configuring knowledge graph construction pipelines using llms -a case study with rml</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Frey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rahm</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fifth International Workshop on Knowledge Graph Construction @ ESWC2024</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>De Farias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Sima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kobayashi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.04627</idno>
		<title level="m">Sparql generation: an analysis on fine-tuning openllama for question answering over a life science knowledge graph</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Fine-tuned LLMs know more, hallucinate less with few-shot sequence-to-sequence semantic parsing over Wikidata</title>
		<author>
			<persName><forename type="first">S</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Culhane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pertseva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-H</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Semnani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lam</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.emnlp-main.353</idno>
		<ptr target="https://aclanthology.org/2023.emnlp-main.353" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Bouamor</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Pino</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Bali</surname></persName>
		</editor>
		<meeting>the 2023 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics<address><addrLine>Singapore</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="5778" to="5791" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Modern baselines for sparql semantic parsing</title>
		<author>
			<persName><forename type="first">D</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">A</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Kaur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Usbeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Biemann</surname></persName>
		</author>
		<idno type="DOI">10.1145/3477495.3531841</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;22</title>
				<meeting>the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;22</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">A comprehensive evaluation of neural sparql query generation from natural language questions</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">A K K</forename><surname>Diallo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Reyd</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zouaq</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2304.07772</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Duan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.12060</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:261076103" />
		<title level="m">Flexkbqa: A flexible llm-powered framework for few-shot knowledge base question answering</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Sparql generation with entity pre-trained gpt for kg question answering</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bustamante</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Takeda</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.00969</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:267406567" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Sgpt: A generative approach for sparql query generation from natural language questions</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R A H</forename><surname>Rony</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Teucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovriguina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lehmann</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2022.3188714</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="70712" to="70723" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Exploring the limits of transfer learning with a unified text-to-text transformer</title>
		<author>
			<persName><forename type="first">C</forename><surname>Raffel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Narang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Matena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Liu</surname></persName>
		</author>
		<ptr target="http://jmlr.org/papers/v21/20-074.html" />
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="1" to="67" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Scaling instruction-finetuned language models</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">W</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Longpre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zoph</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Fedus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dehghani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Brahma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Webson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suzgun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Chowdhery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Narang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Petrov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Chi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roberts</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2210.11416</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Finetuned language models are zero-shot learners</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bosma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Guu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=gEZrGCozdqR" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ghazvininejad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.13461</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Language models are unsupervised multitask learners</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Luan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2008.00401</idno>
		<title level="m">Multilingual translation with extensible multilingual pretraining and finetuning</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">REDFM: A filtered and multilingual relation extraction dataset</title>
		<author>
			<persName><forename type="first">P.-L</forename><surname>Huguet Cabot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tedeschi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-C</forename><surname>Ngonga Ngomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/2306.09802" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: ACL 2023</title>
				<meeting>the 61st Annual Meeting of the Association for Computational Linguistics: ACL 2023<address><addrLine>Toronto, Canada</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bhosale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schwenk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>El-Kishky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Baines</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Celebi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wenzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Birch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Liptchinsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Edunov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Auli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2010.11125</idno>
		<title level="m">Beyond english-centric multilingual machine translation</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">NLLB</forename><surname>Team</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Costa-Jussà</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cross</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Çelebi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Elbayad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Heafield</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Heffernan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kalbassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Licht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Maillard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wenzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Youngblood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Akula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Barrault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hansanti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hoffman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jarrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Sadagopan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rowe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Spruit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Andrews</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">F</forename><surname>Ayan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bhosale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Edunov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Goswami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guzmán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Koehn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mourachko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ropers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Saleem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Schwenk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2207.04672</idno>
		<title level="m">No language left behind: Scaling human-centered machine translation</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Qald-9-plus: A multilingual dataset for question answering over dbpedia and wikidata translated by native speakers</title>
		<author>
			<persName><forename type="first">A</forename><surname>Perevalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Diefenbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Usbeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Both</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICSC52841.2022.00045</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE 16th International Conference on Semantic Computing (ICSC)</title>
				<meeting><address><addrLine>Los Alamitos, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="229" to="234" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Entity linking and filling for question answering over knowledge graphs</title>
		<author>
			<persName><forename type="first">D</forename><surname>Diomedi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NLIWoD2022</title>
				<meeting>NLIWoD2022</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<idno type="DOI">10.5281/zenodo.10996425</idno>
		<ptr target="https://github.com/AKSW/LMs4Text2SPARQL" />
		<title level="m">Online resources: source code for the training, the organizational graph dataset and the CoyPu dataset</title>
				<imprint/>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
