<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Facilitating Search of the Virtual Record Treasury of Ireland Knowledge Graph using ChatGPT</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alex</forename><surname>Randles</surname></persName>
							<email>alex.randles@adaptcentre.ie</email>
							<affiliation key="aff0">
								<orgName type="department">ADAPT Centre for Digital Content</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lucy</forename><surname>McKenna</surname></persName>
							<email>lucy.mckenna@adaptcentre.ie</email>
							<affiliation key="aff0">
								<orgName type="department">ADAPT Centre for Digital Content</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lynn</forename><surname>Kilgallon</surname></persName>
							<email>kilgall@tcd.ie</email>
							<affiliation key="aff1">
								<orgName type="department">Department of History</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Beyza</forename><surname>Yaman</surname></persName>
							<email>beyza.yaman@adaptcentre.ie</email>
							<affiliation key="aff0">
								<orgName type="department">ADAPT Centre for Digital Content</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Peter</forename><surname>Crooks</surname></persName>
							<email>pcrooks@tcd.ie</email>
							<affiliation key="aff1">
								<orgName type="department">Department of History</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Declan</forename><surname>O'Sullivan</surname></persName>
							<email>declan.osullivan@adaptcentre.ie</email>
							<affiliation key="aff0">
								<orgName type="department">ADAPT Centre for Digital Content</orgName>
								<orgName type="institution">Trinity College Dublin</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="department">International Conference on Semantic Systems</orgName>
								<address>
									<settlement>Amsterdam</settlement>
									<country key="NL">Netherlands</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Facilitating Search of the Virtual Record Treasury of Ireland Knowledge Graph using ChatGPT</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">E40176EC42F93DA2E6E5707A5109B354</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:46+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>KG Search</term>
					<term>User Interface</term>
					<term>ChatGPT</term>
					<term>0000-0001-6231-3801 (A. Randles)</term>
					<term>0000-0002-6035-7656 (L. McKenna)</term>
					<term>0000-0002-3075-8571 (L. Kilgallon)</term>
					<term>0000-0003-2130-0312 (B. Yaman)</term>
					<term>0000-0001-6782-044X (P. Crooks)</term>
					<term>0000-0003-1090-3548 (D. O&apos;Sullivan)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Virtual Record Treasury of Ireland (VRTI) is an initiative to digitally recreate the contents of the Irish central archive, which was destroyed during the Civil War. The project has created a Knowledge Graph (KG) to facilitate information discovery and reasoning over the recovered items. However, retrieving data from the KG requires complex queries, which demand a high level of technical expertise. In this paper, we explore the application of Large Language Models (LLMs) to facilitate searching of the VRTI-KG by users who lack this technical expertise and to decrease the workload for those who have it. We propose the VRTI-ChatGPT framework, which uses ChatGPT to interpret requests from users and to facilitate the creation of queries that can be executed on the KG.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The Virtual Record Treasury of Ireland (VRTI) <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref> is a state-funded programme hosted at Trinity College Dublin. The VRTI began with the objective of digitally reconstructing archival records destroyed during the 1922 Irish Civil War <ref type="bibr" target="#b1">[2]</ref>. A fire during the war destroyed the Public Record Office of Ireland and damaged records dating back more than 700 years. The staff at the time spent months recovering documents, which were recreated a century later. The initial VRTI Knowledge Graph (KG) was created as a result of the lead project named Beyond 2022. The VRTI-KG contains notable information about people, places, roles, organisations and their interconnections from Irish history. Representing this information using a KG allows the integration of heterogeneous source data formats and supports reasoning and inference over the data. KGs have already been applied to a range of scenarios, from event-based networks <ref type="bibr" target="#b3">[4]</ref> to, more recently, climate-action-related applications <ref type="bibr" target="#b4">[5]</ref>. The KG was implemented using RDF, which means data must be retrieved using SPARQL <ref type="bibr" target="#b5">[6]</ref> queries. Creating these queries is time-consuming and requires an understanding of the SPARQL query language and of the structure of the relevant schemas. Many of the historians who would interact with the VRTI-KG do not possess the technical expertise to create these complex queries. Large Language Models (LLMs) such as ChatGPT <ref type="bibr" target="#b6">[7]</ref> provide functionality which could allow the data in the VRTI-KG to be easily searched and the results presented using natural language. ChatGPT was chosen for the proposed approach because it provided the best results in early experimentation. 
With the emergence of generative AI, we are interested in exploring what benefits it can have for the VRTI-KG system <ref type="bibr" target="#b0">[1]</ref>. However, it is important to ensure that the proposed application of generative AI to the VRTI-KG is constrained to information contained in the VRTI and does not pollute responses with external information on the requested topic. In order to explore how generative AI could be applied, we propose the VRTI-ChatGPT framework, which was designed to facilitate searching of the VRTI-KG through natural language questions and answers. The framework uses strict prompt templates to interact with ChatGPT in order to process the user's input and form sentences from KG query results. A recent survey <ref type="bibr" target="#b7">[8]</ref> has highlighted the importance of providing straightforward interaction between semantic interfaces and respective domain experts. The survey compared 28 interfaces based on interaction paradigm, information being displayed, and strategies used to improve the understanding of information. The survey concluded that many of these approaches still require some level of technical expertise to be used effectively, which some domain experts may lack. It is hoped natural language interaction can bridge the gap between domain experts (historians) and the diverse data held in the VRTI-KG. We experimented with an existing tool designed for KG natural language querying by Ontotext<ref type="foot" target="#foot_0">1</ref> before deciding to create a bespoke solution. The tool uses LLMs to create SPARQL queries from a provided ontology and natural language question. The endpoint of the VRTI-KG and its ontology were provided to the tool; however, it struggled to create syntactically correct queries for most test cases. The incorrect queries could be a result of the complex structure of the VRTI ontology. 
Using an approach involving query templates ensures that the query created from the natural language question is syntactically correct and retrieves all of the information required to provide a sufficient response. The query templates used in the framework are configurable, which, it is hoped, will allow the approach to be customised for other KGs in future. Early observations from the historians in the VRTI have been positive when the prompts involved are strictly constrained so that ChatGPT does not make inferences on the provided information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">VRTI-ChatGPT framework</head><p>This section discusses the design and implementation of the VRTI-ChatGPT framework. The implementation of the framework is available online<ref type="foot" target="#foot_1">2</ref>. The framework is configurable to allow changes in the VRTI-KG to be easily synchronised with the prompts and queries involved. Figure <ref type="figure">1</ref> presents an overview of the activities of the framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 1: Workflow of the VRTI-ChatGPT framework</head><p>Several Python libraries<ref type="foot" target="#foot_2">3</ref> were used to implement the framework. Flask is a customizable web framework which was used to create the web application. SPARQLWrapper is used to execute queries on the endpoint of the VRTI-KG. The OpenAI library is used to communicate with ChatGPT 4.0<ref type="foot" target="#foot_3">4</ref>. Python string formatting is used to create prompts and queries from the templates. Figure <ref type="figure" target="#fig_0">2</ref> presents search results displayed on the implementation.</p><p>Initial Processing of User's Input. First, the user enters a question into the search bar (A, Figure <ref type="figure" target="#fig_0">2</ref>) or selects a suggested question. For instance, a question could ask about a specific person (&lt;person&gt;), such as "Tell me about &lt;person&gt;?", "Where and when was &lt;person&gt; born?", "Was &lt;person&gt; in the army?" or "What job did &lt;person&gt; have?". For the running example, the user inputs "Tell me about Michael Collins". Michael Collins<ref type="foot" target="#foot_4">5</ref> is a notable Irish person who was involved in the Irish Civil War. The question is inserted into a prompt template that asks ChatGPT to extract the names of people and places from the user's input. The generated prompt is "Extract the names of people and places in this text 'Tell me about Michael Collins' and output the result into a JSON dictionary". Then, the prompt is sent to ChatGPT 4.0 using a request carried out by the OpenAI library. The result is a dictionary containing key-value pairs of names of people and places, which is stored in memory.</p><p>Creation of SPARQL query. The extracted entity ("Michael Collins") is inserted into a SPARQL <ref type="bibr" target="#b5">[6]</ref> query template<ref type="foot" target="#foot_5">6</ref> defined in the configuration file. 
The insertion involves translating the key-value pairs from the created JSON dictionary into FILTER conditions (FILTER CONTAINS(?Name, "Michael Collins")) using string formatting methods. The query retrieves resources with a matching name along with their related properties, such as birth date and place. Thereafter, the query is executed on the VRTI-KG using the SPARQLWrapper library to retrieve matching resources. The query results are represented in dictionary format.</p><p>Creation of Natural Language Response. The dictionary containing the query result is inserted into a prompt template to generate the natural language answer. For this example, the generated prompt is "Answer this question 'Tell me about Michael Collins' using only the information in this dictionary '{Person: &lt;….&gt;, Occupation: &lt;…&gt;, BirthDate: "…"}'. Do not include any external information in the answer.". The prompt template is designed to constrain ChatGPT to use only the information in the query results from the VRTI-KG rather than external information it has on the topic. The response (B, Figure <ref type="figure" target="#fig_0">2</ref>) from ChatGPT is then displayed on the interface. In addition, the URIs of the resources returned by the query are presented in a tabular format (C, Figure <ref type="figure" target="#fig_0">2</ref>), which allows further exploration within the application.</p></div>
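<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three string-formatting steps described above can be illustrated with a minimal Python sketch. It is not the framework's actual code. The two prompt templates reuse the wording quoted in this section, but the function names, the shape of the extracted dictionary (keys "people" and "places"), the rdfs:label pattern in the query template, and the choice to combine several names with a logical OR are all assumptions made for illustration. The calls to ChatGPT and to the SPARQL endpoint are deliberately left out, so the sketch only shows how prompts and queries could be assembled.</p>
<p><code lang="python"><![CDATA[
```python
import json

# Prompt wording follows the examples quoted in the text; the placeholder
# names ({question}, {results}) are assumptions.
EXTRACT_TEMPLATE = (
    "Extract the names of people and places in this text '{question}' "
    "and output the result into a JSON dictionary"
)
ANSWER_TEMPLATE = (
    "Answer this question '{question}' using only the information in this "
    "dictionary '{results}'. Do not include any external information in the answer."
)

# Hypothetical query template: rdfs:label stands in for the VRTI ontology,
# which is not reproduced here. Doubled braces survive str.format().
QUERY_TEMPLATE = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?Name WHERE {{
  ?resource rdfs:label ?Name .
  {filters}
}}"""


def escape_literal(value):
    """Escape backslashes and double quotes so a name stays inside its SPARQL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def collect_names(entities):
    """Flatten the extracted dictionary (values may be strings or lists) into a list of names."""
    names = []
    for value in entities.values():
        items = value if isinstance(value, list) else [value]
        names.extend(str(item) for item in items if item)
    return names


def build_filter(entities):
    """Translate the extracted names into a single FILTER with OR-combined CONTAINS tests."""
    names = collect_names(entities)
    if not names:
        raise ValueError("no people or places were extracted from the question")
    conditions = " || ".join(
        'CONTAINS(?Name, "{}")'.format(escape_literal(name)) for name in names
    )
    return "FILTER ({})".format(conditions)


def build_query(entities):
    """Insert the FILTER condition into the query template."""
    return QUERY_TEMPLATE.format(filters=build_filter(entities))


def build_extraction_prompt(question):
    """Create the prompt that asks the LLM to extract people and places."""
    return EXTRACT_TEMPLATE.format(question=question)


def build_answer_prompt(question, results):
    """Create the prompt that restricts the answer to the query results."""
    return ANSWER_TEMPLATE.format(
        question=question, results=json.dumps(results, ensure_ascii=False)
    )


if __name__ == "__main__":
    question = "Tell me about Michael Collins"
    print(build_extraction_prompt(question))
    # Assumed shape of the dictionary returned by the extraction step.
    entities = {"people": ["Michael Collins"], "places": []}
    print(build_query(entities))
```
]]></code></p>
<p>Escaping the extracted names before inserting them into the query is a precaution added in this sketch: because the names originate from free-text user input, an unescaped double quote could otherwise produce a malformed query.</p></div>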
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Future Work and Conclusion</head><p>Future work includes usability testing of the framework with a cohort of historians. The testing will allow the user requirements to be refined and validated. Participants will interact with the framework to complete several tasks which mimic the expected user interaction. In addition, it is hoped to configure the framework to answer questions from information stored in other KGs.</p><p>The VRTI-ChatGPT framework proposed in this paper provides a possible direction for the integration of generative AI, such as LLMs, in the VRTI-KG system <ref type="bibr" target="#b0">[1]</ref>. It is hoped the proposed approach can facilitate searching by users who lack relevant technical expertise, thus reducing workload and improving the uptake of information by domain experts.</p><p>Finally, it is hoped that the prompts used by the framework will provide guidance for researchers who propose similar approaches.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Screenshot of implementation displaying results for "Tell me about Michael Collins"</figDesc><graphic coords="3,85.05,372.19,424.90,248.55" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://www.ontotext.com/blog/natural-language-querying-of-graphdb-in-langchain/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://vrti-graph.adaptcentre.ie/gpt-search</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://github.com/alex-randles/VRTI-ChatGPT/blob/main/libraries.pdf</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://openai.com/index/gpt-4/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://kb.virtualtreasury.ie/person/Collins_Michael_c20_dib_a1860</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">https://github.com/alex-randles/VRTI-ChatGPT/blob/main/sample-query.rq</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>Virtual Record Treasury of Ireland (VRTI) is funded by the Government of Ireland, through the Department of Tourism, Culture, Arts, Gaeltacht, Sport and Media, under the Project Ireland 2040 framework. The project is also partially supported by the ADAPT Centre for Digital Content Technology under the SFI Research Centres Programme (Grant 13/RC/2106_P2).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Digital Prosopography Information in Virtual Record Treasury of Ireland Knowledge Graph</title>
		<author>
			<persName><forename type="first">B</forename><surname>Yaman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>McKenna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Randles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kilgallon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Crooks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>O'Sullivan</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3724/paper2.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st International Workshop of Semantic Digital Humanities (SemDH) Co-Located with the 21st Extended Semantic Web Conference</title>
				<meeting>the 1st International Workshop of Semantic Digital Humanities (SemDH) Co-Located with the 21st Extended Semantic Web Conference</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Virtual Record Treasury of Ireland: A century of Recovery from the 1922 Four Courts Blaze -and Beyond</title>
		<author>
			<persName><forename type="first">P</forename><surname>Crooks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Reid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wallace</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Hist Irel</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="38" to="41" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Creating a Knowledge Graph for Ireland&apos;s Lost History: Knowledge Engineering and Curation in the Beyond 2022 Project</title>
		<author>
			<persName><forename type="first">C</forename><surname>Debruyne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Munnelly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kilgallon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>O'Sullivan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Crooks</surname></persName>
		</author>
		<idno type="DOI">10.1145/3474829</idno>
		<ptr target="https://doi.org/10.1145/3474829" />
	</analytic>
	<monogr>
		<title level="j">J. Comput. Cult. Herit</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Extending Siena to support more expressive and flexible subscriptions</title>
		<author>
			<persName><forename type="first">J</forename><surname>Keeney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Roblek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>O'Sullivan</surname></persName>
		</author>
		<idno type="DOI">10.1145/1385989.1385995</idno>
		<ptr target="https://doi.org/10.1145/1385989.1385995" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second International Conference on Distributed Event-Based Systems</title>
				<meeting>the Second International Conference on Distributed Event-Based Systems<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="35" to="46" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">An ontology model for climatic data analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Orlandi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>O'Sullivan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dev</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Geoscience and Remote Sensing Symposium IGARSS</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="5739" to="5742" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">SPARQL 1.1 Query Language, World Wide Web Consortium (W3C) Recommendation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Harris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Seaborne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Prud'hommeaux</surname></persName>
		</author>
		<ptr target="https://www.w3.org/TR/sparql11-query/" />
		<imprint>
			<date type="published" when="2013-04-01">2013. April 1, 2023</date>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page">778</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A brief overview of ChatGPT: The history, status quo and potential future development</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q.-L</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE/CAA Journal of Automatica Sinica</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="1122" to="1136" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Linked Data interfaces: a survey</title>
		<author>
			<persName><forename type="first">E</forename><surname>Bernasconi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Miguel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mecella</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">19th Conference on Information and Research Science Connecting to Digital and Library Science</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="1" to="16" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
