<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Sparklis: a SPARQL Endpoint Explorer for Expressive Question Answering</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Sébastien</forename><surname>Ferré</surname></persName>
							<email>ferre@irisa.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">IRISA</orgName>
								<orgName type="institution">Université de Rennes 1</orgName>
								<address>
									<addrLine>Campus de Beaulieu</addrLine>
									<postCode>35042</postCode>
									<settlement>Rennes cedex</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Sparklis: a SPARQL Endpoint Explorer for Expressive Question Answering</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">E5BD36C20C6274DEBFA9547A6F918611</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Sparklis is a Semantic Web tool that helps users explore SPARQL endpoints by guiding them in the interactive building of questions and answers, from simple ones to complex ones. It combines the fine-grained guidance of faceted search, most of the expressivity of SPARQL, and the readability of (controlled) natural languages. No endpoint-specific configuration is necessary, and no knowledge of SPARQL and the data schema is required from users. This demonstration paper is a companion to the research paper <ref type="bibr" target="#b1">[2]</ref>.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Motivation</head><p>A wealth of semantic data is accessible through SPARQL endpoints. DBpedia alone contains several billion triples covering all sorts of topics (e.g., people, places, buildings, species, films, books). Although different endpoints may use different vocabularies and ontologies, they all share a common interface to access and retrieve semantic data: the SPARQL query language. In addition to being a widely adopted W3C standard, SPARQL offers expressivity, especially since version 1.1, and scalability to large RDF stores thanks to highly optimized SPARQL engines (e.g., Virtuoso, Jena TDB). Its main drawback is that writing SPARQL queries is tedious and error-prone, which leaves the language largely inaccessible to most potential users of semantic data.</p><p>Our motivation in developing Sparklis<ref type="foot" target="#foot_0">1</ref>, shared by many other developers of Semantic Web tools and applications, is to unlock access to semantic data by making it easier to define and send SPARQL queries to endpoints. The novelty of Sparklis is that it integrates different search paradigms: Faceted Search (FS), Query Builders (QB), and Natural Language Interfaces (NLI). That integration is the key to reconciling properties that existing systems generally trade off against each other: user guidance, expressivity, readability of queries, scalability, and portability to different endpoints <ref type="bibr" target="#b1">[2]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Principles</head><p>Sparklis re-uses and generalizes the interaction model of Faceted Search (FS) <ref type="bibr" target="#b7">[8]</ref>, where users are guided step-by-step in the selection of items. At each step, the system gives a set of suggestions to refine the current selection, and users only have to pick a suggestion according to their preferences. The suggestions are specific to the selection, and therefore support exploratory search <ref type="bibr" target="#b6">[7]</ref> by providing overview and feedback during the search process.</p><p>To overcome the expressivity limitations of FS and its existing extensions for the Semantic Web (e.g., gFacet <ref type="bibr" target="#b3">[4]</ref>, VisiNav <ref type="bibr" target="#b2">[3]</ref>, SemFacet <ref type="bibr" target="#b0">[1]</ref>), we have generalized it to Query-based Faceted Search (QFS), where the selection of items is replaced by a structured query. The latter is built step-by-step through the successive choices of the user. This makes Sparklis a kind of Query Builder (QB), like SemanticCrystal <ref type="bibr" target="#b4">[5]</ref>. QBs have the advantage of allowing high expressivity while assisting users with syntax, e.g., by avoiding syntax errors and listing eligible constructs. However, the FS-based guidance of Sparklis is more fine-grained than in QBs. Sparklis avoids vocabulary errors by retrieving the URIs and literals right from the SPARQL endpoint. It need not be configured for a particular dataset, and it dynamically discovers the data schema. In fact, Sparklis only allows the building of queries that do return results, which prevents users from ending up with empty results. This is because system suggestions are computed from the individual results, not from their common class. In this sense, Sparklis is as much about building answers as about building questions.</p><p>To overcome the lack of readability of SPARQL queries for most users, Sparklis queries and suggestions are verbalized in natural language so that SPARQL queries never need to be shown to users. This makes Sparklis a kind of Natural Language Interface (NLI), like PowerAqua <ref type="bibr" target="#b5">[6]</ref>. The important difference is that questions are built through successive user choices in Sparklis instead of being freely input as in NLIs. Sparklis interaction makes question formulation more constrained, slower, and less spontaneous, but it provides guidance and safety with intermediate answers and suggestions at each step. Moreover, it avoids the hard problems of NL understanding, i.e., ambiguities and out-of-scope questions. A few NLI systems, like Ginseng <ref type="bibr" target="#b4">[5]</ref>, are based on a controlled NL and autocompletion to suggest the next words in a question. However, their suggestions are not as fine-grained as with FS, and they are less flexible because they only apply to the end of the question. In Sparklis, questions form complete sentences at any step of the search; suggestions are not words but meaningful phrases (e.g., that has a director), and they can be inserted at any position in the current question.</p></div>
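<div xmlns="http://www.tei-c.org/ns/1.0"><p>The following fragment is a minimal sketch of why suggestions computed from the individual results cannot lead to an empty answer. It works on a small in-memory set of triples rather than on a SPARQL endpoint. The data, the function names suggest_properties and refine, and the restriction to property suggestions are illustrative assumptions, not the Sparklis implementation.</p>
```python
# Toy data: a set of (subject, property, object) triples.
# All names are hypothetical and chosen only for illustration.
TRIPLES = {
    ("Hugo", "type", "Writer"),
    ("Hugo", "nationality", "French"),
    ("Dickens", "type", "Writer"),
    ("Dickens", "birthPlace", "Portsmouth"),
    ("Paris", "type", "City"),
    ("Paris", "country", "France"),
}


def suggest_properties(results):
    """Return the properties held by at least one item of the current results."""
    return sorted({p for (s, p, o) in TRIPLES if s in results})


def refine(results, prop):
    """Keep only the results that have some value for the chosen property."""
    return {s for (s, p, o) in TRIPLES if s in results and p == prop}


# Current selection: every item typed as a Writer.
writers = {s for (s, p, o) in TRIPLES if p == "type" and o == "Writer"}

# Suggestions come from the results themselves, so "country" (which only
# applies to Paris) is never offered, and every offered refinement keeps
# at least one result.
suggestions = suggest_properties(writers)
```
<p>In this sketch, a property such as birthPlace is suggested even though only one writer has it, because suggestions depend on the individual results and not on a declared schema for the Writer class.</p></div>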
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">User Interface and Interaction</head><p>Figure <ref type="figure" target="#fig_0">1</ref> is a Sparklis screenshot taken during an exploration of book writers in DBpedia. From top to bottom, the user interface contains (1) navigation buttons and the endpoint URL, (2) the current question and the current focus as a subphrase (highlighted in green), (3) three lists of suggestions for insertion at the focus, and (4) the table of answers. The question and answers shown were built in 10 steps (8 insertions and 2 focus moves): a Writer/that has a birthDate/after 1800/focus on a Writer/that is the author of something/a Book/a number of/the highest-to-lowest/focus on a Writer/that has a nationality. Note that different insertion orderings are possible for the same question. Navigation buttons allow moving backward and forward in the construction history. A permalink to the current navigation state (endpoint+question) can be generated at any time. To switch to another SPARQL endpoint, it is enough to enter its URL in the entry field. The query focus is moved simply by clicking on different parts of the question, or on different table column headers. Every suggestion in the three lists, as well as every table cell, can be inserted or applied to the current focus by clicking it. The first suggestion list contains entities (individuals and literals). The second list contains concepts (classes and properties). The third list contains logical connectives, sorting modifiers, and aggregation operators. Each suggestion list is equipped with an immediate-feedback filtering mechanism to quickly locate suggestions in long lists. With the first list, filters can be inserted into the query with different filter operators listed in a drop-down menu (e.g., matches, higher or equal than, before). Questions and suggestions use indentation to disambiguate different possible groupings and improve readability, and syntax coloring to distinguish between the different kinds of words.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Performance and Limitations</head><p>Portability. Sparklis conforms to the SPARQL standard, and requires no preprocessing or configuration to explore an endpoint. It relies entirely on the endpoint to discover data and its schema. The main limitation is that URIs are displayed through their local names, which are not always readable.</p><p>Expressivity. Sparklis covers many features of SPARQL: basic graph patterns (including cycles), basic filters, UNION, OPTIONAL, NOT EXISTS, SELECT, ORDER BY, and multiple aggregations with GROUP BY. Almost all queries of the QALD<ref type="foot" target="#foot_1">2</ref> challenge can be answered. Features not covered are expressions, named graphs, nested queries, queries returning RDF graphs, and updates.</p><p>Scalability. Sparklis is responsive on the largest well-known endpoint: DBpedia. Among the 100 QALD-3 questions, half can be answered in less than 30 seconds (wall-clock time including user interaction and system computations).</p></div>
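<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate the portability limitation about local names, the sketch below extracts a local name with a common heuristic: keep whatever follows the last '#' or '/' of the URI. The function name local_name and the two example URIs are assumptions made for illustration; the exact rule applied by Sparklis is not specified here.</p>
```python
def local_name(uri):
    """Return the part of a URI after its last '#' or '/' (heuristic)."""
    # rfind returns -1 when the character is absent, so max() picks
    # whichever separator occurs last, or -1 if there is none.
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[cut + 1:]


# A local name that happens to be a readable word.
readable = local_name("http://dbpedia.org/ontology/birthDate")

# A local name that is an opaque identifier, hence hard to read as is.
opaque = local_name("http://www.wikidata.org/entity/Q42")
```
<p>The first call yields a word that users can interpret, whereas the second yields an identifier that carries no meaning by itself, which is the situation the limitation refers to.</p></div>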
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Demonstration</head><p>The demonstration showed participants how QALD questions over DBpedia can be answered in a step-by-step process. Those questions cover various retrieval tasks: basic facts (Give me the homepage of Forbes), entity lists (Which rivers flow into a German lake?), counts (How many languages are spoken in Colombia?), and optimums (Which of Tim Burton's films had the highest budget?). More complex analytical question answering was also demonstrated (Give me the total runtime, from highest to lowest, of films per director and per country). Participants were also given the opportunity to explore any SPARQL endpoint of their choice.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. Sparklis screenshot: a list of writers with their birth date (after 1800), nationality, and (decreasing) number of written books. The current focus is on the writer's nationality.</figDesc><graphic coords="3,152.06,116.83,311.24,217.10" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Online at http://www.irisa.fr/LIS/ferre/sparklis/osparklis.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://greententacle.techfak.uni-bielefeld.de/~cunger/qald/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">SemFacet: Semantic faceted search over YAGO</title>
		<author>
			<persName><forename type="first">M</forename><surname>Arenas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Grau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kharlamov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Š</forename><surname>Marciuška</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zheleznyakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jimenez-Ruiz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">World Wide Web Conf. Companion</title>
				<imprint>
			<publisher>WWW Steering Committee</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="123" to="126" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Expressive and scalable query-based faceted search over SPARQL endpoints</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ferré</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Int. Semantic Web Conf</title>
				<editor>
			<persName><forename type="first">P</forename><surname>Mika</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Tudorache</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">VisiNav: A system for visual search and navigation on web data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Harth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Web Semantics</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="348" to="354" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Facet graphs: Complex semantic querying made easy</title>
		<author>
			<persName><forename type="first">P</forename><surname>Heim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ertl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ziegler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Extended Semantic Web Conference</title>
				<editor>et al., L.A.</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">6088</biblScope>
			<biblScope unit="page" from="288" to="302" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Evaluating the usability of natural language query languages and interfaces to semantic web knowledge bases</title>
		<author>
			<persName><forename type="first">E</forename><surname>Kaufmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bernstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Web Semantics</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="377" to="393" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">PowerAqua: Supporting users in querying and exploring the semantic web</title>
		<author>
			<persName><forename type="first">V</forename><surname>Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Motta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Stieler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Semantic Web</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="249" to="265" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Exploratory search: from finding to understanding</title>
		<author>
			<persName><forename type="first">G</forename><surname>Marchionini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Communications of the ACM</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="41" to="46" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Dynamic taxonomies and faceted search</title>
	</analytic>
	<monogr>
		<title level="m">The information retrieval series</title>
				<editor>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Sacco</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Tzitzikas</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
