<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Formatting SPARQL 1.1 via Parsing Expression Grammar</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Hirokazu</forename><surname>Chiba</surname></persName>
							<email>chiba@dbcls.rois.ac.jp</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Database Center for Life Science</orgName>
								<orgName type="department" key="dep2">Joint Support-Center for Data Science Research</orgName>
								<orgName type="institution">Research Organization of Information and Systems</orgName>
								<address>
									<addrLine>178-4-4 Wakashiba</addrLine>
									<postCode>277-0871</postCode>
									<settlement>Kashiwa</settlement>
									<region>Chiba</region>
									<country key="JP">Japan</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Formatting SPARQL 1.1 via Parsing Expression Grammar</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">48266AED572EDA84FC4ECEFD1BC50F18</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:25+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>SPARQL</term>
					<term>parser</term>
					<term>formatter</term>
					<term>PEG</term>
					<term>grammar</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The core programming language of the Semantic Web is SPARQL. To increase the productivity of Semantic Web programming, the reusability of SPARQL queries is key. Here, we aim to implement a formatter that fully supports SPARQL 1.1 to enhance the reusability of SPARQL queries. First, the SPARQL 1.1 grammar defined in EBNF was transformed into Parsing Expression Grammar (PEG) to generate a SPARQL 1.1 parser. The parser accepts a valid SPARQL query and constructs its abstract syntax tree (AST). Next, a formatter was implemented to output a SPARQL query from this AST. The SPARQL formatter is available as a command-line tool and as a JavaScript library for developing websites. SPARQL ASTs are represented internally in JSON and can be reused for purposes other than formatting. Furthermore, the PEG rules developed here can be readily modified for similar subgraph matching problems on knowledge graphs. The source code and a website of the SPARQL formatter are available at https://github.com/sparqling/sparql-formatter.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The core of Semantic Web programming is SPARQL <ref type="bibr" target="#b0">[1]</ref>. To increase the productivity of Semantic Web programming, enhancing the reusability of SPARQL queries is key. In particular, readable code is essential for reusability. However, reformatting a SPARQL query is not trivial: the input query must first be parsed and then output in an appropriate layout. Parsing a SPARQL query is a complex task because the grammar in the SPARQL specification comprises 173 rules written in EBNF. Thus, it is difficult to implement a parser by hand in a conventional programming language.</p><p>Parser generators based on Parsing Expression Grammar (PEG) <ref type="bibr" target="#b1">[2]</ref> are available for several programming languages, including JavaScript; they generate a parser from a declarative description of a grammar. Here, we aim to implement a formatter that fully supports SPARQL 1.1 by using PEG, for both command-line use and website development.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Implementation</head><p>We used PEG.js (https://pegjs.org/) to generate a SPARQL 1.1 parser. The 173 rules expressed in EBNF were extracted from the SPARQL 1.1 Query Language specification <ref type="bibr" target="#b0">[1]</ref> and translated into PEG rules for constructing abstract syntax trees (ASTs). An excerpt from the EBNF rules of SPARQL and the corresponding PEG rules for constructing the AST is shown in Figure <ref type="figure">1</ref>. Notably, white space and comments in SPARQL are not included in the EBNF rules. As comments are often inserted in SPARQL queries in practice, our PEG rules express comments as well as white space. Comments are kept separate from the AST. The position of each element, measured from the beginning of the SPARQL input, is recorded in the values returned by the parser.</p><p>The formatter, implemented in JavaScript, accepts an AST of a SPARQL query and outputs each element of the query in a uniform layout. The position of each element is compared with the positions of the comments, so that the comments are added at the appropriate positions.</p><p>We extracted 91 test queries from the SPARQL 1.1 Query Language specification <ref type="bibr" target="#b0">[1]</ref> and 16 from the SPARQL 1.1 Update specification [3]. All of these queries were formatted and then manually curated for use as test cases.</p></div>
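<div xmlns="http://www.tei-c.org/ns/1.0"><p>The comparison of element positions with comment positions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the formatter's actual code: the function name mergeComments, the pos and text fields, and the sample offsets are all hypothetical and were introduced here for the example.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical data shapes: every formatted element and every comment
// carries `pos`, its character offset from the beginning of the input query.
function mergeComments(elements, comments) {
  // Work on a sorted copy so the caller's array is left untouched.
  const pending = [...comments].sort((a, b) => a.pos - b.pos);
  const lines = [];
  for (const el of elements) {
    // Emit every comment that started before this element in the input.
    while (pending.length > 0 && pending[0].pos < el.pos) {
      lines.push(pending.shift().text);
    }
    lines.push(el.text);
  }
  // Comments located after the last element are appended at the end.
  for (const c of pending) {
    lines.push(c.text);
  }
  return lines.join("\n");
}

// Offsets correspond to the input
// "# list triples\nSELECT * WHERE { ?s ?p ?o } # first ten\nLIMIT 10".
const elements = [
  { pos: 15, text: "SELECT *" },
  { pos: 24, text: "WHERE {" },
  { pos: 32, text: "  ?s ?p ?o" },
  { pos: 41, text: "}" },
  { pos: 55, text: "LIMIT 10" },
];
const comments = [
  { pos: 0, text: "# list triples" },
  { pos: 43, text: "# first ten" },
];

console.log(mergeComments(elements, comments));
```
]]></code></p>
<p>In this sketch each comment is placed on its own line immediately before the first element that follows it in the input. A real formatter may instead keep trailing comments on the same line as the preceding element; that choice is not specified here.</p></div>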
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Use Cases</head><p>The SPARQL formatter is published as an npm library, so it is easy to install in a Node.js environment and use as the sparql-formatter command. A Docker version is also available on Docker Hub.</p><p>The JavaScript code of the parser and formatter has been bundled and made available via a content delivery network (CDN), so that users can incorporate the SPARQL formatter into their websites. Figure <ref type="figure" target="#fig_1">2</ref> shows a website using the SPARQL formatter. All the queries extracted from the SPARQL 1.1 specifications are available as example queries on the website.</p><p>The AST of a SPARQL query is represented internally in JSON and can be reused for purposes other than formatting. Figure <ref type="figure">3</ref> shows the AST obtained for an example SPARQL query, "SELECT * WHERE { ?s ?p ?o } LIMIT 10".</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: An excerpt from the EBNF of SPARQL and the corresponding PEG rules</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2:</head><label>2</label><figDesc>Figure 2: A website of the SPARQL 1.1 formatter</figDesc><graphic coords="3,89.29,84.19,416.69,119.77" type="bitmap" /></figure>
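<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a sketch of reusing a JSON AST for a purpose other than formatting, the example below walks an AST for the query "SELECT * WHERE { ?s ?p ?o } LIMIT 10" and collects its variable names. The AST shape shown here (the keys type, projection, where, subject, predicate, object, name, pos, and limit) is an assumption made for illustration; the AST actually produced by the formatter's parser may be organized differently.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical AST shape, for illustration only. `pos` is the character
// offset of each variable in "SELECT * WHERE { ?s ?p ?o } LIMIT 10".
const ast = {
  type: "SelectQuery",
  projection: "*",
  where: [
    {
      type: "Triple",
      subject: { type: "Variable", name: "s", pos: 17 },
      predicate: { type: "Variable", name: "p", pos: 20 },
      object: { type: "Variable", name: "o", pos: 23 },
    },
  ],
  limit: 10,
};

// Recursively visit every array and object in the JSON tree and collect
// the names of all nodes tagged as variables.
function collectVariables(node, found = new Set()) {
  if (Array.isArray(node)) {
    for (const child of node) {
      collectVariables(child, found);
    }
  } else if (node !== null && typeof node === "object") {
    if (node.type === "Variable") {
      found.add(node.name);
    }
    for (const value of Object.values(node)) {
      collectVariables(value, found);
    }
  }
  return found;
}

console.log([...collectVariables(ast)]);
```
]]></code></p>
<p>Because the traversal relies only on generic JSON structure plus one type tag, the same pattern could be adapted to other analyses, such as listing the IRIs used in a query, once the real key names are known.</p></div>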
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Discussion</head><p>We implemented the SPARQL formatter by generating a SPARQL parser with PEG.js. This work contributes to the reusability of SPARQL queries: queries can be formatted in a uniform way, which improves their readability. While other efforts to implement SPARQL parsers have been made in projects such as RDFStore-js <ref type="bibr" target="#b2">[4]</ref> and SPARQL.js (https://github.com/RubenVerborgh/SPARQL.js), they could not format queries as intended. Notably, by adding a JSON-LD context to the AST obtained for a SPARQL query, the query can be viewed as RDF, so SPARQL queries themselves can be treated as RDF resources. This may contribute to the FAIRification <ref type="bibr" target="#b3">[5]</ref> of Semantic Web queries. The PEG.js code developed for SPARQL 1.1 can be modified for extended specifications of SPARQL [6] or for other subgraph matching problems on knowledge graphs [7].</p></div>
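<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of adding a JSON-LD context to an AST can be sketched as follows. This is only an illustration: the vocabulary IRI, the function name withJsonLdContext, and the sample AST keys are hypothetical, and the source does not specify which context would be used in practice.</p>
<p><code lang="javascript"><![CDATA[
```javascript
// Hypothetical vocabulary IRI; a real deployment would need a published one.
const HYPOTHETICAL_VOCAB = "http://example.org/sparql-ast#";

// Return a copy of the AST with a JSON-LD "@context" attached. With
// "@vocab", a JSON-LD processor expands keys that have no other term
// definition to IRIs of the form vocab + key.
function withJsonLdContext(ast, vocab = HYPOTHETICAL_VOCAB) {
  return { "@context": { "@vocab": vocab }, ...ast };
}

// Hypothetical, heavily simplified AST fragment.
const doc = withJsonLdContext({ queryForm: "SELECT", limit: 10 });
console.log(JSON.stringify(doc, null, 2));
```
]]></code></p>
<p>The function copies the AST rather than modifying it, so the same AST can still be passed to the formatter afterwards.</p></div>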
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">SPARQL 1.1 Query Language</title>
		<ptr target="https://www.w3.org/TR/sparql11-query/" />
	</analytic>
	<monogr>
		<title level="m" />
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Parsing Expression Grammars: a recognition-based syntactic foundation</title>
		<author>
			<persName><forename type="first">B</forename><surname>Ford</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages</title>
				<meeting>the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="111" to="122" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A JavaScript RDF store and application library for linked data client applications</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Hernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>García</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Devtracks of the WWW2012 conference</title>
				<meeting><address><addrLine>Lyon, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The FAIR Guiding Principles for scientific data management and stewardship</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Wilkinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dumontier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">J</forename><surname>Aalbersberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Appleton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Axton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Baak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Blomberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-W</forename><surname>Boiten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">B</forename><surname>Da Silva Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">E</forename><surname>Bourne</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientific Data</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
