<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">An interactive visualisation for RDF data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Fernando</forename><surname>Florenzano</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Denis</forename><surname>Parra</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Pontificia Universidad Católica de Chile</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Juan</forename><surname>Reutter</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Freddie</forename><surname>Venegas</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Pontificia Universidad Católica de Chile</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Center for Semantic Web Research</orgName>
								<address>
									<region>CL</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">An interactive visualisation for RDF data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">3572BDB116143624A88D7B9B2038547E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:12+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We demonstrate a visualisation that helps SPARQL-fluent users produce queries over a dataset they are not familiar with. The visualisation consists of a labelled graph whose nodes are the different types of entities in the RDF dataset, and where two types are related if entities of these types appear related in the RDF dataset. To avoid visual overload when the number of types in a dataset is too large, the graph groups together all types that are subclasses of a more general type, and users can navigate through this hierarchy of types, dividing type nodes into subtypes as they see fit. We illustrate our visualisation using the Linked Movie Database dataset, and also offer a visualisation of DBpedia.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Today one can reasonably state that querying an endpoint is an easy task, assuming of course that one is familiar with the SPARQL language and with the structure of the dataset made available by the endpoint. Unfortunately, this is a strong assumption: even if the endpoint is up and running, and even if the user is an expert in SPARQL, the task of producing a query that extracts the desired information may end up demanding more resources than those needed to actually compute the query. The problem is the unstructured nature of RDF: as there is no real notion of schema, knowing what RDF data is stored, and how, is not an easy task. In contrast with relational databases, where users can directly consult the schema for the names and attributes of tables, with RDF data one has almost no alternative but to understand the data by issuing several probe queries. This behaviour has been confirmed by analyses of the query logs of different endpoints <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b8">9]</ref>.</p><p>As an example of the challenges we face, consider an endpoint for the Linked Movie Database (LinkedMDB <ref type="bibr" target="#b2">[3]</ref>), a database storing information about movies: who starred in them, who directed them, when and where they were shot, etc. Consider now a SPARQL expert trying to obtain the names of all directors that have also acted in one of their movies. The landing page of our LinkedMDB endpoint of choice would probably look like a blank canvas, perhaps offering a few example queries as well; so, unless we are lucky and hit a relevant example, there are almost no clues on how to start writing the desired query. In our case, to answer this query we would need to know how actors and directors are stored in the database, how actors, directors and movies are connected, and how to recognise when a director and an actor are the same person.</p><p>It has been noted that users new to RDF datasets would benefit tremendously from a graphical interface that explicitly gives them the information they need to start producing meaningful SPARQL patterns (see e.g. <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b7">8]</ref>). But if we want ordinary endpoint users to take advantage of our interface, we need a lightweight system that can be added to endpoints without much computational overhead and that can scale to arbitrarily large datasets.</p><p>Related work. To the best of our knowledge, none of the available visualisations offers a summarised graph with the ability to navigate through the graph using the hierarchy of types. For example, systems such as LODSight <ref type="bibr" target="#b1">[2]</ref>, SPARQture <ref type="bibr" target="#b7">[8]</ref>, those working with VOWL <ref type="bibr" target="#b6">[7]</ref> or that of <ref type="bibr" target="#b5">[6]</ref> provide rather static visualisations, and systems such as LODPeas <ref type="bibr" target="#b3">[4]</ref> provide interactive visualisations but only at the entity level.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Overview of the System</head><p>Our main visualisation is a labelled, undirected graph whose nodes are the types of the dataset, and where there is a p-labelled edge between nodes t₁ and t₂ if and only if there are entities u₁ and u₂ such that u₁ is of type t₁, u₂ is of type t₂ and the dataset contains the triple {u₁ p u₂}. That is, two types are related to each other if there are entities of these types connected by a triple in the dataset. We complement the main graph with two extra panels: a panel on the left that gives users additional information about the semantic graph as a whole, and a panel on the right with information about the specific node or edge selected in the main pane. Figure <ref type="figure">1</ref> presents a screenshot of our visualisation applied to the LinkedMDB dataset. The central pane is the graph, the left-side panel displays the top 10 types ranked by the total number of entities of each type, and the right-side panel shows specific information about the node selected by the user (in this case, performance). In what follows we give a brief description of the main features of our visualisation.</p><p>Hierarchical navigation of types. Most of the type assignment in more complex RDF datasets is hierarchical. We take advantage of the forest-like shape of the RDF type hierarchy and include in our visualisation the ability to slice the type graph at different levels of this hierarchy. Figure <ref type="figure">2</ref>(a) shows the same visualisation of LinkedMDB, but with some nodes hidden and the emphasis on the Person node. In LinkedMDB there are several types that are subclasses of Person, and users interested in one of them can drill down on this node, subdividing Person into its subclasses, as shown in Figure <ref type="figure">2(b)</ref>.</p><p>Navigation and summarisation of relations. It is not unusual to find different relations amongst entities of the same types in an RDF dataset. For example, there are several relations between persons and films in LinkedMDB. As we did with nodes, we aggregate all such relationships into one set, and only show the more fine-grained relationships when the user requests them. This again prevents information overload, and allows users to explore only those relations that interest them.</p><p>Summarisation of attributes. An attribute of a certain entity u is a property that relates u with a string or value, such as performance actor, performance character, etc., in LinkedMDB, according to Figure <ref type="figure">1</ref>. Attributes are vital in asking questions such as What is the name of...? or In which year did...? However, not all entities of the same type have the same attributes, and it is difficult to show all the attributes of all types in the graph at the same time. For this reason, when users hover over or select a particular node in the graph, the right-side panel displays all the information about the attributes of entities of this type. Additionally, each attribute is shown with the percentage of entities of the chosen type that possess it.</p><p>System architecture. The force layout was constructed using the D3 library. To show the visualisation of RDF datasets we first preprocess them, using SPARQL, to generate a JSON file that contains only the needed information about the datasets. This file is mounted on a server, and clients using the interface may send server requests to query this file. This keeps the visualisation quite low on computational resources: the only computationally expensive part is the pre-computation, but this can be done at regular intervals.</p><p>Scalability and limitations. Of course, this way of visualising RDF data can only scale as long as one can find a strict hierarchy of types: a flat RDF graph in which no type is a subclass of another type will be visualised just like any other force-directed graph. Furthermore, even though the system aggregates information into types, and therefore scales perfectly well over increasingly large datasets, it does suffer from scalability issues when the ontology is too large: running the visualisation over the YAGO or DBpedia datasets (which have millions of types) does require computational power beyond that of a personal computer.</p></div>
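<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the definitions in this section, the following minimal Python sketch computes the type-level graph and the per-attribute coverage percentages from an in-memory list of triples. It is not the implementation of the system described above, which preprocesses the dataset with SPARQL into a JSON file. The function names build_type_graph and attribute_coverage, the ex: identifiers, and the convention that a literal is a string starting with a double quote are all assumptions made for this example.</p>
<p>
```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data, loosely modelled on the LinkedMDB example.
TRIPLES = [
    ("ex:perf1", RDF_TYPE, "ex:Performance"),
    ("ex:perf2", RDF_TYPE, "ex:Performance"),
    ("ex:film1", RDF_TYPE, "ex:Film"),
    ("ex:actor1", RDF_TYPE, "ex:Actor"),
    ("ex:film1", "ex:performance", "ex:perf1"),
    ("ex:actor1", "ex:performance", "ex:perf1"),
    ("ex:perf1", "ex:character", '"Neo"'),
]


def is_literal(term):
    """Assumed convention: literals are strings that start with a double quote."""
    return term.startswith('"')


def build_type_graph(triples):
    """Map each unordered pair of types to the set of predicates linking them."""
    # First pass: collect the types asserted for every entity.
    types_of = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types_of[s].add(o)

    # Second pass: a triple (u1, p, u2) adds a p-labelled edge between
    # every type of u1 and every type of u2.
    edges = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for t1 in types_of.get(s, ()):
            for t2 in types_of.get(o, ()):
                # Sort the pair because the graph is undirected.
                edges[tuple(sorted((t1, t2)))].add(p)
    return dict(edges)


def attribute_coverage(triples, type_name):
    """Percentage of entities of type_name that have each literal-valued property."""
    members = {s for s, p, o in triples if p == RDF_TYPE and o == type_name}
    having = defaultdict(set)
    for s, p, o in triples:
        if s in members and p != RDF_TYPE and is_literal(o):
            having[p].add(s)
    return {p: 100.0 * len(subjects) / len(members) for p, subjects in having.items()}


if __name__ == "__main__":
    print(build_type_graph(TRIPLES))
    print(attribute_coverage(TRIPLES, "ex:Performance"))
```
</p><p>In this toy dataset only one of the two performances has a character attribute, so the coverage reported for that attribute is 50 percent. The sketch links every type of the subject to every type of the object, which is one straightforward reading of the definition; it does not model the subclass hierarchy used for drill-down.</p></div>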
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Demonstration Overview</head><p>The demonstration will showcase the visualisations available on our live server at http://jreutter.sitios.ing.uc.cl/VisualRDF.html. We start with LinkedMDB, explaining the basic navigation options of our example, and then we show the scalability of the system by showing a portion of the YAGO database with more than a million types. We will also provide live access to endpoints of these datasets, and ask users to produce queries using only the visualisation as information about the dataset. The aim of the demo is to showcase the usefulness of these types of visualisations for producing SPARQL queries, and to show that navigating through type hierarchies is an important feature of these tools.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. The top-level visualisation of LinkedMDB's dataset</figDesc><graphic coords="3,134.77,115.83,348.71,201.07" type="bitmap" /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgements. Work funded by the Millennium Nucleus Center for Semantic Web Research under Grant NC120004.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A preliminary investigation into SPARQL query complexity and federation in Bio2RDF</title>
		<author>
			<persName><forename type="first">C</forename><surname>Buil-Aranda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ugarte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arenas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dumontier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AMW</title>
		<imprint>
			<biblScope unit="page">196</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Dataset summary visualization with LODSight</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dudáš</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mynarz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESWC</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="36" to="40" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Linked movie data base</title>
		<author>
			<persName><forename type="first">O</forename><surname>Hassanzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Consens</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>LDOW</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">LODPeas: Like peas in a LOD (cloud)</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Munoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Umbrich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Billion Triple Challenge</title>
				<meeting>the Billion Triple Challenge</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">ExpLOD: summary-based exploration of interlinking and RDF usage in the linked open data cloud</title>
		<author>
			<persName><forename type="first">S</forename><surname>Khatchadourian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Consens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESWC</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="272" to="287" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">An interactive map of semantic web ontology usage</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kinsella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Bojars</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Harth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">G</forename><surname>Breslin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Decker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">12th ICIV</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="179" to="184" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Visualizing ontologies with VOWL</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lohmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Negru</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ertl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Semantic Web</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1" to="21" />
		</imprint>
	</monogr>
	<note type="report_type">Preprint</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">SPARQture: a more welcoming entry to SPARQL endpoints</title>
		<author>
			<persName><forename type="first">F</forename><surname>Maali</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">IESD</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="78" to="82" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">LSQ: The linked SPARQL queries dataset</title>
		<author>
			<persName><forename type="first">M</forename><surname>Saleem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Mehmood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-C</forename><forename type="middle">N</forename><surname>Ngomo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISWC</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="261" to="269" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
