=Paper=
{{Paper
|id=Vol-2187/paper5
|storemode=property
|title=Combining Faceted Search with Data-analytic Visualizations on Top of a SPARQL Endpoint
|pdfUrl=https://ceur-ws.org/Vol-2187/paper5.pdf
|volume=Vol-2187
|authors=Petri Leskinen,Goki Miyakita,Mikko Koho,Eero Hyvönen
|dblpUrl=https://dblp.org/rec/conf/semweb/LeskinenMKH18
}}
==Combining Faceted Search with Data-analytic Visualizations on Top of a SPARQL Endpoint==
Petri Leskinen¹, Goki Miyakita², Mikko Koho¹, and Eero Hyvönen¹,³
¹ Semantic Computing Research Group (SeCo), Aalto University, Finland
² KMD Research Institute, Keio University, Japan
³ HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Finland
http://seco.cs.aalto.fi, http://heldig.fi
Abstract. This paper discusses practical experiences in creating data-analytic visualizations in a browser, on top of a SPARQL endpoint, based on the results of faceted search. Four use cases related to Digital Humanities research in prosopography are discussed in which the SPARQL Faceter tool was used and extended in different ways. The Faceter tool allows the user to select a group of people with shared properties, e.g., people with the same place of birth, gender, profession, or employer. The filtered data can then be visualized, e.g., as column charts, with business graphics, as Sankey diagrams, or on a map. The use cases examine the potential of visualization as well as automated knowledge discovery in Digital Humanities research.
Keywords: Linked Data, Visualization, Biography, Prosopography, Knowledge Discovery
1 Client-side Faceted Search on a SPARQL Endpoint
Faceted search and browsing [5,21], also known as view-based search [19] and dynamic hierarchies [20], has become the norm in web applications. The idea is to index data items along orthogonal category hierarchies, i.e., facets⁴ (e.g., places, times, document types, etc.), and to use them for searching and browsing: the user selects categories on facets in free order, and the data items included in the selected categories are considered the search results. After each selection, a count is computed for each category, showing the number of results the user would get by selecting that category next. In this way, search is guided so that annoying "no hits" results are avoided. Moreover, hit distributions on facets provide the end user with data-analytic views on what kinds of items there are in the underlying database. Faceted search is especially useful on the Semantic Web, where hierarchical ontologies used for data annotation provide a natural basis for facets, and reasoning can be used for mapping heterogeneous data to facets [8]. The idea of combining faceted search and visualizations has been applied, e.g., in ePistolarium⁵. However, unlike our applications [10,11,16,18], ePistolarium is not based on Linked Data.
⁴ The idea of facets dates back to the Colon Classification system of S. R. Ranganathan in library science, published in 1933.
⁵ http://ckcc.huygens.knaw.nl/epistolarium/
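The hit counting described above can be illustrated with a minimal in-memory sketch. The records, facet names, and function names below are hypothetical and chosen only for illustration. The sketch is also simplified: it counts values only for facets that have no selection yet, whereas a full implementation would typically also recount the facets already in use.

```python
from collections import Counter

# Hypothetical toy records; the facet names and values are illustrative only.
PEOPLE = [
    {"name": "A", "gender": "female", "profession": "teacher", "birthplace": "Helsinki"},
    {"name": "B", "gender": "male", "profession": "teacher", "birthplace": "Turku"},
    {"name": "C", "gender": "male", "profession": "officer", "birthplace": "Helsinki"},
    {"name": "D", "gender": "female", "profession": "officer", "birthplace": "Helsinki"},
]
FACETS = ("gender", "profession", "birthplace")


def search(items, selections):
    """Return the items that match every selected facet value."""
    return [it for it in items if all(it[f] == v for f, v in selections.items())]


def facet_counts(items, selections):
    """For each facet without a selection, count the hits each value would yield."""
    results = search(items, selections)
    return {f: Counter(it[f] for it in results) for f in FACETS if f not in selections}
```

Calling `facet_counts(PEOPLE, {"birthplace": "Helsinki"})` only lists values that occur in the current result set, so a category that would lead to an empty result is never offered to the user.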
Faceted search can be implemented with server-side solutions, such as Solr⁶, Sphinx⁷, and Elasticsearch⁸, and with higher-level tools, such as VuFind⁹. However, there is a lack of lightweight client-side faceted search tools or components that are able to search large datasets directly from a SPARQL endpoint. Such a tool is useful because it can be used easily on virtually any open SPARQL endpoint on the web, without any need for server-side programming or access rights. This paper presents such a tool, SPARQL Faceter, a web component for implementing faceted search applications efficiently in a browser, based only on a standard SPARQL API. We extend our earlier short paper on the tool [14] by 1) showing in detail how the tool is used and how it works, 2) explaining the novelties in its latest version, and 3) especially showing how the tool and faceted search can be extended with different kinds of data-analytic visualizations.
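To make "based only on a standard SPARQL API" concrete, the following sketch shows what a client needs in order to talk to an endpoint under the SPARQL 1.1 Protocol: a GET request carrying the query in the `query` parameter, and a parser for the standard JSON results format. It is written in Python for illustration rather than in the tool's own AngularJS code; the endpoint URL, the query, and the canned response are hypothetical placeholders so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(response_text):
    """Flatten a SPARQL JSON results document into plain dicts."""
    doc = json.loads(response_text)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Hypothetical endpoint URL and a canned response, so the sketch runs offline.
ENDPOINT = "https://example.org/sparql"
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/person/1"}}
    ]},
})

request = build_request(ENDPOINT, QUERY)
rows = bindings_to_rows(SAMPLE_RESPONSE)
```

Against a live endpoint, the request object would be passed to `urllib.request.urlopen` and the response body handed to `bindings_to_rows`; no server-side component beyond the endpoint itself is involved.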
As a proof of concept, four use case studies of data visualization are discussed from a SPARQL Faceter perspective: 1) WarSampo, using cultural heritage materials of World War II in Finland [10]; 2) Norssit, on top of Finnish high school alumni registry data [11]; 3) Semantic National Biography of Finland, based on the National Biography of the Finnish Literature Society [16]; and 4) U.S. Congress Prosopographer, utilizing biographical records of U.S. Congress legislators [18]. These cases support the following two-step prosopographical research method [22, p. 47], where the goal is to find out some kind of commonness or average in selected target groups of people. First, a target group of people is selected who share the characteristics needed for solving the research question at hand. Second, the target group is analyzed, and possibly compared with other groups, in order to solve the research question. Faceted search is used for finding the target groups, and visualizations are then created in order to analyze their characteristics.
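The two-step method can be sketched in a few lines: step one selects a target group by shared characteristics, and step two computes a summary statistic that can be compared across groups. The records, field names, and the choice of average age as the statistic are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Hypothetical records; the values are made up for illustration.
PEOPLE = [
    {"profession": "teacher", "born": 1850, "died": 1920},
    {"profession": "teacher", "born": 1860, "died": 1940},
    {"profession": "officer", "born": 1870, "died": 1918},
    {"profession": "officer", "born": 1880, "died": 1944},
]


def select_group(people, **criteria):
    """Step 1: pick the people who share all the given characteristics."""
    return [p for p in people if all(p[k] == v for k, v in criteria.items())]


def average_age(group):
    """Step 2: compute one summary statistic for the selected group."""
    return mean(p["died"] - p["born"] for p in group)


teachers = select_group(PEOPLE, profession="teacher")
officers = select_group(PEOPLE, profession="officer")
comparison = {"teacher": average_age(teachers), "officer": average_age(officers)}
```

In the applications discussed here, the selection step corresponds to the facet selections and the analysis step to the visualizations built on the filtered result set.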
The rest of the paper is organized as follows. First, the characteristics of SPARQL Faceter are explained, with a focus on how it is used in practice in applications. After this, the focus is on extending the tool with visualizations. In conclusion, lessons learned and directions for further research are discussed.
2 Using and Extending SPARQL Faceter
SPARQL Faceter uses AngularJS¹⁰ as its implementation framework. The GitHub page¹¹ gives instructions on how to install it and how to define an application with facets of the desired types in the source code. The page provides demo examples with queries to the DBpedia and WarSampo databases. The developer can adapt the tool to any Linked Data publication by configuring the endpoint, the property paths for facets, and the queries. SPARQL Faceter is documented in detail¹².
⁶ http://lucene.apache.org/solr/
⁷ http://sphinxsearch.com/blog/2013/06/21/faceted-search-with-sphinx/
⁸ https://www.elastic.co/
⁹ https://vufind.org/
¹⁰ https://angularjs.org/
¹¹ https://github.com/SemanticComputing/angular-semantic-faceted-search
¹² http://semanticcomputing.github.io/angular-semantic-faceted-search/#/api
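As a rough illustration of how an endpoint, a class, and per-facet property paths can drive query generation, the sketch below builds a hit-count query for one facet under the current selections. The configuration layout, the endpoint URL, and the property paths are assumptions made for this example; they are not SPARQL Faceter's actual configuration format, which is described in the documentation cited above. The generated text also omits PREFIX declarations, which would have to be prepended before sending it to an endpoint.

```python
# Hypothetical configuration: the endpoint URL, class, and property paths are
# placeholders, and this dict layout is not SPARQL Faceter's actual format.
CONFIG = {
    "endpoint": "https://example.org/sparql",
    "rdf_class": "foaf:Person",
    "facets": {
        "gender": "schema:gender",
        "birthplace": "schema:birthPlace/skos:prefLabel",
    },
}


def facet_count_query(config, facet_id, selections):
    """Build a query that counts hits per value of one facet.

    `selections` maps other facet ids to already chosen values,
    written as SPARQL terms (IRIs or literals).
    """
    lines = [
        "SELECT ?value (COUNT(DISTINCT ?id) AS ?count) WHERE {",
        f"  ?id a {config['rdf_class']} .",
    ]
    # Constraints from the facets the user has already selected.
    for other, value in selections.items():
        lines.append(f"  ?id {config['facets'][other]} {value} .")
    # The facet whose value distribution is being counted.
    lines.append(f"  ?id {config['facets'][facet_id]} ?value .")
    lines.append("} GROUP BY ?value ORDER BY DESC(?count)")
    return "\n".join(lines)


query = facet_count_query(CONFIG, "gender", {"birthplace": '"Helsinki"@fi'})
```

Keeping the facets as plain property paths means that adapting the sketch to another dataset only requires editing `CONFIG`, which mirrors the configuration-driven approach described in the text.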
PREFIX xsd:
PREFIX schema:
PREFIX skos:
PREFIX skosxl:
PREFIX nbf:
PREFIX crm:
PREFIX foaf:
PREFIX gvp:
SELECT DISTINCT (?id AS ?id__uri) ?id__name ?value WHERE {
  # Constraints set in Faceter
  { ?id a nbf:PersonConcept ;
        foaf:focus/^crm:P98_brought_into_life/
          nbf:time/gvp:estStart ?slider_2 .
    FILTER (1800<=year(?slider_2) && year(?slider_2)<=2018)
  }
  # Query person's age
  ?id foaf:focus/^crm:P100_was_death_of/nbf:time
        [ gvp:estStart ?time ; gvp:estEnd ?time2 ] ;
      foaf:focus/^crm:P98_brought_into_life/nbf:time
        [ gvp:estStart ?birth ; gvp:estEnd ?birth2 ] .
  BIND (xsd:integer(0.5*
        (year(?time)+year(?time2)-year(?birth)-year(?birth2)))
        AS ?value)
  # Filter out erroneous cases
FILTER (-1