=Paper= {{Paper |id=None |storemode=property |title=Konduit VQB: a Visual Query Builder for SPARQL on the Social Semantic Desktop |pdfUrl=https://ceur-ws.org/Vol-565/paper4.pdf |volume=Vol-565 }} ==Konduit VQB: a Visual Query Builder for SPARQL on the Social Semantic Desktop== https://ceur-ws.org/Vol-565/paper4.pdf
           Konduit VQB: a Visual Query Builder for SPARQL
                   on the Social Semantic Desktop
               Oszkár Ambrus                Knud Möller               Siegfried Handschuh
            oszkar.ambrus@deri.org      knud.moeller@deri.org        siegfried.handschuh@deri.org
                              Digital Enterprise Research Institute (DERI)
                            National University of Ireland, Galway (NUIG)

Workshop on Visual Interfaces to the Social and Semantic Web (VISSW2010),
IUI2010, Feb 7, 2010, Hong Kong, China. Copyright is held by the
author/owner(s).

ABSTRACT
With the adoption of Nepomuk as an organic part of KDE, the semantic desktop
has become a reality for a great number of users and is employed by a growing
number of applications. Thus, the amount of semantic data on the desktop is
constantly growing. Users therefore need a way to access this data outside of
the limiting use cases of the applications employing Nepomuk-KDE.

We aim to assist users in building queries and running them to make use of RDF
data that would otherwise be partially or completely hidden. In this paper, as
an initial iteration of our efforts, we present four approaches to building
SPARQL queries visually, based on two different categorizations: schema-based
vs. instance-based and SELECT vs. CONSTRUCT queries. We present the interfaces,
visual languages and query generation methods associated with each of the
approaches, as well as the autocompletion techniques for the instance-based
query builders.

Author Keywords
Visual Query Builder, SPARQL, Nepomuk

INTRODUCTION
The Social Semantic Desktop [4] is a paradigm transposing Semantic Web concepts
onto the desktop. Ontologies thus conceptualize information and semantic data
is stored in RDF. It loosens the borders between applications and provides a
unified environment. The Nepomuk project [6] outlines the requirements and
functionalities of the Social Semantic Desktop and defines an architecture
specification that fulfills these requirements. Nepomuk-KDE
(http://nepomuk.kde.org/) is a reference implementation of Nepomuk. It provides
a platform to create and handle all kinds of metadata. It uses RDF stores for
the metadata persistence and provides a middleware for applications to build
upon, allowing them to store and access the semantic data on the desktop (or,
alternatively, the Web).

Konduit [10] is a tool for building visual workflows for RDF data within
Nepomuk-KDE, allowing for flexible access to the local RDF data as well as
mashing it up with web-based data. It features a visual programming environment
and allows for various manipulations (merging, filtering, mashing up, creating
visual workflows, etc.) as well as executing different actions (executing
scripts, automating emails, etc.) using the queried RDF data. A query builder
is used to generate SPARQL queries for querying components, which act as data
sources in the RDF workflows, producing data that is used further in the
workflow.

We want to provide a way for users to build these queries intuitively, with
little or no knowledge of the query language (i.e., SPARQL [11]). Although this
does not mean a complete abstraction from the underlying details, we aim to
assist users with limited technical knowledge as well as those who know SPARQL
and RDF (with an emphasis on the latter, though) and provide an interface that
suits the needs of both. We try to provide a tool with the necessary features
that similar tools provide, that also supports RDF, provides search assistance
on a whole repository (as opposed to a single ontology) and is also integrated
into Nepomuk-KDE (within Konduit and beyond), allowing for local data-driven
querying as well as sharing queries online.

We explore several approaches due to the structured nature of RDF data and the
difficulty of searching and querying it in an intuitive and transparent manner.
We present, as a first attempt in our research on visual query builders, four
interfaces aiming to achieve the above goal, with different approaches to query
building and varying degrees of complexity. The first two are schema-based,
allowing for building queries using restrictions based on the ontology
structure. The other two use a triple construction-based approach; the user
constructs the restricting triples of the SPARQL query, assisted by suggestions
using both schema and instance information from the underlying repository.

RELATED WORK
There are a number of tools that aim to assist users in building queries for
semantic data. Many of them provide novel and intuitive approaches and
demonstrate useful features, such as NITELIGHT [12] or RDF-GL [7], which aim to
represent SPARQL constructs through graphical metaphors. MashQL [8], GRQL [1]
and GLOO [5] propose queries as trees starting from a given class and
restricting it incrementally
on the branches. SPARQLViz [2] provides a click-through
wizard for composing queries and SEWASIE [3] features a
limited ontology-based query formulation.

Nevertheless, several of the tools only support querying based
on a single ontology, and some of them do not support RDF and
SPARQL. Some of them require extensive manual editing
of the queries, or do not feature clear relationships between
query parts. Moreover, except for SPARQLViz, all query
builders are web-based (or only usable within their own
system), not allowing for the integration of desktop data.

Our two schema-based interfaces are mostly built on the in-
tuition of MashQL and GRQL, in exposing schema struc-
tures and possible restrictions branching from an initial class.
The instance-based query builders resemble SPARQLViz in
providing forms for the user to complete, but feature a single,
simpler and less confusing interface, with clear connections
between the query parts.

RUNNING EXAMPLE
Suppose we are searching for a contact from the local repository whose name
contains the letter 'K' and who has written a publication on the semantic
desktop.

QUERY BUILDING
The interface of the query builder application features a central part for the
visual query builder and previewers for the query and its results, as well as
menu actions and a status bar. The central query builder has four incarnations,
as presented in the following sections, employing different approaches to
building SPARQL queries.

The most common search interface, a search box for simple keyword searches to
retrieve semantic information, is highly ambiguous and needs extensive
research, so abstracting completely from the structure of semantic data is not
yet our intention. Also, we cannot yet provide a fully featured, mature
semantic querying application, as we aim to explore the most appropriate ways
to do it and research the possibilities and limitations in accomplishing this
task.

The two schema-based approaches use schema information from the ontologies in
the Nepomuk system, allowing users to compose tree-based queries by restricting
the properties of the resulting objects. This allows users to explore the local
schema structures. These approaches feature a simplification of SPARQL, in
allowing only to restrict the initial selection descending in a tree-like
fashion.

The instance-based approaches are based on constructing triples without
necessarily knowing RDF or SPARQL (similarly to the Wikipedia Visual Query
Builder, http://dl-learner.org/Projects/dbpedia). They use schema information
as well as instance information in suggesting to users possibilities for
completing the subjects, predicates and objects of the constraining triples of
the SPARQL query. This autocompletion allows users to explore the data stored
in the local RDF repository. The instance-based query builders also allow for
the construction of triples that don't exist in the repository, adding
properties that have not been defined or converting an instance from one
ontology to another, e.g., transforming ?v foaf:name "Smith" to
?v nco:fullName "Smith".

Figure 1. Schema-based SELECT query builder.

SCHEMA-BASED SELECT QUERIES
The schema-based SELECT query builder (Figure 1) is the most user-friendly
approach to building SPARQL queries, aimed at users having the least knowledge
of semantic technologies. We have devised a simplification of the SPARQL
language, which allows for this particular kind of builder. It allows for
selecting a class (which will be the type of the queried variable) and
restricting it in a tree-like manner through its properties.

Queries are built using schema information from Nepomuk. The possible classes
and predicates to be chosen are queried and presented to the user.

We start from the assumption that users most often want to find certain
information belonging to an entity of certain classification and/or having a
number of known restricting characteristics. For example, one would want to
search for a contact person (entity) with a given name (restricting
characteristic), similarly to doing a free text search.

The interface therefore provides a way to build queries as trees, starting with
the type of entity the user inquires for and progressing with restrictions on
the branches of the tree.

The Visual Language
The visual language covers a subset of SPARQL. Listing 1 provides a formal
description of the queries built through the visual facilities of the
interface. Note that the missing terminal definitions ClassName and Predicate
are IRI references, LiteralValue is a string literal and VariableName is a
SPARQL variable (such as ?v or $x). Relation denotes
a relation such as contains, equals, etc.
Query              ::= Outputs Conditions
Outputs            ::= RootNode | LiteralNode
Conditions         ::= GraphPattern
GraphPattern       ::= QueryTree+
QueryTree          ::= RootNode TreeNumber
RootNode           ::= ClassName VariableName Restrictions
Restrictions       ::= QueryNode*
QueryNode          ::= ClassRestriction | LiteralRestriction
ClassRestriction   ::= Predicate RootNode
LiteralRestriction ::= Predicate Relation LiteralNode
LiteralNode        ::= VariableName LiteralValue


  Listing 1. EBNF description of the visual language (non-terminals)



Figure 2. Schema-based CONSTRUCT query builder.

Query Generation
Queries are generated based on the visual description according to the defined
language. The SELECT part of the queries will be a set of variables extracted
from the components (class combo boxes or literal text boxes) selected as
Outputs. It is made up of the variable names in the RootNodes and LiteralNodes,
for class combo boxes or literal text boxes, respectively. This happens in a
transparent way, as variables are extracted automatically from the selected
components.
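The generation rules of this section (the SELECT list of output variables, and the three WHERE-clause cases for root nodes, class restrictions and literal restrictions) can be illustrated with a minimal Python sketch. It is not the Konduit implementation: the class and function names, the mapping of the relations contains and equals to regular expressions, and the class ex:Publication are assumptions made for illustration, and prefix declarations are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Union

REGEX_SPECIALS = set("\\.^$*+?()[]{}|")


def regex_literal(value: str, relation: str) -> str:
    """Build the quoted regex argument for a relation (mapping is assumed)."""
    pattern = "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in value)
    if relation == "equals":
        pattern = "^" + pattern + "$"
    elif relation != "contains":
        raise ValueError(f"unsupported relation: {relation}")
    # Escape backslashes and quotes for the SPARQL string literal itself.
    escaped = pattern.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


@dataclass
class LiteralRestriction:
    predicate: str
    relation: str
    variable: str
    value: str
    is_output: bool = False


@dataclass
class RootNode:
    class_name: str
    variable: str
    restrictions: List["QueryNode"] = field(default_factory=list)
    is_output: bool = False


@dataclass
class ClassRestriction:
    predicate: str
    node: RootNode


QueryNode = Union[ClassRestriction, LiteralRestriction]


def _emit(node: RootNode, outputs: List[str], lines: List[str]) -> None:
    if node.is_output:
        outputs.append(node.variable)
    # Case 1: type triple for the root node.
    lines.append(f"{node.variable} a {node.class_name} .")
    for r in node.restrictions:
        if isinstance(r, ClassRestriction):
            # Case 2: link to the variable of the nested root node, then recurse.
            lines.append(f"{node.variable} {r.predicate} {r.node.variable} .")
            _emit(r.node, outputs, lines)
        else:
            # Case 3: bind the literal variable and filter it with a regex.
            if r.is_output:
                outputs.append(r.variable)
            lines.append(f"{node.variable} {r.predicate} {r.variable} .")
            lines.append(
                f"FILTER regex({r.variable}, "
                f"{regex_literal(r.value, r.relation)}, 'i')"
            )


def generate_select(trees: List[RootNode]) -> str:
    outputs: List[str] = []
    lines: List[str] = []
    for tree in trees:
        _emit(tree, outputs, lines)
    body = "\n".join("  " + line for line in lines)
    return f"SELECT {' '.join(outputs) or '*'}\nWHERE {{\n{body}\n}}"


# Running example: a person whose name contains 'K' and who has a publication.
person = RootNode("foaf:Person", "?v41", is_output=True, restrictions=[
    LiteralRestriction("foaf:name", "contains", "?v59", "K"),
    ClassRestriction("foaf:publications", RootNode("ex:Publication", "?v77")),
])
query = generate_select([person])
print(query)
```

The sketch also emits a type triple for nested root nodes, which is one reading of case (1); the variable names are taken from the examples given for Figure 1.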
The WHERE clause is a set of RDF triples generated from the query tree
structures present in the query form. We have the following three cases: (1)
For the root RootNode the generated triple is VariableName a ClassName, and for
every restriction a triple is generated starting with VariableName and
continuing as presented in the following points (e.g. ?v41 a foaf:Person
generated for the root node in Figure 1). (2) A ClassRestriction completes the
parent's triple with Predicate VariableName (where VariableName is the variable
of the RootNode belonging to the new restriction) (e.g.
?v41 foaf:publications ?v77 generated for the publication restriction in
Figure 1). (3) A LiteralNode completes the triple with Predicate VariableName,
adding a regular expression filter string according to the chosen Relation and
LiteralValue (e.g. FILTER regex(?v59, 'K', 'i') generated for the first
restriction shown in Figure 1).

User Interface
The query builder form (Figure 1) allows for adding several query trees
(starting from different classes), and restricting them by their properties. If
a property has a literal range, the user can enter a value and restrict it on a
relation, such as equals or contains. If its range is not a literal, they can
add restrictions.

Outputs are selected by right-clicking on a combo box representing a class or a
value. The corresponding variable will be added to the output, and the combo
box will be highlighted.

SCHEMA-BASED CONSTRUCT QUERIES
The approach to building schema-based CONSTRUCT queries is very similar to the
one shown in the previous section. The difference lies in the way in which the
outputs are selected.

Visual Language Extension
The visual language is extended, as the output part of the queries is built
using graph patterns constructed from triples.

Outputs      ::= GraphPattern
GraphPattern ::= Triple+
Triple       ::= Subject Predicate Object
Subject      ::= VariableName
Predicate    ::= PredicateName | VariableName | ClassName
Object       ::= PredicateName | VariableName | ClassName
               | LiteralValue

  Listing 2. EBNF description of the output part (non-terminals)

Query Generation Extension
The way outputs are generated has been changed for this query builder: in this
case outputs are triples and consist of the triples described in the graph
pattern.

User Interface Extension
The user interface is extended with a component for composing output triples,
as shown in Figure 2. It lists all variables for the subject field, all
variables, predicates and classes for the predicate field and all variables,
predicates, classes and literal values for the object field. Users can select
desired triples and add them to the output list.

INSTANCE-BASED SELECT QUERIES
Building queries with the instance-based SELECT builder relies on schema and
instance information from the underlying RDF repository.

The Visual Language
Variable, class and predicate names are IRIs describing the corresponding
entities, as described for the previous builders, where the meaning of Relation
is also explained. Instance IRIs are taken from the repository as
autocompletion popups based on user input. The LiteralValue represents valid
SPARQL literals, such as strings or integers (e.g. "Example").
See Listing 3 for the formal EBNF description.
Query            ::= Outputs Conditions
Outputs          ::= VariableName+
Conditions       ::= GraphPattern
GraphPattern     ::= Triple+
Triple           ::= Subject Predicate Object
Subject          ::= VariableName | ClassName | InstanceIRI
Predicate        ::= VariableName | PredicateName
Object           ::= VariableName | ClassName | InstanceIRI
                   | Literal | FilterExpression
Literal          ::= LiteralValue { DataType }?
FilterExpression ::= Relation LiteralValue


Listing 3.    Visual language of the instance-based SELECT query
builder


Figure 3. Instance-based SELECT query builder. Resulting query is identical to
the one shown in Fig. 1.

Query Generation
Queries are constructed by enumerating the variable names for the SELECT part
and taking the list of triples for the WHERE part. Filter expressions are built
by adding a regular expression filter string according to the chosen relation.
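As a rough illustration of this step, the following Python sketch turns a list of triples in the shape of Listing 3 into a SELECT query. It is a sketch under stated assumptions rather than the tool's actual code: the type names, the fresh variable introduced for each filter expression (?filter1, ?filter2, ...), the relation-to-regex mapping and the property ex:age are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional, Union

REGEX_SPECIALS = set("\\.^$*+?()[]{}|")


@dataclass
class Literal:
    value: str
    datatype: Optional[str] = None  # e.g. "xsd:integer"


@dataclass
class FilterExpression:
    relation: str  # "contains" or "equals" (assumed set of relations)
    value: str


@dataclass
class Triple:
    subject: str    # variable, class name or instance IRI, in SPARQL syntax
    predicate: str  # variable or predicate name
    obj: Union[str, Literal, FilterExpression]


def quote(text: str) -> str:
    """Quote a Python string as a SPARQL string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def regex_string(f: FilterExpression) -> str:
    pattern = "".join("\\" + c if c in REGEX_SPECIALS else c for c in f.value)
    if f.relation == "equals":
        pattern = "^" + pattern + "$"
    elif f.relation != "contains":
        raise ValueError(f"unsupported relation: {f.relation}")
    return quote(pattern)


def build_select(outputs: List[str], triples: List[Triple]) -> str:
    fresh = count(1)
    lines = []
    for t in triples:
        if isinstance(t.obj, FilterExpression):
            # Assumption: the filtered value is bound to a fresh variable.
            var = f"?filter{next(fresh)}"
            lines.append(f"{t.subject} {t.predicate} {var} .")
            lines.append(f'FILTER regex({var}, {regex_string(t.obj)}, "i")')
        elif isinstance(t.obj, Literal):
            suffix = f"^^{t.obj.datatype}" if t.obj.datatype else ""
            lines.append(
                f"{t.subject} {t.predicate} {quote(t.obj.value)}{suffix} ."
            )
        else:
            lines.append(f"{t.subject} {t.predicate} {t.obj} .")
    body = "\n".join("  " + line for line in lines)
    return f"SELECT {' '.join(outputs)}\nWHERE {{\n{body}\n}}"


triples = [
    Triple("?v41", "a", "foaf:Person"),
    Triple("?v41", "foaf:name", FilterExpression("contains", "K")),
    Triple("?v41", "ex:age", Literal("42", "xsd:integer")),
]
query = build_select(["?v41"], triples)
print(query)
```

A real builder would also have to make sure the generated filter variables do not clash with user-defined ones; the sketch ignores this.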

Figure 4. Instance-based CONSTRUCT query builder: output part.

Autocompletion
The interface features incremental autocompletion for the WHERE part based on
user input, as described in [9], because listing all the instances is
infeasible. Whenever the user types something into any of the text boxes, the
system pops up all the possible options of RDF entity names, classes,
properties or instance identifiers and values. This helps the user in exploring
the underlying data set.
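One way to picture the lookup behind these popups is as a query that is regenerated on every keystroke: it keeps the triple patterns entered so far and adds a filter on the slot being completed. The Python sketch below only builds such a query string. The function name, the ?candidate slot variable and the LIMIT/OFFSET paging used to model the incremental growth of the list are assumptions for illustration, not the mechanism described in [9].

```python
from typing import List

REGEX_SPECIALS = set("\\.^$*+?()[]{}|")


def autocomplete_query(
    typed: str,
    pattern_so_far: List[str],
    slot: str = "?candidate",
    limit: int = 20,
    offset: int = 0,
) -> str:
    """Build a query for values of `slot` that contain the typed text.

    `pattern_so_far` holds the triple patterns entered so far, with the
    position being completed written as the `slot` variable, so that only
    candidates compatible with the rest of the pattern are suggested.
    """
    pattern = "".join("\\" + c if c in REGEX_SPECIALS else c for c in typed)
    literal = '"' + pattern.replace("\\", "\\\\").replace('"', '\\"') + '"'
    lines = list(pattern_so_far)
    # str() lets the regex match IRIs as well as literal values.
    lines.append(f'FILTER regex(str({slot}), {literal}, "i")')
    body = "\n".join("  " + line for line in lines)
    return (
        f"SELECT DISTINCT {slot}\nWHERE {{\n{body}\n}}\n"
        f"LIMIT {limit} OFFSET {offset}"
    )


# The user has typed "nam" into the predicate box of the second triple.
query = autocomplete_query(
    "nam",
    ["?v41 a foaf:Person .", "?v41 ?candidate ?value ."],
)
print(query)
```

Calling the function again with a larger offset would fetch the next batch of suggestions for the same input.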

Autocompletion is achieved by running an incremental query every time the user
enters text. The system queries for all entity names (classes, predicates) as
well as all instance identifiers or values that contain the user input and
match the graph pattern constructed so far (for the final query to have
results). The user is then presented with the list of possible options, this
list incrementally growing as more matches are found from the RDF repository.

User Interface
The user interface features a component for building conditional (WHERE)
triples, as shown in Figure 3. The text fields provide autocompletion popups
for user input, based on what the repository contains, against the triples that
have already been added to the output list. The user can select a filter option
using the desired relation or can specify the type of the object, the latter
defining the way it will be formatted and/or suffixed in the output.

Outputs are selected from an output list that is populated with all the
variables occurring in the conditional triples.

INSTANCE-BASED CONSTRUCT QUERIES
Visual Language Extension
The visual language is extended with triple patterns for the output part as
well. They are identical to the GraphPattern nonterminal defined for
instance-based SELECT queries. There is one exception, namely the lack of
FilterExpressions, since the output (CONSTRUCT template) part of a SPARQL query
consists only of triple patterns and cannot contain filters.

Query Generation Extension
The query generation is extended by taking the triples from the Output graph
pattern and enclosing them in the CONSTRUCT part of the query.

Autocompletion Extension
This interface also supports autocompletion for the output part of the query,
similarly to its sibling interface. There is a slight modification, however,
since the output triples' values do not need to satisfy any conditions, only
being present in forming the output triples. Thus, they are not required to
comply with the rest of the graph pattern, so all existing entities that match
the given input are suggested to the user.

User Interface Extension
The user interface is extended with an output construction part, shown in
Fig. 4, having an almost identical structure to the triple building component
for the conditional (WHERE) part, excluding the filtering option.

DISCUSSION
The schema-based SELECT query builder allows for simple querying and
restricting the desired properties with user-defined input. The advantage is
that it is simple, intuitive and satisfies the large number of occasions when
the user wants to search for something based on certain properties. Selecting
the outputs is also straightforward and clear. The disadvantage is that it is
limiting and inflexible, only allowing querying as trees, thus being unsuitable
for some cases a
proficient user would meet.

The schema-based CONSTRUCT query builder allows the user to construct triples,
this being required in many cases (in Konduit, all data are RDF triples). The
advantages/disadvantages are similar to its sibling approach, adding the
complication of selecting the correct variables and classes for the output
triples, but adding the flexibility of formatting the output.

For the instance-based SELECT query builder we have the advantages of
conditioning the results in a flexible, data-driven manner (with autocompletion
based on the data in the repository), using user-defined variables and types,
and reusing variables. The disadvantage is that it features a direct
correspondence to the underlying RDF structure, making it more complicated than
the schema-based interfaces.

The instance-based CONSTRUCT query builder is the most flexible, triple-based
query assistant, making it possible to compose advanced queries, with the
obvious disadvantage of being the least accessible to naïve users.

CONCLUSIONS
We have presented four techniques to assist users in building SPARQL queries to
retrieve information from the ever-growing collection of semantic data. Aimed
at beginners and proficient users as well, the interfaces feature a range of
approaches that ease the composition of queries. The query builders are based
on the local (or possibly, remote) repository, facilitating the discovery of
the RDF store.

The first two approaches rely solely on schema information, helping the users
to query for instances of existing classes and restricting them with the
available properties. One of these is intended for writing SELECT queries by
selecting a set of outputs; the other one is for CONSTRUCT queries, presenting
the used variables, predicates and classes in three lists for selecting the
subject, predicate and object.

The other two approaches use instance information as well in providing
autocompletion popups based on user input to suggest possible options, taking
into consideration the triples previously added. The SELECT query builder
simply allows for selecting the variables used in the query for output, and the
CONSTRUCT query builder allows for composing triples from the variables used
and all existing entities in the repository.

We plan to perform a usability evaluation to determine the most appropriate
tool from the ones presented for building SPARQL queries within Konduit and
Nepomuk-KDE. We will present it to users with no RDF/SPARQL background as well
as users with deep knowledge of semantic technologies, to decide on the future
direction regarding which approach and features best suit a SPARQL query
builder aimed at a fairly wide variety of users.

Acknowledgments
The work presented in this paper has been funded (in part) by Science
Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2) and (in part) by
the European project NEPOMUK No. FP6-027705.

REFERENCES
 1. N. Athanasis, V. Christophides, and D. Kotzinos. Generating on the fly
    queries for the semantic web: The ICS-FORTH graphical RQL interface (GRQL).
    Lecture notes in computer science, pages 486–501, 2004.
 2. J. Borsje, H. Embregts, and S. F. Frasincar. Graphical query composition
    and natural language processing in an rdf visualization interface, 2006.
 3. T. Catarci, P. Dongilli, T. Di Mascio, E. Franconi, G. Santucci, and
    S. Tessaris. An ontology based visual tool for query formulation support.
    In ECAI, volume 16, page 308, 2004.
 4. S. Decker and M. R. Frank. The networked semantic desktop. In WWW Workshop
    on Application Design, Development and Implementation Issues in the
    Semantic Web, 2004.
 5. A. Fadhil and V. Haarslev. Gloo: A graphical query language for owl
    ontologies. In B. C. Grau, P. Hitzler, C. Shankey, and E. Wallace, editors,
    OWLED, volume 216 of CEUR Workshop Proceedings. CEUR-WS.org, 2006.
 6. T. Groza, S. Handschuh, K. Möller, G. Grimnes, L. Sauermann, E. Minack,
    C. Mesnage, M. Jazayeri, G. Reif, and R. Gudjónsdóttir. The NEPOMUK
    project - on the way to the social semantic desktop. In T. Pellegrini and
    S. Schaffert, editors, Proceedings of I-Semantics' 07, pages 201–211. JUCS,
    2007.
 7. F. Hogenboom, V. Milea, F. Frasincar, and U. Kaymak. RDF-GL: A SPARQL-Based
    Graphical Query Language for RDF.
 8. M. Jarrar and M. D. Dikaiakos. Mashql: a query-by-diagram topping sparql.
    In ONISW '08: Proceeding of the 2nd international workshop on Ontologies
    and information systems for the semantic web, pages 89–96, New York, NY,
    USA, 2008. ACM.
 9. K. Möller. Lifecycle Support for Data on the Semantic Web. PhD thesis,
    National University of Ireland, Galway, 2009.
10. K. Möller, S. Handschuh, S. Trug, L. Josan, and S. Decker. Demo: Visual
    programming for the semantic desktop with Konduit. In 5th European Semantic
    Web Conference (ESWC2008), Tenerife, Spain, volume 5021 of LNCS, pages
    849–853. Springer, June 2008.
11. E. Prud'hommeaux and A. Seaborne. SPARQL query language for RDF.
    Recommendation, W3C, January 2008. http://www.w3.org/TR/rdf-sparql-query/.
12. P. R. Smart and Russell. A visual approach to semantic query design using a
    web-based graphical query designer. In EKAW '08: Proceedings of the 16th
    international conference on Knowledge Engineering, pages 275–291, Berlin,
    Heidelberg, 2008. Springer-Verlag.