<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A General-Purpose Visual Query Language for Knowledge Graphs with Bidirectional Transformations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Qiang Fu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xin Wang</string-name>
          <email>wangxg@tju.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuan-Fang Li</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Intelligence and Computing, Tianjin University</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Monash University</institution>
          ,
          <addr-line>Melbourne</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we present a general-purpose interactive visual query language, GPVQL, to improve the efficiency of end-users' understanding and querying of knowledge graphs. Furthermore, GPVQL realizes the novel capability of flexible bidirectional transformations between query patterns and graph results, and therefore significantly assists end-users in formulating queries over large and unfamiliar knowledge graphs in an incremental way. We present the syntax and semantics of GPVQL, discuss the design rationale behind this interactive visual query language, and evaluate the effectiveness of a visual query system based on GPVQL against a number of textual and visual query environments over a large knowledge graph, DBpedia. Our evaluation demonstrates GPVQL's superiority in effectiveness and accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge graphs</kwd>
        <kwd>Bidirectional transformation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Artificial intelligence has become a powerful tool to meet practical requirements
in various domains. Knowledge graphs have been identified as a critical
component in diverse AI-based applications. Thus, designing query languages to
support effective and efficient exploration and query answering over
knowledge graphs has become a key research problem under intensive investigation.
However, a number of key challenges remain regarding the generalizability and ease
of use of these query languages.</p>
      <p>An illustrative example of the process from a question to the corresponding
query results is shown in Fig. 1. As can be seen from the textual query (in
SPARQL) at the top middle of Fig. 1, it may be difficult for end-users to
quickly learn and use such a query language, or to become familiar with a knowledge
graph. Visual query languages can make it easier for users to construct query
patterns, as shown in the bottom middle of Fig. 1. However, end-users may not
understand the correspondence between a query pattern and its results (such as
the nodes labeled with the same variable name at the bottom middle/right of
Fig. 1) when they are exploring a knowledge graph. Such correspondences allow a
user to easily modify and expand the current query to build more complex ones,
and they are supported by the bidirectional transformation functionality provided
by our GPVQL visual language. In this paper, we introduce the visual syntax
and semantics of GPVQL, discuss our design rationale, and demonstrate its
accuracy and effectiveness in a user study.</p>
      <p>[Fig. 1. Panels: Question ("Find philosophers who were influenced by people
that influenced Bob Black."); Textual language (SELECT * WHERE { Bob Black
influencedBy ?x. ?y influencedBy ?x. }); Visual language (a query pattern over
Bob Black and the variables ?x, ?y, ?z, ?m, and ?n, with influencedBy, influenced,
birthYear, and deathYear edges); Graphical result. Arrow labels: Edit, Draw,
Unidirectional query, Bidirectional transformation.]</p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>Our Approach</title>
      <p>
        In GPVQL, (1) single circles with dotted and solid lines indicate variables and
constants, respectively; (2) two single circles connected by a directed edge denote
a basic triple pattern; and (3) double circles and rectangles represent operators
and parameters in a query pattern, respectively. We illustrate the syntax and
semantics of GPVQL with five examples in Table 1, together with the corresponding queries
in three textual query languages: SPARQL, Cypher, and Gremlin. With GPVQL,
end-users do not need to learn a specific textual query language to write queries.
All they need to learn are query examples, which can then be used in
an interactive query-by-example (QBE [2]) way.
      </p>
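      <p>The first two notational rules can be sketched in code. The following is a
minimal Python illustration under stated assumptions, not the GPVQL implementation:
the names Node, Edge, and to_sparql are hypothetical, and the prefixed names
dbr:Bob_Black and dbo:influencedBy assume that the usual DBpedia prefixes are
declared elsewhere. It models a circle as a variable or constant node, a directed
edge as one basic triple pattern, and serializes the textual query of Fig. 1.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    """A circle in the visual pattern: dotted = variable, solid = constant."""
    label: str
    is_variable: bool

    def to_term(self) -> str:
        # Variables are written ?name; constants are kept as given.
        return "?" + self.label if self.is_variable else self.label


@dataclass(frozen=True)
class Edge:
    """A directed edge between two circles, i.e. one basic triple pattern."""
    source: Node
    predicate: str
    target: Node

    def to_triple(self) -> str:
        return f"{self.source.to_term()} {self.predicate} {self.target.to_term()} ."


def to_sparql(edges: List[Edge]) -> str:
    """Serialize the edges as a SPARQL SELECT over a basic graph pattern."""
    body = "\n".join("  " + edge.to_triple() for edge in edges)
    return "SELECT * WHERE {\n" + body + "\n}"


# Hypothetical encoding of the pattern in Fig. 1 (prefixes assumed declared).
bob = Node("dbr:Bob_Black", is_variable=False)
x = Node("x", is_variable=True)
y = Node("y", is_variable=True)
pattern = [
    Edge(bob, "dbo:influencedBy", x),
    Edge(y, "dbo:influencedBy", x),
]
print(to_sparql(pattern))
```
      </preformat>
      <p>Operators and parameters (double circles and rectangles), such as Filter,
Optional, and Union, are deliberately left out of this sketch.</p>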
      <p>Based on the proposed GPVQL language, we have developed a visual
interface that addresses these challenges, with the goal of enabling end-users to
query and explore knowledge graphs. The interface<sup>3</sup> of GPVQL is composed of
six main components that are displayed in four panes in Fig. 2. Fig. 2 shows
three components: an interactive visual query editor (left pane), a keyword and
type search panel (top right pane), and a query code display panel (bottom right
pane). In the top left are buttons for additional functionality: adding new nodes,
showing the user manual and user study documents, applying a force-directed layout, etc. In
GPVQL, a query pattern is constructed mainly in a drag-and-drop
manner. The operations (i.e., Expand, Collapse, Filter, Lock, Optional, and
Union) are available in the right-click context menu.</p>
      <p>As shown in Fig. 2, GPVQL supports
not only keyword queries but also type-based queries as the starting point. The
process starts with a blank visual query editor. A user can choose a keyword or a
type as the starting point for a query. For example, as shown in Fig. 2, after the
user adds a new node ① and inputs "Bob" as a keyword ②, the related entities
are displayed automatically. Once "Bob Black" is chosen, the thumbnail ③
is displayed automatically. Further, the user can find people who have
influenced both Bob Black and people influenced by Bob Black (the corresponding
query pattern is marked with red arrows). Users can select the Expand
operation in the context menu to expand a result. The expanded results are indicated
by blue arrows. When users are interested in a particular query result, deep
exploration to query additional information is supported. For example, users can
find the birth year and death year of Peter Kropotkin, or people who influenced
Marshall Sahlins (indicated by green arrows).</p>
    </sec>
    <sec id="sec-3">
      <title>User Study and Evaluation</title>
      <p>We conducted a user study to assess how accurate and effective our new visual
query language GPVQL<sup>3</sup> is in comparison to the current semantic query
language SPARQL, the visual query language QueryVOWL [1], and RDF Explorer [3].
We used the same SPARQL endpoint<sup>4</sup> of the DBpedia knowledge graph to
ensure the fairness of the study. We created five tasks, shown in Table 2, based
on an informal survey of interesting patterns that people formed when exploring
prior graph query research. We recruited 20 participants in total and
divided them evenly into four groups. Each group used a different tool:
Group A used GPVQL, Group B SPARQL, Group C QueryVOWL, and Group D
RDF Explorer.</p>
      <p>We ranked the difficulty of each task based on the number of nodes, edges,
and constraints needed. As can be seen in Table 2, the tasks are related to each
other, and later tasks can be built incrementally from earlier ones. This
design is intended to simulate the actual workflow in a real-world setting, where
an end-user typically writes related and incrementally more complex queries
rather than unrelated random ones.</p>
      <p>
        From Fig. 3, we can draw the following conclusions: (1) across the five
tasks, the query completion times of GPVQL are the shortest; (2) as the tasks
become more difficult, the query completion time of SPARQL gradually increases
and its accuracy gradually decreases. GPVQL, by contrast, outperforms all the
other systems (SPARQL, QueryVOWL, and RDF Explorer), and its accuracy
remains above 60%. As shown in Fig. 3(b), GPVQL moderately
outperforms QueryVOWL on accuracy for all tasks, and significantly outperforms
it for the more complex ones, i.e., Task 4 and Task 5. Compared
with RDF Explorer, GPVQL reduces the completion time by approximately 15% and
increases the accuracy by about 10%. In summary, our evaluation demonstrates
the superiority of GPVQL in both completion time and accuracy.
      </p>
      <p>[Fig. 3. (a) Mean completion time for each tool per task. (b) Mean accuracy
for each tool per task. Tools: GPVQL, SPARQL, QueryVOWL, RDF Explorer; tasks:
Task 1 to Task 5.]</p>
      <p><sup>3</sup> http://gpvql.gq/</p>
      <p><sup>4</sup> http://dbpedia.org/sparql/</p>
      <p>We believe that the main reasons for GPVQL's superiority are bidirectional
transformation and the deep exploration it facilitates. When the query task changes,
traditional query languages require users to construct new queries from scratch,
which reduces speed, usability, and user-friendliness. With the bidirectional
transformation in GPVQL, users can use the results of the previous query
pattern as the input of the next query pattern, which reduces the difficulty of
creating queries and the time it takes.</p>
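      <p>The idea of feeding a result back into the next query pattern can be
illustrated with a small Python sketch. This is an assumption-laden illustration
and not how GPVQL is implemented: the functions bind_result and expand are
hypothetical, patterns are plain tuples of strings, and the prefixed names
(dbr:Bob_Black, dbr:Peter_Kropotkin, dbo:influencedBy, dbo:birthYear) assume the
usual DBpedia prefixes.</p>
      <preformat>
```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def bind_result(pattern: List[Triple], binding: Dict[str, str]) -> List[Triple]:
    """Replace bound variables in a pattern with the constants a user picked."""
    return [tuple(binding.get(term, term) for term in triple) for triple in pattern]


def expand(pattern: List[Triple], node: str, predicate: str, new_var: str) -> List[Triple]:
    """Add one outgoing edge from an existing node to a fresh variable."""
    return pattern + [(node, predicate, new_var)]


# Previous query pattern: who influenced Bob Black?
previous = [("dbr:Bob_Black", "dbo:influencedBy", "?x")]

# Suppose the user selects one result node for ?x in the graph view.
chosen = {"?x": "dbr:Peter_Kropotkin"}

# The selected result becomes a constant in the next pattern, which is then
# expanded with a new edge instead of being rewritten from scratch.
next_pattern = bind_result(previous, chosen)
next_pattern = expand(next_pattern, "dbr:Peter_Kropotkin", "dbo:birthYear", "?z")
print(next_pattern)
```
      </preformat>
      <p>The previous pattern is left unchanged, so a user could return to it and
select a different result as the next starting point.</p>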
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper, we propose GPVQL, a general-purpose visual query language. The main
advantage of GPVQL is the flexible bidirectional transformation between query
patterns and graph results, which helps end-users gain insights
into large-scale knowledge graphs and eliminates the boundary between query
patterns and graph results. We experimentally validated the accuracy and
effectiveness of GPVQL and its associated interface, showing its superiority over
other visual query languages.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Haag</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lohmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siek</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ertl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>QueryVOWL: A visual query notation for linked data</article-title>
          .
          <source>In: International Semantic Web Conference</source>
          . pp.
          <fpage>387</fpage>
          –
          <lpage>402</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Jayaram</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elmasri</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Querying knowledge graphs by example entity tuples</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>27</volume>
          (
          <issue>10</issue>
          ),
          <fpage>2797</fpage>
          –
          <lpage>2811</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Vargas</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buil-Aranda</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>RDF Explorer: A visual SPARQL query builder</article-title>
          .
          <source>In: International Semantic Web Conference</source>
          . pp.
          <fpage>647</fpage>
          –
          <lpage>663</lpage>
          . Springer (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>