<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How to bring some MAGIC to SPARQL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christina Ehrlinger</string-name>
          <email>Christina.Ehrlinger@uni-passau.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University Passau</institution>
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>For SPARQL, the query language to retrieve information from an RDF graph, the optimization of the order of the containing triples is a challenging topical issue. In this paper, we show an approach which passes the information about the already found binding for a variable to all other occurrences of this variable in combination with a cost model in order to minimize the execution time. Di erent to other approaches our approach as described in this paper can be applied to a SPARQL query, which not only consists of basic graph pattern, but also contains group graph pattern like FILTER or OPTIONAL. Our experiments show the applicability of our approach and function as preliminary proof of concept.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>The structure of the web changes from linked documents,
which contain the information, to directly link the
information using the keyword Linked Open Data. This direct linking
of information generates a huge number of quite big graphs.
In order to handle these graphs, the Resource
Description Framework RDF 1 is one of the used standard models
for data interchange on the Web. It uses Uniform Resource
Identi ers (URIs) to refer to resources in order to connect
them. The resulting structure is a directed, labeled graph,
where the edges represent the connection between the
resources. These RDF graphs are generated very easily due to
the absence of a schema. There exist specialized databases
for storing these RDF graphs, so-called triple stores 2.</p>
      <p>
        The SPARQL query language [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is used to access the
information stored in these RDF graphs. These SPARQL
queries must be performant in order to nd the desired
information in these huge graphs. For optimizing the execution
time of a SPARQL query we can consider two di erent
approaches. First, we can design a triple store, which optimizes
the storing or the usage of indices. There are also approaches
1https://www.w3.org/RDF/
2https://www.w3.org/wiki/LargeTripleStores
which focus on this like the RDF-3X triple store [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Second,
we can rewrite the SPARQL query in order to optimize the
order of the triples.
      </p>
      <p>The same kind of optimizations took place for SQL. There
are highly performant database systems which also considers
the storage of the relations as well as techniques to optimize
the order of the joins, which have to be done in order to
execute SQL queries with more than one relation.</p>
      <p>The aim of this approach is not an optimization, which
perfectly ts into a self-designed triple store. It is a generic
way how to reorder SPARQL queries in order to use their
full potential according to their execution time.</p>
      <p>The rest of the paper is organized as follows. Section 2
describes the problem statement for the reordering of the
triples and states a motivational example. Section 3
presents our approach for using the sideways information
passing rst for a query containing a basic graph pattern and
furthermore the adaption of the approach in order to use
other components besides to a basic graph pattern, for
example FILTER. Section 4 shows the experiments which were
executed so far. Section 5 discusses related work.
2.</p>
    </sec>
    <sec id="sec-2">
      <title>PROBLEM STATEMENT</title>
      <p>
        In order to present our approach for reordering a
SPARQL query, we rst have a closer look at a simple SPARQL
query and discuss the impact of di erent orderings.
Considering the SPARQL query in Listing 1. This query can be
executed against a dataset generated using the Lehigh
University Benchmark (LUBM) data generator [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It consists
of 5 triples, whereas triple t5 also contains a literal.
      </p>
      <sec id="sec-2-1">
        <title>SELECT</title>
        <p>WHERE f
?p : t e a c h e r O f ? c .
? s : t a k e s C o u r s e ? c .
? s : a d v i s o r ?p .
? s : memberOf ?d .</p>
        <p>?d : name ' Department0 ' .
g
#t 1
#t 2
#t 3
#t 4
#t 5
Listing 1: Example for a SPARQL query</p>
        <p>While executing the query as shown in Listing 1 using
the common triple store GraphDB 3, the query has an
average execution time of 4606 ms while executing ten times.
If we slightly change the order of the triples, we get the
query shown in Listing 2, whereas this query has an
average execution time of 12283 ms (executed ten times). Even
for this simple query executed against a quite small dataset
of approximately 130 000 triples we see the huge impact of
reordering the triples of a SPARQL query.</p>
      </sec>
      <sec id="sec-2-2">
        <title>SELECT</title>
        <p>WHERE f
? s : t a k e s C o u r s e ? c .
?p : t e a c h e r O f ? c .
? s : memberOf ?d .
? s : a d v i s o r ?p .</p>
        <p>?d : name ' Department0 ' .</p>
        <p>Listing 2: Reordered SPARQL query from Listing 1</p>
        <p>We also nd examples of reordering which lead to a high
improvement regarding the execution time. In Listing 3 we
again have the same SPARQL query apart from the ordering
of the triples. This query has an average execution time of
325 ms.</p>
      </sec>
      <sec id="sec-2-3">
        <title>SELECT</title>
        <p>WHERE f
?d : name ' Department0 ' .
? s : memberOf ?d .
? s : a d v i s o r ?p .
? s : t a k e s C o u r s e ? c .</p>
        <p>?p : t e a c h e r O f ? c .</p>
        <p>
          The mechanism of reordering to get a more e cient query
is not a new idea. This was done in Datalog [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] while
applying the Magic Set Transformation [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] to a Datalog program
in order to sort the subgoals of a rule and the same idea
was also used while applying the join order optimization to
a SQL query in a relational database system [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          Especially for Datalog and SPARQL, the ordering of
subgoals or respectively triples has a huge impact on the
variables. The term Sideways Information Passing, short SIP,
was used for this in Datalog and it describes how the
bindings of variables are passed from one subgoal to another
subgoal. Depending on the order of the subgoals it results in
a variety of di erent SIP possibilities, called SIP strategies
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. For our approach, we adopt this term for SPARQL. In
the context of SPARQL, it describes the way the binding of
the variables is passed between the triples.
        </p>
        <p>Consider the query from Listing 3 again. We start with the
triple ?d :name 'Department0'. This generates bindings for
the variable ?d, which are passed to the second triple ?s
:memberOf ?d. Again this generates bindings for the
variable ?s, which is passed to the triples ?s :advisor ?p and
?s :takesCourse ?c.
3https://www.ontotext.com/products/graphdb/
#t 2
#t 1
#t 4
#t 3
#t 5
#t 5
#t 4
#t 3
#t 2
#t 1
t2
t3
t1
t4
t5</p>
        <p>In the following chapter, we present an approach in
order to use the sideways information passing for SPARQL
queries.
3.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>USING SIDEWAYS INFORMATION PAS</title>
    </sec>
    <sec id="sec-4">
      <title>SING FOR SPARQL QUERIES</title>
      <p>In order to optimize a SPARQL query regarding the SIP,
we can represent the query as a graph to visualize the
possible ways to pass the bindings. In Figure 1 we see the resulting
graph for the SPARQL query in Listing 1.</p>
      <p>Every node represents a triple pattern of the query. Two
nodes are connected with an undirected edge if they share
a common variable. Conceptually, the representation allows
us to determine the best order of the triples by transforming
the undirected graph in a directed graph using the approach
described in this paper, whereas the directed edge between
node t1 and node t2 denotes that t1 is in the ranking order
before t2.
Another way to visually represent the query is shown in
Figure 2. Here we represent every subject and object
position as a node, no matter if this is a variable, an IRI or a
blank node. For the triple pattern</p>
      <p>?s takesCourse ?c
we get two nodes representing the variables ?s and
respectively ?c and they are connected with a directed edge, which
represents the predicate takesCourse between these two
variables. The edge is directed from the node representing the
subject of the triple to the node representing the object of
the triple. For the triple ?s takesCourse ?c, the edge is
directed from the node representing the variable ?s to the node
representing the variable ?c.</p>
      <p>?c
This kind of representation describes the pattern we are
looking for while executing the query against a dataset.</p>
      <p>The approach described in this paper focuses on how to
rewrite the query with a more e cient ordering of the triples.
This is done based on the query graph in order to optimize
the sideways information passing.</p>
      <sec id="sec-4-1">
        <title>It consists of the following four steps:</title>
      </sec>
      <sec id="sec-4-2">
        <title>1. Generate query graph</title>
        <p>2. Transform undirected graph into directed graph
3. Determine order of the nodes according to the
direction of the edges</p>
      </sec>
      <sec id="sec-4-3">
        <title>4. Generate optimized SPARQL query</title>
        <p>According to the di erent elements of a SPARQL query,
we have a closer look at the rearrangement of a query only
consisting of a basic graph pattern and in a second step, we
also consider all possible group graph patterns of a SPARQL
query.
3.1</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Basic Graph Pattern Query</title>
      <p>
        Basic graph patterns (BGP) are sets of triple patterns
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The most simple SPARQL query consists of exactly one
triple pattern:
      </p>
      <p>?s ?p ?o</p>
      <p>
        One example of a query which consists only of a basic
graph pattern but more than one triple pattern is shown in
Listing 1. As described before, in the rst step we generate
the corresponding query graph for the SPARQL query from
Listing 1 as seen in Figure 1. During the second step, we
convert the undirected query graph into a directed graph.
In the following, we describe how this is achieved. Similar to
Datalog, while performing the Sideways Information Passing
as a preliminary step before the Magic Set Transformation
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we try to nd Literals or IRIs in the query to get a
starting point for generating bindings for the variables. In
general, we are looking for the node with the highest
number of bound positions (subject, predicate, object). In our
example, the only literal is the name of the department, so
the triple, which contains this literal, will be the rst triple
according to the ranking order.
      </p>
      <p>As explained before, this generates a binding for the
variable ?d. This binding is passed to all triples, which also
contain the variable ?d, no matter, if it occurs in the
subject or object position. In the query graph, this passing is
visualized by adding an outgoing direction to all undirected
edges of the current node, which corresponds to the current
triple. The search for a node, where the corresponding triple
has the highest number of bound positions and the
subsequent adding of directions to edges are repeated until all
edges are directed.</p>
      <p>This approach works ne for SPARQL queries, where the
corresponding query graph is basically a path. In our
example query, we can follow the path until node t4. Now more
than one node is a ected by adding the direction of an
edge and we have to decide which of the two nodes t2 and t3
should be processed next. In order to solve this, we have
introduced a cost-based model to pick the best next node out
of all possible nodes.</p>
      <p>The chosen cost model takes into account the average
occurrence of every property but in addition, also considers
the position of the common variable. For every property p
we determine two di erent numbers, so-called outgoing and
incoming value. The naming of these values origins in the
SPARQL graph representation. The outgoing value
describes the average amount of outgoing edges of this property
b
takesCourse
advisor
c
e
h
for a node if this node has at least one outgoing edge of
this property. In general, the outgoing value for property p
determines the average number of awaiting bindings for the
object position while the subject position already has a
binding, whereas the incoming value determines the average
number of awaiting bindings for the subject position while
the object position already has a binding.</p>
      <p>We also have to consider the cost model in order to choose
the rst node, if the query contains only triples of the form
?s p ?o, where only the predicate position p is bound and
not a single literal or IRI is part of the query.</p>
      <p>In Figure 3 we see a small instance graph as an example of
the generated RDF graphs by LUBM. The shape of the
node represents the type of the node. A rectangle represents a
student, a triangle represents a professor and a circle
represents a course. The outgoing and incoming values resulting
from the small example can be seen in Table 1.</p>
      <p>property
advisor
takesCourse
teacherOf</p>
      <p>For a better understanding, we have a closer look at the
calculations for the outgoing and incoming values. Consider
the property takesCourse. We sum up all edges with the
label takesCourse. Node e has three edges, the node b has two
edges and the node g has one edge with the label
takesCourse. Overall we have six edges with this label. To determine
the outgoing value we divide this sum by the number of
nodes, which have an outgoing edge with this label (in this
example three nodes). Overall we get an outgoing value of 2
for the property takesCourse. Similar calculations are done
for the incoming value for the property takesCourse. Now
we consider all nodes, which have an incoming edge with
this label. Node a has two edges as well as node c. Both
nodes f and h have one edge. Again the number of edges
is divided by the number of nodes in order to calculate the
average amount. Overall we get an incoming value of 1.5 for
the property takesCourse.</p>
      <p>So we have two di erent values for every property. Due to
this distinction, we take into account the sideways
information passing to the subject or rather the object position.</p>
      <p>Using this cost model we can now determine which node
to choose in our example from Figure 1. After handling the
nodes t5 and t4 we have to choose between the nodes t2 and
t3. In both nodes the corresponding triple contains the
already bound variable in the subject position, so we compare
the outgoing values of both predicates as seen in Table 1 and
choose the smaller one in order to minimize the intermitted
results. Because advisor has an outgoing value of 1 while
takesCourse has an outgoing value of 2, we choose node t3
to be the subsequent processed node.</p>
      <p>Based on the number of bound positions and if equivocal
based on the presented cost model we are able to transform
the undirected query graph into a directed graph. In the
third step, we can use topological sorting for extracting the
resulting order of the nodes from the directed graph. In the
last step, we map the node to the corresponding triple in
order to get the optimized SPARQL query.
3.2</p>
    </sec>
    <sec id="sec-6">
      <title>Handling FILTERs</title>
      <p>
        Until now we have considered SPARQL queries which
consist of a basic graph pattern or in other words of a set of
triple patterns. On the one hand, these triple patterns
generate bindings for variables. On the other hand, they are also
consuming the binding of variables for the subject position
in order to generate a binding for the object position or vice
versa. Overall a triple pattern can generate and can
consume bindings for variables. Things are slightly di erent as it
comes to FILTERs in SPARQL. SPARQL FILTERs restrict
solutions to those for which the lter expression evaluates
to TRUE [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. So a FILTER can only consume bindings for
the variables, but can not generate any bindings. In order to
handle this, we have to slightly adapt our present approach.
While generating the query graph, the approach stays the
same, so for every lter in the query, we add a new node
to the query graph next to the nodes for every triple. In
order to distinguish the node for a triple and the node for
a lter, the node for a lter is drawn with a bold line. For
a better di erentiation, we also name the node for the lter
consuming node. The meaning of the edges stays the same.
Consider the following SPARQL query, which is the same
SPARQL query as in Listing 1 extended with one FILTER
expression:
      </p>
      <sec id="sec-6-1">
        <title>SELECT</title>
        <p>WHERE f
?p t e a c h e r O f ? c .
? s t a k e s C o u r s e ? c .
? s a d v i s o r ?p .
? s memberOf ?d .
?d name ' Department0 ' .</p>
        <p>FILTER( ? p !=
&lt;h t t p : / /www. Department0 .</p>
        <p>U n i v e r s i t y 0 . edu /</p>
        <p>A s s i s t a n t P r o f e s s o r 4 &gt; )
g
The corresponding query graph is shown in Figure 4.</p>
        <p>While transforming the query graph from an undirected
graph into a directed graph, we use the same approach as
discussed before. The only change regards the handling of
these consuming nodes. From a naive standpoint of view,
we can imagine that in general, it can not result in the very
best execution time if the lter is placed at an arbitrary
position. Based on the experiments in section 4 we were able
to substantiate the assumption to place the lter after all
triples, which have at least one common variable with the
lter condition, were processed. Based on the query graph,
all edges of the consuming node have to be directed in order
to be the next node to be processed.</p>
        <p>Considering our example query graph in Figure 4 we have
already processed the nodes t5, t4 and t3. In the current
state we have to choose between node t2, node t1 or the
consuming node t6. As described before, the consuming
node can only be a candidate, if all edges of the consuming
node were turned into incoming edges in order to get the
bindings from all relevant triples. So we have to choose
between t2 and t1. This is done using the approach presented
before. In doing so, we rst choose t2, afterwards t1. Now
all edges of the consuming node are directed and we can
add the lter to the order of the so far processed triples. If
there would be any undirected edge in the query graph, we
just continue using our approach. Overall this handling of
a lter in a SPARQL query does not mean we always add
the lter at the end of the query. Considering the following
lter regarding the variable ?s:</p>
        <p>FILTER(?s != http://www.Department0.
University0.edu/GraduateStudent29)</p>
        <p>This lter would be added in the current order after we
have processed all triples containing the variable ?s.</p>
        <p>In summary, we have adapted our approach to handle the
FILTER construct of a SPARQL query.
3.3</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Handling Group Graph Patterns next to</title>
    </sec>
    <sec id="sec-8">
      <title>FILTER</title>
      <p>Next to a FILTER, there are many more possibilities to
write a group graph pattern in SPARQL, for example
OPTIONAL or SERVICE. Also these elements can consume
bindings similar to a FILTER, but considering only the
triples contained in a group graph pattern, these triples also
generate bindings for each other.</p>
      <p>Consider the SPARQL query in Listing 5 with a BGP of
three triples and an OPTIONAL clause with two triples.
The idea is to handle those queries on di erent
abstraction levels. While reordering the triples which are contained
directly in the WHERE clause, the group graph pattern is
abstracted into one single node. On this basis, we can
consider the triple patterns and group graph patterns inside of
the current group graph pattern.
During the abstraction, the query graph for the rst level
contains four nodes. The nodes t1, t2 and t3 represent the
corresponding triple pattern while the node O represents the
OPTIONAL clause as shown in Figure 5. Based on this,
we consider a query graph containing the triples t4 and t5,
which are inside of the OPTIONAL. The incoming edges
for the second query graph represent the information about
already bound variables from outside the OPTIONAL as
shown in Figure 5.</p>
      <p>Until now we have only considered queries, which have at
most one OPTIONAL group graph pattern. This approach
is also applicable if a SPARQL query has more than one
group graph pattern. If this is the case, we generate a
consuming node for every group graph pattern and use the same
approach as described before. We only have to extend the
mechanism to handle several consuming nodes as potential
next nodes. In order to get the best possible order, we use
a preference ranking for the di erent kind of group graph
patterns. So for example, a FILTER is ranked above an
OPTIONAL. This approach is quite generic because a SPARQL
query can also be nested using an arbitrary number of
levels. Conceptually handling these elements like OPTIONAL
should be possible using our presented approach as stated
before. As described in section 4 we will need more
experiments and tests to show the applicability of our approach
for queries with group graph patterns like OPTIONAL.</p>
    </sec>
    <sec id="sec-9">
      <title>EXPERIMENTS</title>
      <p>
        In this section, we present the experiments carried out
to test our approach and to get an idea if this approach is
promising. We used the LUBM data generator [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to generate
a dataset of approximately 130 000 triples. These triples
are stored in GraphDB. One of the rst experiments was to
execute all possible permutations of the BGP of a SPARQL
query to get an idea about the di erent execution times
regarding the di erent orders of the triples. In addition, the
query was reordered using our presented approach. For every
query tested so far the reordered query was in the forefront
of all execution times from the permuted queries. To give an
idea how the execution times can di er, the distribution of
the execution times regarding all permutations of the query
presented in Listing 1 depicted in Figure 6. For this example,
our approach achieves the best execution time by on average
of 325 ms. All permutations were executed 10 times in order
to reach a good average.
      </p>
      <p>20
s
n
o
i
ta 15
t
u
m
r
e
fp10
o
s
r
e
b
um 5
n
0</p>
      <p>This test setup was executed with several di erent queries
but the overall picture is always the same. In Table 2 we see
a comparison between some of the LUBM queries regarding
the best possible execution time for one of the
permutations compared with the execution time achieved while
executing the permutation derived from our approach. In the
last column, we listed the percentage of permutations of the
respective query, which had the same or smaller execution
time than the derived permutation from our approach.</p>
      <p>LUBM
Query</p>
      <p>For LUBM query 2 our approach as presented in this
paper achieves an execution time of 16 ms, whereas the best
permutation achieves an execution time of 7 ms. Based on
these two numbers, it seems that our approach does not
perform well. If we add the information, that this query has 6
triples, so 720 possible permutations and only approximately
3% of these permutations have the same or better
execution time, our approach is in a better position. For some of
the queries, as LUBM query 7 with 4 triples, our approach
derived the best possible permutation based on the
execution time. These experiments also showed some examples like
LUBM query 12, where our approach does not determine a
good permutation of the query. These queries will be used
to improve our cost model.</p>
      <p>Based on the assumption the approach is using a good
heuristic, we have also examined the impact of the lter
placement. In order to do this, we used our approach to reorder
the BGP of the query, added the lter statement at every
possible position and compared the execution times again.
Also for this test setup, the experiment matches with the
expectations about placing the lter after all triples which
contain at least one of the variables of the lter. In Table 3
we see the impact of the lter placement.</p>
      <p>lter placement execution time
before t5 289 ms
after t5 279 ms
after t4 281 ms
after t3 277 ms
after t2 277 ms
after t1 274 ms</p>
      <p>Using our approach for the query as shown in Listing 4 we
place the lter after the triple t5, which is also the best
position according to all possible positions as seen in Table 3.
During our experiments, we have observed the impact of the
lter condition itself, but explicitly considering the
condition of the lter will be part of our future research. Overall
our performed experiments showed the applicability of our
approach.</p>
    </sec>
    <sec id="sec-10">
      <title>RELATED WORK</title>
      <p>
        In general, query optimization is a well-established area
especially for SQL, but regarding SPARQL queries it is a
quite current topic. In the following, we shortly describe
some di erent approaches, which use di erent methodologies
than our approach presented here. The Characteristic Set
approach [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] uses dynamic programming based algorithm on
a precomputed hierarchical structure, which allows
determining the best order of the triples. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] they translate a query
into a multidimensional vector space and perform
distancebased optimization by considering the relative di erences
between the triple patterns. Comparing di erent selectivity
estimations for a SPARQL query was done in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], whereby
this approach focusses only on BGP queries. It describes
different ways to compute heuristics for the optimization but
does not consider the structure of the triple. Our approach
lls this gap between using statistical data and taking the
structure of the query as well as the triple into account.
      </p>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>In this paper, we have presented our approach for
optimizing a SPARQL query by means of using the sideways
information passing. We show how to use our approach for
a SPARQL query only consisting of a basic graph pattern.
Based on this we showed how to adapt the approach in
order to handle queries, which also contain FILTERs or other
group graph patterns. The experiments based on the LUBM
dataset showed the impact on the execution time based on
the order of the triples. Also, these experiments functioned
as a preliminary proof of concept for our approach. While
testing the execution time of all possible permutations of the
triples, the order of triples, our approaches chooses, was in
the forefront of the smallest execution times.</p>
      <p>
        As described in the title, this is a work in progress paper.
Therefore these experiments are not the top of the agpole.
The next steps will include tests with bigger datasets
generated with the LUBM dataset generator as well as tests using
data and test queries from the Berlin SPARQL Benchmark
(BSBM) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. While the test queries from LUBM are quite
short, the queries tested in BSBM contain more triples and
also more complex patterns like FILTERs or OPTIONAL
clauses. Another benchmark which provides more complex
test queries than LUBM is the SP 2Bench [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which we want
to use as a source for test queries.
      </p>
      <p>In order to improve the reordering based on the current
approach we want to take into account semantic information.
This will be part of our future work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Beeri</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramakrishnan</surname>
          </string-name>
          .
          <article-title>On the power of magic</article-title>
          .
          <source>In Proceedings of the Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '87</source>
          , pages
          <fpage>269</fpage>
          {
          <fpage>284</fpage>
          , New York, NY, USA,
          <year>1987</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          and
          <string-name>
            <surname>A. Schultz.</surname>
          </string-name>
          <article-title>The berlin sparql benchmark</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst.</source>
          ,
          <volume>5</volume>
          :1{
          <fpage>24</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubichev</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          .
          <article-title>Exploiting the query structure for e cient join ordering in sparql queries</article-title>
          .
          <source>In EDBT</source>
          , pages
          <volume>439</volume>
          {
          <fpage>450</fpage>
          . OpenProceedings.org,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , and
          <string-name>
            <surname>J.</surname>
          </string-name>
          <article-title>He in. Lubm: A benchmark for owl knowledge base systems</article-title>
          .
          <source>J. Web Sem</source>
          .,
          <volume>3</volume>
          (
          <issue>2</issue>
          -3):
          <volume>158</volume>
          {
          <fpage>182</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Lloyd</surname>
          </string-name>
          .
          <source>Foundations of Logic Programming</source>
          . Springer-Verlag, Berlin, Heidelberg,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Meimaris</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Papastefanatos</surname>
          </string-name>
          .
          <article-title>Distance-based triple reordering for SPARQL query optimization</article-title>
          .
          <source>In 33rd IEEE International Conference on Data Engineering, ICDE</source>
          <year>2017</year>
          , San Diego, CA, USA, April
          <volume>19</volume>
          -
          <issue>22</issue>
          ,
          <year>2017</year>
          , pages
          <fpage>1559</fpage>
          {
          <fpage>1562</fpage>
          . IEEE Computer Society,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Weikum.</surname>
          </string-name>
          Rdf-3x:
          <article-title>a risc-style engine for rdf</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <volume>647</volume>
          {
          <fpage>659</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Prud</surname>
          </string-name>
          <article-title>'hommeaux and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne. SPARQL Query</surname>
          </string-name>
          <article-title>Language for RDF</article-title>
          .
          <source>W3C Recommendation</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hornung</surname>
          </string-name>
          , G. Lausen, and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinkel</surname>
          </string-name>
          .
          <article-title>Sp2bench: A SPARQL performance benchmark</article-title>
          .
          <source>CoRR, abs/0806.4627</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sippu</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Soisalon-Soininen</surname>
          </string-name>
          .
          <article-title>Multiple sip strategies and bottom-up adorning in logic query optimization</article-title>
          .
          <source>In ICDT</source>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Steinbrunn</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          <article-title>Moerkotte, and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          .
          <article-title>Heuristic and randomized optimization for the join ordering problem</article-title>
          .
          <source>The VLDB Journal</source>
          ,
          <volume>6</volume>
          (
          <issue>3</issue>
          ):
          <volume>191</volume>
          {
          <fpage>208</fpage>
          ,
          <string-name>
            <surname>Aug</surname>
          </string-name>
          .
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Stocker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kiefer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          .
          <article-title>Sparql basic graph pattern optimization using selectivity estimation</article-title>
          .
          <source>In Proceedings of the 17th International Conference on World Wide Web, WWW '08</source>
          , pages
          <fpage>595</fpage>
          {
          <fpage>604</fpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>