Distributed SPARQL Throughput Increase: On the effectiveness of Workload-driven RDF partitioning

                      Cosmin Basca and Abraham Bernstein

      DDIS, Department of Informatics, University of Zurich, Zurich, Switzerland
                               {lastname}@ifi.uzh.ch


1     Introduction
The current size and expansion of the Web of Data (WoD), as shown by the stag-
gering growth of the Linked Open Data (LOD) project1 , which reached over 31
billion triples towards the end of 2011, leave federated and distributed Semantic
DBMSs (SDBMSs) facing the open challenge of scalable SPARQL query pro-
cessing. Traditionally, SDBMSs push the burden of runtime efficiency onto the
query optimizer. In many cases (i.e., for queries with many and/or non-trivial
joins) this is too late. Extensive research in the general field of databases has iden-
tified partitioning, in particular horizontal partitioning, as a primary means to
achieve scalability. Similarly to [2], we adopt the assumption that minimizing the
number of distributed joins by reorganizing the data over the participating
nodes will lead to increased throughput in distributed SDBMSs. Consequently,
the benefit of reducing the number of distributed joins in this context is twofold:
    A) Query optimization becomes simpler. Generally regarded as a hard prob-
lem in a distributed setup, query optimization benefits, at all execution levels,
from fewer distributed joins. During source selection the optimizer can use spe-
cialized indexes such as those in [5], while during query planning better query plans
can be devised more quickly, since much of the optimization burden and complexity
is shifted away from the distributed optimizer to the local optimizers.
    B) Query execution becomes faster. Not having to pay the overhead of
shipping partial results around naturally reduces the time spent waiting for
the usually higher-latency network transfers. Furthermore, federated SDBMSs incur
higher costs, as they additionally have to serialize and deserialize data.
     The main contributions of this poster are: i) the presentation of a novel
and naïve workload-based RDF partitioning method2 and ii) an evaluation and
study using a large real-world query log and dataset. Specifically, we investigate
the impact of various method-specific parameters and query log sizes, comparing
the performance of our method with traditional partitioning approaches.
1 http://linkeddata.org/
2 This work was partially supported by the Swiss National Science Foundation under contract number 200021-118000.

2     Method Overview
Traditional approaches like Schism [2] construct a graph representation where ver-
texes are tuples that participate in workload transactions. The graph is then extended
to include all other tuples using a partition-trained classifier. Following this idea,
triples would be considered vertexes, while edges would be created whenever two triples
participate in the same query. This, however, is not feasible for RDF data.
Following this graph representation, our early attempts led to very dense graphs,
which proved to be too large for state-of-the-art graph partitioning software like
Metis [4].3 Next, we detail all steps seen in Figure 1, except the simpler 1st phase,
where queries and their results are logged.

Fig. 1. A simple generalized view of the partitioning process. (Phases: 1) SPARQL logging at the federator / SPARQL broker endpoint; 2) Query Space: query graph partitioning; 3) Triple Space: replication & propagation; 4) Triple Space: placement into partitions A, B, C.)

3 The resulting input edge file amounted to approx. 150GB on disk, crashing Metis.
Data Representation & Graph Partitioning. Since mapping triples to ver-
texes does not scale well for RDF data, we pursued an intermediate representa-
tion: the Queries graph. Each query in the workload becomes a vertex, while
edges between queries are formed when some triples participate in more than
one query, with the number of common triples as edge weights (Figure 2).
Finally, we apply Metis on the newly formed queries graph, forcing balanced
partitions as a result of the graph-cut operation.

Fig. 2. Basic query log-driven data representation. (Queries Q1-Q5, annotated with the sizes of their triple sets, are linked by edges weighted with the number of shared triples; dashed edges, i.e. (Q1,Q4) and (Q1,Q3), cross the cut between Partition 1 and Partition 2 and are replicated.)
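
To make the construction concrete, the following is a minimal Python sketch of how such a queries graph could be assembled, assuming each logged query has already been resolved to the set of triples it matched; the function name and data layout are our own illustration, not the authors' implementation. The vertex weights (query sizes) and edge weights (shared-triple counts) are what would then be handed to Metis for the balanced K-way cut.

```python
from collections import defaultdict
from itertools import combinations

def build_query_graph(query_triples):
    """Assemble the weighted 'queries graph' (illustrative sketch).

    query_triples: dict mapping a query id to the set of (s, p, o)
    triples it touched when resolved against the store.
    Returns vertex weights (triples per query) and edge weights
    (number of triples shared by each pair of queries).
    """
    sizes = {q: len(ts) for q, ts in query_triples.items()}
    edges = defaultdict(int)
    for q1, q2 in combinations(sorted(query_triples), 2):
        shared = len(query_triples[q1] & query_triples[q2])
        if shared:                       # edge only if triples overlap
            edges[(q1, q2)] = shared     # weight = number of common triples
    return sizes, dict(edges)

# 'sizes' and 'edges' can be serialized in the Metis input format and
# cut into K balanced partitions; each query vertex then carries its
# triples into the partition it was assigned to.
```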
Replication. After performing the graph-cut, there will still be distributed joins
even for the workload queries (i.e., query Q1 will require a distributed join while
query Q2 will not). A straightforward solution is to replicate the triples that reside
on the border between partitions. We proceed by identifying the minimum set
of triples that needs to be replicated, copying the extra triples from the smaller-
sized query (i.e., copying the extra triples of query Q1 over to Partition 2).
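
The border-replication step could look roughly as follows, assuming the cut above has already produced a query-to-partition assignment and an initial triple placement; all names here are illustrative, not taken from the paper.

```python
def replicate_across_cut(query_triples, assignment, edges, partitions):
    """For every edge of the queries graph that crosses the cut, copy
    the triples of the smaller query into the partition of the larger
    one (as done for Q1 and Partition 2 in Figure 2), so that both
    queries can be answered without a distributed join.

    assignment: query id -> partition id (result of the graph cut)
    partitions: partition id -> set of triples currently stored there
    """
    for q1, q2 in edges:
        p1, p2 = assignment[q1], assignment[q2]
        if p1 == p2:
            continue                       # edge does not cross the cut
        if len(query_triples[q1]) <= len(query_triples[q2]):
            small, target = q1, p2         # replicate the smaller query
        else:
            small, target = q2, p1
        missing = query_triples[small] - partitions[target]
        partitions[target] |= missing      # copy only what is absent
    return partitions
```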
Propagation. While the process outlined so far guarantees that each query in
the workload can be executed without a single distributed join, there are no
guarantees for future, unknown queries. A method of expanding the set of all
triples which participate in the workload queries is needed. For this we rely
on the principle of (Spatial) Locality of Reference [3], adapted to the logical
graph representation. In effect, we propagate along the edges in the original
RDF data graph to identify new triples "related" to the existing triples which
participated in the workload queries. Hence, we perform an n-hop4 edge
propagation matching the following triple patterns given a triple <s,p,o>
(also depicted in Figure 3): a) siblings: <s,?,?>, b) outgoing edges: <o,?,?>
and c) incoming edges: <?,?,s>. The remaining dataset triples, which have
not been considered so far, are either randomly distributed to the K selected
partitions or assigned by hashing on the subject.

Fig. 3. Visual depiction of the propagation patterns around a given triple <S,P,O>.
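
A hedged sketch of this propagation step, assuming the RDF dataset fits in memory and restricting further hops to incoming and outgoing edges as per footnote 4; the indexing scheme and function name are our own illustration.

```python
def propagate(seed_triples, dataset, hops):
    """Expand a partition's triples by n-hop propagation along the
    patterns of Figure 3: siblings <s,?,?>, outgoing edges <o,?,?>
    and incoming edges <?,?,s>, starting from a set of seed triples.

    dataset: iterable of all (s, p, o) triples in the RDF graph.
    """
    by_subject, by_object = {}, {}
    for t in dataset:                          # simple in-memory indexes
        by_subject.setdefault(t[0], set()).add(t)
        by_object.setdefault(t[2], set()).add(t)

    selected = set(seed_triples)
    frontier = set(seed_triples)
    for hop in range(hops):
        new = set()
        for s, p, o in frontier:
            if hop == 0:
                new |= by_subject.get(s, set())   # siblings  <s,?,?>
            new |= by_subject.get(o, set())       # outgoing  <o,?,?>
            new |= by_object.get(s, set())        # incoming  <?,?,s>
        frontier = new - selected                 # hop only from new triples
        selected |= new
    return selected
```

With hops = 0 the seeds are returned unchanged, matching the 0-hop configuration used in the evaluation.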
3     Results & Conclusions
We make use of the USEWOD Data Challenge [1] log file to extract 400k valid
and well-formed SPARQL SELECT queries that produce at least one result; all
other log entries are discarded. We use a local instance of the Virtuoso RDF
store to resolve them against DBpedia 3.5.1. Furthermore, we assume a perfect
distributed query optimizer, able to find the best possible query decomposition.
Measurements were conducted on a node with 72 GB of RAM and 8 cores @ 2.93 GHz.
    We compare our method against random partitioning, expert (manual) par-
titioning5 and hash partitioning. For the latter, we hash on all possible combina-
tions of a triple's components: S, P, O, SP, SO, PO and SPO.6 Given the small-to-average size of
the DBpedia dataset (ca. 43.6 million triples), we fixed the number of partitions
to K = 8, simulating a small-to-medium-sized cluster. Furthermore, we randomly
sampled the workload with sizes of 1k, 5k, 10k, 25k, 50k and 100k
queries from the total of 400k logged. The number of propagation hops was set
to 0, 1 and 2, respectively, while replication was enabled in all cases.
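
As an illustration of the hash-partitioning baselines, the sketch below assigns a triple to one of K partitions by hashing a chosen combination of its components; the paper uses the CityHash family (footnote 6), while a cryptographic hash from the standard library stands in here purely for portability.

```python
import hashlib

def hash_partition(triple, keys, k):
    """Assign a triple to one of k partitions by hashing the selected
    components, e.g. keys = "S", "SP" or "SPO" (illustrative stand-in
    for the CityHash-based scheme described in the paper).
    """
    s, p, o = triple
    component = {"S": s, "P": p, "O": o}
    key = "\x00".join(component[c] for c in keys)
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % k

# e.g. hash_partition(("dbr:Zurich", "dbo:country", "dbr:Switzerland"), "SP", 8)
```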
Fig. 4. The % of triples assigned to partitions from total triples, for each training query set. (Y-axis: percentage of triples assigned to partitions from the total, i.e. partitioned + replicated + propagated; X-axis: training workload sample size in # queries; series: Metis random 0, Metis random 1, Metis random 2.)
    As we can visually observe in Figure 4, although the increase in the num-
ber of triples reached through the partitioning process (excluding the triples not
connected at all) is significant from 50k to 100k queries, there are diminishing re-
turns as the expansion process starts to slow down. Indeed, doubling the number
of logged queries yields approximately a 10% increase at this point. Therefore,
we observe that a training log size of 50k queries represents an optimal point.
Performance Impact of Graph Partitioning. Even though the general
problem of graph partitioning is known to be NP-hard, the approximation algo-
rithm implemented in Metis performs very well, finalizing the queries graph cut
in 0.17 seconds for 1k queries and 0.71 seconds for 100k queries.
4 Multiple hops are only enabled for in- and out-edges, to avoid an expensive avalanche effect.
5 Each large dump (if > 1M triples) is assigned to its own partition; the remainder are grouped together.
6 We use the CityHash family of hash functions, due to its low collision rate and speed.
Fig. 5. The performance improvement relative to the lowest performing method (hash SPO). (Y-axis: performance improvement normalized to the slowest, in # distributed joins; X-axis: training workload sample size in # queries; series: Metis hash S 2, Metis hash S 1, Metis random 2, Metis random 1, Metis random 0, Hash S, Expert (Manual), Hash P, Hash SP, Hash O, Hash SO, Hash PO, Random, Hash SPO.)
Number of Hops Impact when Propagating. Figure 5 plots the relative
performance improvement over the lowest performing method. Non-workload-
driven partitioning methods appear as horizontal lines. The worst performing
one is the Hash SPO method, with Random and Hash PO/SO exposing simi-
lar performance levels. Hashing by subject (Hash S) performs best among these, followed
closely by the Expert (manual) distribution method. This could suggest that
the majority of the workload queries are dominated by star-shaped basic graph
patterns and contain few joins.7 When using random distribution of the remain-
ing triples, our method performs unsatisfactorily for smaller workload sizes, but
becomes substantially better by 50k queries. At 100k queries it reaches a performance
factor of 6.14, being 2.81 and 3.1 times better than Hash S and Expert, re-
spectively. When the remaining triples are distributed based on subject hashes, the
method outperforms all other methods at all workload sizes. At best (Metis hash
S 2), our method is between 3.6 and 8.66 times better than the lowest performing
method and up to 5.32 times better than hashing by subject. In essence, the best-case
partitioning would produce an average of 0.10 distributed joins per query.
References
1. B. Berendt, L. Hollink, V. Hollink, M. Luczak-Rösch, K. H. Möller, and D. Vallet.
   USEWOD2011 - 1st International Workshop on Usage Analysis and the Web of Data. In
   20th International World Wide Web Conference (WWW2011), Hyderabad, India, 2011.
2. C. Curino, E. Jones, Y. Zhang, and S. Madden. Schism: a workload-driven approach
   to database replication and partitioning. Proceedings of the VLDB Endowment,
   3:48–57, Sept. 2010.
3. P. J. Denning. The locality principle. Communications of the ACM, 48, July 2005.
4. G. Karypis and V. Kumar. MeTis: Unstructured Graph Partitioning and Sparse
   Matrix Ordering System, Version 4.0. http://www.cs.umn.edu/~metis, 2009.
5. Y. Yan, C. Wang, A. Zhou, W. Qian, L. Ma, and Y. Pan. Efficient Indices Using
   Graph Partitioning in RDF Triple Stores. In ICDE 2009: IEEE 25th International
   Conference on Data Engineering, pages 1263–1266, 2009.


7 A fact we intend to investigate in depth in the near future.