<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Physical Design Tuning of RDF Stores</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Saint Petersburg University</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>RDF is a data model that allows describing relationships between arbitrary entities in the subject-predicate-object form. While representing data in this form is convenient, processing it efficiently poses a number of challenges. Therefore, at present it is a highly active research venue with many existing approaches. One of these approaches is based on characteristic sets. Essentially, a characteristic set is a collection of records that describes an entity together with its properties. The use of characteristic sets has been shown to significantly speed up RDF query processing. In this paper we present a research proposal for a post-graduate project related to the partitioning of tables obtained by characteristic set extraction. Our goal is to design a partitioning method that would benefit the processing of such tables. In this report we survey existing RDF processing system approaches, briefly describe our vision of the partitioning approach and present a plan for further studies.</p>
      </abstract>
      <kwd-group>
        <kwd>RDF</kwd>
        <kwd>Physical Design</kwd>
        <kwd>Data Partitioning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Resource Description Framework (RDF) is a data model developed by the W3C
consortium for the Semantic Web. The main idea of RDF is to describe
real-world entities and the relationships between them in a machine-readable form. A
relationship between two entities in RDF is represented as a triple (subject,
predicate, object). This allows us to visualize an RDF dataset as a graph G =
(V, E), where V is the set of all entities and E is the set of all predicates. A
simple RDF graph is shown in Fig. 1.</p>
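      <p>To make the model concrete, the following Python sketch (entity names are illustrative, loosely following Fig. 1) builds such a graph from a list of triples:</p>
      <p>
```python
# Hypothetical triples mirroring Fig. 1 (identifiers are illustrative).
triples = [
    ("JLOutlaw", "foaf:name", "Johnny Lee Outlaw"),
    ("JLOutlaw", "foaf:gender", "Male"),
    ("JLOutlaw", "foaf:friendOf", "PGoodguy"),
    ("PGoodguy", "foaf:name", "Peter Goodguy"),
    ("PGoodguy", "foaf:mbox", "peter@example.org"),
]

# Build the graph G = (V, E): vertices are all entities (subjects and objects),
# edges are labeled by predicates.
V = {s for s, _, _ in triples} | {o for _, _, o in triples}
E = [(s, p, o) for s, p, o in triples]
```
      </p>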
      <p>SPARQL is a query language for RDF. Just like RDF data, any SPARQL
query can also be represented as a graph. A query in SPARQL uses so-called
patterns to declare constraints on query results. As shown in Fig. 2, a pattern
is a graph with variables in some of its nodes. In order to execute a SPARQL query
we need to match the data against a pattern.</p>
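      <p>A minimal sketch of such pattern matching, assuming variables are marked with a leading "?" (a simplified stand-in for a real SPARQL engine, not any particular system's implementation):</p>
      <p>
```python
def match_pattern(pattern, triple, bindings):
    """Try to unify one (s, p, o) pattern with one data triple.

    Terms starting with '?' are variables; returns extended bindings or None.
    """
    result = dict(bindings)
    for pat_term, data_term in zip(pattern, triple):
        if pat_term.startswith("?"):
            if result.get(pat_term, data_term) != data_term:
                return None  # variable already bound to a different value
            result[pat_term] = data_term
        elif pat_term != data_term:
            return None      # constant term does not match the data
    return result

def evaluate(patterns, triples):
    """Naive evaluation: extend bindings pattern by pattern (nested-loop join)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b2 for b in solutions for t in triples
                     if (b2 := match_pattern(pattern, t, b)) is not None]
    return solutions
```
      </p>
      <p>For a star pattern such as {(?X, foaf:name, ?name), (?X, foaf:mbox, ?mbox)}, every solution is a set of bindings for ?X, ?name and ?mbox that matches the data.</p>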
      <p>The goal of an RDF management system is to store RDF data and to
execute SPARQL queries in the most efficient manner. In order to reach that
goal, a large variety of systems have been developed, each with its own
purpose and features.</p>
      <p>[Fig. 1: a simple RDF graph connecting "Johnny Lee Outlaw" and "Peter Goodguy"
via foaf:friendOf, with foaf:name, foaf:gender and foaf:mbox properties.]</p>
    </sec>
    <sec id="sec-3">
      <title>Characteristic Sets</title>
      <p>[Fig. 2: a SPARQL query pattern, a star-shaped graph in which a variable node ?X
is connected to "Female", ?name, "J L Outlaw" and ?mbox via foaf:gender,
foaf:name, foaf:friendOf and foaf:mbox edges.]</p>
      <p>With a graph representation in mind, one can define a Characteristic Set
(CS) of a vertex as the set of all edges connected to it.</p>
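      <p>To make the definition concrete, the following Python sketch (triples and names are illustrative) extracts the characteristic set of every subject vertex and groups the subjects that share one:</p>
      <p>
```python
from collections import defaultdict

def characteristic_sets(triples):
    """Group subjects by their characteristic set: the set of predicates
    (outgoing edges) attached to each subject vertex."""
    props = defaultdict(set)
    for s, p, _ in triples:
        props[s].add(p)
    # Invert: each distinct predicate set maps to the subjects sharing it.
    by_cs = defaultdict(list)
    for s, ps in props.items():
        by_cs[frozenset(ps)].append(s)
    return by_cs
```
      </p>
      <p>Each group of subjects sharing a characteristic set is then a natural candidate for one relational table whose columns are the predicates of that set.</p>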
      <p>Designed initially to estimate the cardinality of queries with joins [25],
Characteristic Sets can be used as indexes [23] or as a base unit for a relational
schema [24, 29, 30]. A simple CS appears to be insufficient for both of these
use cases, therefore complex CS-based structures emerge. In [23], Extended CSs are
defined and used as an index structure, reflecting the complex inherent
dependencies in RDF data. A Schema [29, 30] of an RDF dataset is designed to be
its human-readable relational form. A single table in the Schema is composed
of several CSs of the same subject, merged together. A similar idea is suggested
in [24], but this solution uses a different merging strategy and does not necessarily
produce a human-readable relational schema.</p>
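      <p>As an illustration of the general merging idea (a sketch only, not the exact algorithm of [24], [29] or [30]; the Jaccard threshold is a hypothetical choice), similar CSs can be greedily merged into tables whose columns are the union of their predicates:</p>
      <p>
```python
def merge_similar(cs_list, threshold=0.5):
    """Greedily merge characteristic sets whose Jaccard similarity reaches a
    threshold; each merged group becomes one relational table whose columns are
    the union of the predicates (values absent in a row become NULLs)."""
    tables = []  # each table is represented by its set of predicate columns
    for cs in cs_list:
        for table in tables:
            jaccard = len(cs & table) / len(cs | table)
            if jaccard >= threshold:
                table |= cs   # widen the existing table with the new columns
                break
        else:
            tables.append(set(cs))
    return tables
```
      </p>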
      <p>All of the above-mentioned CS-based solutions are centralized systems that
do not employ any advanced partitioning strategy. The utilization of some form
of relational partitioning may greatly increase the performance of a system, making
it possible to compete with native relational solutions.</p>
      <p>In this proposal we present our ongoing research into RDF physical design
methods. We also state some preliminary results we have obtained, in the form of a short
state-of-the-art survey. We have composed an evaluation plan for the research
that we intend to follow over the next three years of the PhD study. By
following this plan we hope to develop a solution comparable to other modern RDF
management systems.</p>
      <p>This proposal has the following structure: in Section 2 we briefly describe some
of the modern RDF management systems with an emphasis on their physical
design. Section 3 contains the problem statement and our considerations. In
Sections 4 and 5 we describe our research process and its key stages.</p>
      <sec id="sec-3-1">
        <title>State of the Art</title>
        <p>There is a large number of RDF stores, both centralized and distributed.
Table 1 lists some of these systems and their distinctive features. We briefly
describe them below.</p>
        <p>
          Centralized systems are meant to be run on a single computer and largely
do not support any data partitioning. Jena [
          <xref ref-type="bibr" rid="ref6">5</xref>
          ] implements a basic triple store
and a built-in reasoner. The system stores triples "as is" in property tables.
Hexastore [38] uses "sextuple indexing": it stores a separate table for each
triple permutation. This storage scheme is a modification of Vertical
Partitioning [
          <xref ref-type="bibr" rid="ref2">1</xref>
          ]. Virtuoso [
          <xref ref-type="bibr" rid="ref11">10</xref>
          ] is able to store data in rows as well as in columns. It
uses bitmap indexes and SPOG permutation indexes. DB2RDF [
          <xref ref-type="bibr" rid="ref5">4</xref>
          ] is built upon IBM
DB2 and utilizes subject and object indexes. It stores the data in tables where,
for each subject, there are multiple columns of corresponding predicates and
objects. RDF-3X [26] is a RISC-style RDF store that employs all six SPO
permutations as indexes. Virtuoso-Emergent [30], MonetDB/RDF [29]
and raxonDB [23] use a novel relational schema generation technique based
on characteristic sets. Both Virtuoso-Emergent and MonetDB/RDF share the
same algorithm for constructing the emergent relational schema, while raxonDB
employs a different, but similar, approach. During a preprocessing phase, the
major portion of the RDF data is distributed into a number of relational tables, and
an SPO representation is used to store the rest. Each table consists of several
similar characteristic sets merged together.
        </p>
        <p>
          Distributed/standalone systems implement their own distribution
algorithms and are often an extension of some centralized system. TriAD [
          <xref ref-type="bibr" rid="ref15">14</xref>
          ],
gStoreD [28] and WARP [19] use the METIS graph partitioner. AdPart [
          <xref ref-type="bibr" rid="ref17">16</xref>
          ]
presents a complex adaptive partitioning algorithm. The system is able to
redistribute "hot" data to the least busy nodes. While it requires a considerable
amount of preparation, the resulting query execution time is one of the best
compared to other RDF stores [
          <xref ref-type="bibr" rid="ref4">3</xref>
          ]. YARS2 [
          <xref ref-type="bibr" rid="ref19">18</xref>
          ] was one of the first distributed
RDF management systems. It hash-partitions the data and stores six permutations
of SPO triples for indexing purposes.
        </p>
        <p>
          Distributed/federated systems are essentially a number of centralized
systems, tied together by a custom mediator. DREAM [
          <xref ref-type="bibr" rid="ref16">15</xref>
          ] stores a replica of
the whole dataset at each node, thus allowing the system to execute any query
at any available node. Partout [
          <xref ref-type="bibr" rid="ref12">11</xref>
          ] implements a workload-aware horizontal
partitioning. Both of the systems run an instance of RDF-3X at each node and
utilize its indexing capabilities.
        </p>
        <p>
          Distributed/Hadoop systems use the Hadoop framework for distributed query
processing. The main idea is to partition the data across multiple nodes and then
execute queries in a MapReduce manner: split the initial query into subqueries
and then merge the local results together. CliqueSquare [
          <xref ref-type="bibr" rid="ref13">12</xref>
          ] uses a
composition of range partitioning and S-, P-, O- VP-like partitioning. H-RDF-3X [20]
uses METIS and n-hop guarantees to define the fragments. H2RDF+ [27]
utilizes HBase partitioning capabilities. HadoopRDF [
          <xref ref-type="bibr" rid="ref10">9</xref>
          ] is based on Sesame
and uses Vertical Partitioning to split the data into fragments. PigSPARQL [34],
S2RDF [36], S2X [33] and Sempala [35] were developed by a group from
Freiburg University. PigSPARQL translates queries into Apache Pig Latin and
hash-partitions the data by means of Apache Pig. S2RDF and S2X are
based upon the Spark framework; the first system implements Extended Vertical
Partitioning, and the second is built on top of GraphX and uses its
partitioning algorithms. The Sempala system runs an instance of Impala at each node and
employs Vertical Partitioning. RAPID+ [31] uses the Apache Pig infrastructure to
store and partition the data. Sedge [39] is a Pregel-based graph solution that
utilizes METIS partitioning in conjunction with LSH. SHAPE [22] introduces
Semantic Hash Partitioning: the data is hash-partitioned in two stages and after
that each fragment is extended by n-hop replication. SHARD [32] splits the
data into fragments with hash-partitioning provided by the Hadoop framework.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Problem Statement</title>
        <p>In Section 2 we described some of the modern RDF management systems.
Among them there are only two that use characteristic sets to efficiently process
star-pattern queries. However, neither of them uses any kind of data partitioning,
and both are centralized systems. At the same time, current studies show that
storing an RDF graph using a relational schema representation is competitive with the
native RDF approach and is thus of research interest.</p>
        <p>
          Our primary research question is the following: is it possible to devise and
implement an efficient horizontal partitioning strategy for a CS-based system, in both
the centralized and the distributed case? Currently, the CS approach is in its infancy:
research is primarily concentrated on generating relational schemas [24, 29, 30].
Physical design issues are yet to be addressed. At the same time, there are many
physical design options for classic relational systems [
          <xref ref-type="bibr" rid="ref7">6</xref>
          ] which can be reused
for CS tables.
        </p>
        <p>One such option is horizontal data partitioning. Several types are
distinguished: (i) range partitioning; (ii) hash partitioning;
(iii) list partitioning; and (iv) column partitioning. The first two partition the
data according to the values of the partitioning attribute or its hash, respectively. The
third implies that each fragment is defined by a list of values that the partitioning
attribute may take. In the last, a virtual attribute is added to the relation, which
is then partitioned by any of the above methods using the virtual attribute as the
partitioning attribute.</p>
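        <p>The first three options can be sketched as fragment-assignment functions (fragment counts, boundaries and value lists below are illustrative):</p>
        <p>
```python
import hashlib

def hash_partition(value, n_fragments):
    """Hash partitioning: fragment id from a stable hash of the attribute value."""
    digest = hashlib.sha1(str(value).encode()).hexdigest()
    return int(digest, 16) % n_fragments

def range_partition(value, boundaries):
    """Range partitioning: fragment id is the index of the first upper
    boundary that the value does not exceed (boundaries must be sorted)."""
    for i, bound in enumerate(boundaries):
        if value <= bound:
            return i
    return len(boundaries)  # overflow fragment for values above all boundaries

def list_partition(value, value_lists):
    """List partitioning: each fragment is defined by an explicit value list."""
    for i, values in enumerate(value_lists):
        if value in values:
            return i
    raise ValueError(f"no fragment lists value {value!r}")
```
      </p>
        <p>During query execution, the same functions enable partition pruning: a predicate on the partitioning attribute identifies the fragments that can possibly match, and only those are scanned.</p>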
        <p>Our hypothesis is the following: some tables containing one or several merged
CSs will be queried more frequently than others. In this case they can be
partitioned in order to provide (i) data pruning during query execution, i.e.
not all fragments would be scanned; and (ii) data distribution with fragment-based
data replication. We presume that attribute-based composite multi-attribute
partitioning would be of use in this case.</p>
        <p>Workload-awareness is another dimension that has not been addressed for the CS
approach. A special partitioning strategy for CS tables is only the first step. In
order to reap maximum benefits from CSs, this strategy should be automatically
or semi-automatically applied to the data. Therefore, as the next step we aim
to create an advisor that would accept a prospective workload and recommend
a number of beneficial configurations consisting of the aforementioned
partitions. CS tables differ not only in access frequency, but also in the number of
rows, the number and types of attributes, and so on. Thus, we would have
to create a cost-based model to use inside our advisor.</p>
        <p>[Table 1. Surveyed RDF management systems and their distinctive features:
architecture (centralized, distributed/standalone, distributed/federated,
distributed/Hadoop), storage layout, indexing and partitioning strategy.]</p>
        <p>
          Another direction of our study is the application of column-stores to
CS-based RDF processing. A column-store [
          <xref ref-type="bibr" rid="ref8 ref9">7, 8, 21, 37</xref>
          ] is a DBMS that stores
each attribute separately. This approach offers a number of unique query
processing techniques, such as late materialization and compression. Overall, these
techniques [
          <xref ref-type="bibr" rid="ref1 ref3">2</xref>
          ] allow column-store systems to achieve excellent performance on
read-only workloads. Column-stores are a promising venue for building RDF
processing systems; for example, RLE compression offers a solution for the NULL
problem [24]. As Table 1 demonstrates, currently only a few systems employ
the column-store approach.
        </p>
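        <p>As a minimal illustration of why run-length encoding helps with the NULL problem of sparse CS tables (a sketch, not any particular system's implementation): long runs of NULLs in a rarely-populated column collapse to a single (value, length) pair.</p>
        <p>
```python
def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs. Long runs of
    NULLs (None) in sparse CS tables collapse to a single pair."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((value, 1))              # start a new run
    return runs

def rle_decode(runs):
    """Inverse transformation: expand (value, run_length) pairs back to rows."""
    return [v for v, n in runs for _ in range(n)]
```
      </p>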
        <p>Thus, our prospective step is to try to apply our partitioning in a centralized
column-store system. In case of success, we will explore the applicability of
our partitioning in a distributed environment.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Research Method</title>
        <p>Step 1: test our main hypothesis: whether CS tables are unevenly "heated" by
a workload or not. Run an experimental evaluation.</p>
        <p>Input: some CS-based system (either row- or column-store) and a benchmark.
Output: heat patterns.</p>
        <p>Step 2: outline a class of beneficial partitioning schemes. Using this
information, devise a partitioning method or methods. Then, evaluate these methods by
manually applying them to the same data in order to verify the performance
improvement.</p>
        <p>Input: heat patterns.</p>
        <p>Output: partitioning methods and evaluation results.</p>
        <p>Step 3: implement an automatic advisor for our partitioning methods. We need
to develop a cost model and an enumeration algorithm to select an appropriate
partitioning strategy.</p>
        <p>Input: partitioning methods, evaluation results.</p>
        <p>Output: cost model, automatic advisor.</p>
        <p>
          Step 4: evaluate our partitioning for a centralized column-store system. In case
of success, move to a distributed column-store system to evaluate distributed
processing and to try replication. PosDB [
          <xref ref-type="bibr" rid="ref8 ref9">7, 8</xref>
          ] is a good candidate for this step.
Input: cost model, automatic advisor.
        </p>
        <p>Output: implementation in a column-store system.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Evaluation Plan and Preliminary Results</title>
        <p>Given the problem and the research method, we can compose the following plan:
1. Select a suitable RDF management system that supports characteristic sets.
Modify its source code to gather information regarding CS table usage: the
relative "heat" of each relation, the number of scanned rows, the number of rows that pass
through filtering predicates and so on.
2. Perform an evaluation on some standard benchmark data (currently we aim for
LUBM [
          <xref ref-type="bibr" rid="ref14">13</xref>
          ]). Study the results and outline beneficial partitioning schemes.
3. Consider these schemes and propose partitioning methods that lead to such
schemes. Try these schemes on the original benchmark data to ensure a
performance improvement over the unpartitioned case. At this step we would have
to modify the processing engine to support value-based partition pruning. Also,
perform an additional round of validation using several other datasets.
4. Design an advisor that recommends an application of the developed
partitioning methods. Compare its output (partitioned configurations) with the
unpartitioned case and some naive approaches.
5. Finally, assess the performance of our partitioning methods using a suitable
column-store system in the centralized and distributed cases.</p>
        <p>So far we have performed a survey of RDF processing systems and
formulated a general idea of our approach. Using this survey we have demonstrated
its novelty and discussed its prospects. We have also selected systems that are
candidates to be used during the first steps of our study.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Summary and Future Work</title>
        <p>Designing systems for processing RDF data is a highly active area of research.
Both academia and industry are interested in the development of an efficient RDF
query engine. There are more than thirty systems built on different principles
and exploiting various ideas.</p>
        <p>Characteristic sets are a promising approach to speeding up the processing of RDF
queries that involve the selection of entities satisfying some predicates. The core
idea is to detect star patterns in the graph data and translate them into dedicated
relational tables. Then, these tables can be combined with each other.</p>
        <p>In this paper we have outlined a plan for our study, which aims to devise
a partitioning method for characteristic sets. We have presented a comparison
table of existing approaches and briefly discussed them. Then, we described the
justification of our approach and presented possible directions for our project.</p>
        <p>In the future we plan to continue our studies by executing the steps outlined
in Section 4.</p>
        <p>[19] Hose, K., Schenkel, R.: WARP: Workload-Aware Replication and
Partitioning for RDF. In: ICDEW'13. pp. 1–6</p>
        <p>[20] Huang, J., Abadi, D.J., Ren, K.: Scalable SPARQL Querying of Large RDF
Graphs. PVLDB 4, 1123–1134 (2011)</p>
        <p>[21] Idreos, S., et al.: MonetDB: Two Decades of Research in Column-oriented
Database Architectures. IEEE Data Eng. Bull. 35(1), 40–45 (2012)</p>
        <p>[22] Lee, K., Liu, L.: Scaling Queries over Big RDF Graphs with Semantic Hash
Partitioning. Proc. VLDB Endow. 6(14), 1894–1905 (Sep 2013)</p>
        <p>[23] Meimaris, M., Papastefanatos, G., Mamoulis, N., Anagnostopoulos, I.:
Extended Characteristic Sets: Graph Indexing for SPARQL Query
Optimization. In: ICDE'17. pp. 497–508</p>
        <p>[24] Meimaris, M., Papastefanatos, G.: Hierarchical Characteristic Set Merging,
https://2018.eswc-conferences.org/wpcontent/uploads/2018/02/ESWC2018 paper 126.pdf</p>
        <p>[25] Neumann, T., Moerkotte, G.: Characteristic Sets: Accurate Cardinality
Estimation for RDF Queries with Multiple Joins. In: ICDE'11. pp. 984–994</p>
        <p>[26] Neumann, T., Weikum, G.: The RDF-3X Engine for Scalable Management
of RDF Data. The VLDB Journal 19(1), 91–113 (Feb 2010)</p>
        <p>[27] Papailiou, N., et al.: H2RDF+: High-performance Distributed Joins over
Large-scale RDF Graphs. In: ICBD'13. pp. 255–263</p>
        <p>[28] Peng, P., et al.: Processing SPARQL Queries over Distributed RDF Graphs.
The VLDB Journal 25(2), 243–268 (Apr 2016)</p>
        <p>[29] Pham, M., Boncz, P.A.: Exploiting Emergent Schemas to Make RDF
Systems More Efficient. In: ISWC'16. pp. 463–479</p>
        <p>[30] Pham, M., Passing, L., Erling, O., Boncz, P.: Deriving an Emergent
Relational Schema from RDF Data. In: WWW'15. pp. 864–874</p>
        <p>[31] Ravindra, P., Deshpande, V.V., Anyanwu, K.: Towards Scalable RDF
Graph Analytics on MapReduce. In: MDAC'10. pp. 5:1–5:6</p>
        <p>[32] Rohloff, K., Schantz, R.E.: Clause-iteration with MapReduce to Scalably
Query Datagraphs in the SHARD Graph-store. In: DIDC'11. pp. 35–44</p>
        <p>[33] Schätzle, A., Przyjaciel-Zablocki, M., Berberich, T., Lausen, G.: S2X:
Graph-Parallel Querying of RDF with GraphX. In: Biomedical Data
Management and Graph Online Querying. pp. 155–168</p>
        <p>[34] Schätzle, A., Przyjaciel-Zablocki, M., Lausen, G.: PigSPARQL: Mapping
SPARQL to Pig Latin. In: SWIM'11. pp. 4:1–4:8</p>
        <p>[35] Schätzle, A., Przyjaciel-Zablocki, M., Neu, A., Lausen, G.: Sempala:
Interactive SPARQL Query Processing on Hadoop. In: ISWC'14. pp. 164–179</p>
        <p>[36] Schätzle, A., Przyjaciel-Zablocki, M., Skilevic, S., Lausen, G.: S2RDF: RDF
Querying with SPARQL on Spark. PVLDB 9(10), 804–815 (2016)</p>
        <p>[37] Stonebraker, M., et al.: C-store: A Column-oriented DBMS. In: VLDB'05.
pp. 553–564</p>
        <p>[38] Weiss, C., Karras, P., Bernstein, A.: Hexastore: Sextuple Indexing for
Semantic Web Data Management. PVLDB 1(1), 1008–1019 (Aug 2008)</p>
        <p>[39] Yang, S., Yan, X., Zong, B., Khan, A.: Towards Effective Partition
Management for Large Graphs. In: SIGMOD'12. pp. 517–528</p>
        <p>[40] Zeng, K., et al.: A Distributed Graph Engine for Web Scale RDF Data. In:
PVLDB'13. pp. 265–276</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation />
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abadi</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          , et al.:
          <article-title>Scalable Semantic Web Data Management Using Vertical Partitioning</article-title>
          .
          <source>In: VLDB'07</source>
          . pp.
          <volume>411</volume>
          –
          <fpage>422</fpage>
          . VLDB Endowment
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Abadi</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          , et al.:
          <article-title>The Design and Implementation of Modern ColumnOriented Database Systems</article-title>
          . Now Publishers Inc. (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Abdelaziz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , et al.:
          <article-title>A Survey and Experimental Comparison of Distributed SPARQL Engines for Very Large RDF Data</article-title>
          .
          <source>PVLDB</source>
          <volume>10</volume>
          (
          <issue>13</issue>
          ),
          <year>2049</year>
          –
          <year>2060</year>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Bornea</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          , et al.:
          <article-title>Building an Efficient RDF Store over a Relational Database</article-title>
          .
          <source>In: SIGMOD'13</source>
          . pp.
          <volume>121</volume>
          –
          <fpage>132</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          , et al.:
          <article-title>Jena: Implementing the Semantic Web Recommendations</article-title>
          .
          <source>In: WWW Alt.'04</source>
          . pp.
          <volume>74</volume>
          –
          <fpage>83</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Chernishev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>A Survey of DBMS Physical Design Approaches</article-title>
          .
          <source>Tr. St. Petersburg Inst. Infor. Avtom. Ross. Akad. Nauk SPIIRAN 24</source>
          ,
          <issue>222</issue>
          –276, http://www.proceedings.spiiras.nw.ru/ojs/index.php/sp/index
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Chernishev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          , et al.:
          <article-title>PosDB: An Architecture Overview</article-title>
          .
          <source>Programming and Computer Software</source>
          <volume>44</volume>
          (
          <issue>1</issue>
          ),
          <volume>62</volume>
          –74 (Jan
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Chernishev</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.:
          <article-title>PosDB: A Distributed Column-store Engine</article-title>
          .
          <source>In: Perspectives of System Informatics</source>
          . pp.
          <volume>88</volume>
          –
          <issue>94</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ni</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>HadoopRDF: A Scalable Semantic Data Analytical Engine</article-title>
          .
          <source>In: Intelligent Computing Theories and Applications</source>
          . pp.
          <volume>633</volume>
          –
          <issue>641</issue>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Erling</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikhailov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Virtuoso: RDF Support in a Native RDBMS</article-title>
          , pp.
          <volume>501</volume>
          –
          <issue>519</issue>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Galarraga</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hose</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schenkel</surname>
          </string-name>
          , R.:
          <article-title>Partout: A Distributed Engine for Efficient RDF Processing</article-title>
          .
          <source>In: WWW'14</source>
          . pp.
          <volume>267</volume>
          –
          <fpage>268</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Goasdoué</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , et al.:
          <article-title>CliqueSquare: Flat Plans for Massively Parallel RDF Queries</article-title>
          .
          <source>In: ICDE'15</source>
          . pp.
          <volume>771</volume>
          –
          <fpage>782</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , Heflin, J.:
          <article-title>LUBM: A Benchmark for OWL Knowledge Base Systems</article-title>
          .
          <source>Web Semant</source>
          .
          <volume>3</volume>
          (
          <issue>2-3</issue>
          ),
          <volume>158</volume>
          {182 (Oct
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Gurajada</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seufert</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miliaraki</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Theobald</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>TriAD: A Distributed Shared-nothing RDF Engine Based on Asynchronous Message Passing</article-title>
          . In: SIGMOD'14. pp.
          <volume>289</volume>
          –
          <fpage>300</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Hammoud</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>DREAM: Distributed RDF Engine with Adaptive Query Planner and Minimal Communication</article-title>
          .
          <source>PVLDB 8</source>
          ,
          <issue>654</issue>
          –
          <fpage>665</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Harbi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.:
          <article-title>Accelerating SPARQL Queries by Exploiting Hash-based Locality and Adaptive Partitioning</article-title>
          .
          <source>The VLDB Journal</source>
          <volume>25</volume>
          (
          <issue>3</issue>
          ),
          <volume>355</volume>
          {380 (Jun
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shadbolt</surname>
          </string-name>
          , N.:
          <article-title>4store: The Design and Implementation of a Clustered RDF Store</article-title>
          .
          <source>In: SSWS'09</source>
          . pp.
          <volume>94</volume>
          –
          <fpage>109</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Harth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umbrich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>YARS2: A Federated Repository for Querying Graph Structured Data from the Web</article-title>
          .
          <source>In: SW'17</source>
          . pp.
          <volume>211</volume>
          –
          <fpage>224</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>