<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Memory-E cient Answering of Tree-Shaped SPARQL Queries Using GPUs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jedrzej Potoniec</string-name>
          <email>jpotoniec@cs.put.poznan.pl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michal Madziar</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Poznan University of Technology</institution>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <addr-line>nSense</addr-line>
          <country country="PL">Polska</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>5</lpage>
      <abstract>
        <p>We present an idea of e cient query answering over an RDF dataset employing a consumer-grade graphic card for an e cient computation. We consider tree-shaped SPARQL queries and static datasets, to facilitate data mining over RDF graphs in warehouse-like setups. Reasons to see the poster: a) presentation of the approach with examples; b) possibility of discussion about the implementation details.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        Di erent aspects of processing semantic data on a graphics processing units
(GPUs) have been considered over a past few years. In the work [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the authors
sketch a solution combining GPUs and CPUs to perform a fast aggregation and
a basic reasoning over RDF streams. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] a problem of an e cient parallel
RDFS rule-based reasoning is tackled, especially with consideration for
duplicated triples caused by multiple ways of inferring the same result. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] there
are considered methods for RDF indexing based on B+ trees and optimized for
NVIDIA CUDA platform3. Authors of [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] analyzed parallel join algorithms,
which are used to combine multiple SPARQL triple patterns into a solution for
a query. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] employs the Rete Match algorithm to perform parallel rule-based
reasoning over RDFS knowledge bases.
      </p>
      <p>There are numerous commercial-grade RDF stores available on the market,
such as GraphDB 4, Virtuoso5 or Allegro Graph6, to just name a few. Yet, we
are not aware of any such a store employing computational power of GPUs.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Approach</title>
      <p>In this section we present a detailed description of our work. For the
interested readers, the implementation is made available in the GitHub repository
https://github.com/jpotoniec/MagicStore.
3.1</p>
      <p>
        Loading phase
As data are loaded once and queried multiple times, we prefer to compute as
much as possible during the loading. This phase consists of two steps: building
dictionaries, when we apply Hu man coding [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to URIs to t as many triples as
possible into limited amount of graphics card memory; and building the index,
when the triples are sorted and packed into tight memory structures to allow for
an e cient search.
      </p>
      <p>In the step of building dictionaries, the number of occurrences of every URI
in the data is counted. To obtain shorter codes, the URIs occurring as predicates
are counted separately from those occurring as subjects or objects. The results
are then used to build dictionaries of Hu man codes. As this step employs only
sequential reading and counting, it does not require a lot of memory.</p>
      <p>In the step of building the index, the data are read again, and now every
triple is coded as a triple of Hu man codes. The triples are then packed into
a three-level index structure, as depicted in the Figure 1. The upper two levels
consist of consecutive cells, each containing a Hu man code and a pointer to the
next level, and cells in the bottom level contains only Hu man codes. Because
of the characteristics of the queries and the employed search algorithm, the rst
level contains predicates, the second objects and the third subjects.</p>
      <p>As the codes vary in length, so do the cells in the index. In every cell, the
rst two bits of the rst byte indicate length of the code in full bytes. Such an
approach wastes small amount of memory, because the codes rarely occupy full
bytes. We tested a scenario, where the pointer starts in the next bit after the
3 http://www.nvidia.com/object/cuda_home_new.html
4 http://ontotext.com/products/ontotext-graphdb/
5 http://virtuoso.openlinksw.com/
6 http://franz.com/agraph/allegrograph/
code, but it rendered out to be ine cient due to an additional overhead of bit
shifting on GPUs.</p>
      <p>To limit memory usage, triples are loaded in batches. During reading, triples
are coded and stored in a vector. After the vector exceeds certain size, it is
sorted, transformed into an index and merged with the index built so far.
We employed a fairly simple, bottom-up query answering algorithm, based on
iterating over the index built in the previous section. First, labels in a tree
corresponding to a posed query are coded and then the Algorithm 1 is called for
the root of the tree. Finally, returned values are decoded back to URIs.</p>
      <p>
        Parallel for in the line 12 is realized as multiple calls of an OpenCL kernel [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
that looks for a particular predicate and object in the index and returns range of
addresses in the third level of the index. This way only the rst two levels of the
index are required to be available to the GPU, what takes into account rather
limited amount of memory available to the GPU. Merge in the line 14 is also
implemented in parallel using merge part from the bitonic sort algorithm [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and future work</title>
      <p>In this paper, we presented a system for answering tree-shaped SPARQL queries
over a static RDF dataset, a setup useful for semantic data mining. We believe
that the presented system can be put to use as a local RDF store backing
semantic data mining or decision support operations. We also think that this approach
could be combined with a typical RDF store to provide a hybrid system, that
uses GPU computations whenever possible.</p>
      <p>In the future, we would like to compare it to normal RDF stores to ensure that
this specialized solution really performs better. We also plan to extend supported
SPARQL subset to literals and basic FILTER expressions (e.g. comparisons with
a constant value).</p>
      <p>Acknowledgement. Jedrzej Potoniec acknowledges support of Polish National
Science Center, grant DEC-2013/11/N/ST6/03065.</p>
      <p>1 Function Answer(n)
2 p a predicate labelling an arc leading to n
3 if n has no children then
4 if label of n is an URI then
5 return subjects occurring with a predicate p and an object labelling n
6 else
7 return subjects occurring with a predicate p and any object
8 else
9
10
11
12
13
14</p>
      <p>O = Tc2children of n Answer(c)
if O = ; then return ;
S = ;
for o 2 O do in parallel</p>
      <p>S S [ fsubjects occurring with a predicate p and an object og
return merged sets from S</p>
      <p>Algorithm 1: The basic query answering algorithm.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cormen</surname>
          </string-name>
          , T.H.:
          <article-title>Introduction to algorithms</article-title>
          . MIT press (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ogbuji</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>SPARQL 1.1 entailment regimes</article-title>
          .
          <source>W3C recommendation</source>
          ,
          <source>W3C (Mar</source>
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Heino</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.Z.</given-names>
          </string-name>
          :
          <article-title>RDFS Reasoning on Massively Parallel Hardware</article-title>
          . In: CudrMauroux,
          <string-name>
            <surname>P.</surname>
          </string-name>
          , He in, J., et al. (eds.)
          <source>The Semantic Web { ISWC 2012, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7649</volume>
          , pp.
          <volume>133</volume>
          {
          <fpage>148</fpage>
          . Springer Berlin Heidelberg
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hu</surname>
            <given-names>man</given-names>
          </string-name>
          , D.:
          <article-title>A Method for the Construction of Minimum-Redundancy Codes</article-title>
          .
          <source>Proceedings of the IRE</source>
          <volume>40</volume>
          (
          <issue>9</issue>
          ),
          <volume>1098</volume>
          {1101 (Sept
          <year>1952</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lawrynowicz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potoniec</surname>
          </string-name>
          , J.:
          <article-title>Pattern based feature construction in semantic data mining</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst</source>
          .
          <volume>10</volume>
          (
          <issue>1</issue>
          ),
          <volume>27</volume>
          {
          <fpage>65</fpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Makni</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Optimizing RDF Stores by Coupling</surname>
          </string-name>
          General-purpose
          <source>Graphics Processing Units and Central Processing Units</source>
          . In: Aroyo,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Noy</surname>
          </string-name>
          ,
          <string-name>
            <surname>N.F</surname>
          </string-name>
          . (eds.)
          <source>Proc. of the Doctoral Consortium co-located with 12th Int. Semantic Web Conf. (ISWC</source>
          <year>2013</year>
          ).
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>1045</volume>
          , pp.
          <volume>40</volume>
          {
          <issue>47</issue>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Munshi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The OpenCL Speci cation</article-title>
          .
          <source>Version 1.2. Tech. rep., The Khronos Group (Nov</source>
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brink</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , et al.:
          <article-title>Rule-based Reasoning on Massively Parallel Hardware</article-title>
          . In: Liebig,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Fokoue</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
          <source>Proc. of the 9th Int. Workshop on Scalable Semantic Web Knowledge Base Systems</source>
          , Sydney, Australia, October
          <volume>21</volume>
          ,
          <year>2013</year>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>1046</volume>
          , pp.
          <volume>33</volume>
          {
          <fpage>49</fpage>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Prud'hommeaux</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SPARQL query language for RDF. W3C recommendation</article-title>
          ,
          <source>W3C (Jan</source>
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Sankar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>An E cient and Scalable RDF Indexing Strategy based on B-Hashed-Bitmap Algorithm using CUDA</article-title>
          .
          <source>Int. J. of Computer Applications</source>
          <volume>104</volume>
          (
          <issue>7</issue>
          ),
          <volume>31</volume>
          {38 (
          <year>October 2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Senn</surname>
          </string-name>
          , J.:
          <article-title>Parallel Join Processing on Graphics Processors for the Resource Description Framework</article-title>
          . In: Beigl,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Cazorla-Almeida</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.J</surname>
          </string-name>
          . (eds.) ARCS '
          <fpage>10</fpage>
          - 23th
          <source>Int. Conf. on Architecture of Computing Systens</source>
          <year>2010</year>
          ,
          <source>Workshop Proceedings, February 22-23</source>
          ,
          <year>2010</year>
          , Hannover, Germany. pp.
          <volume>23</volume>
          {
          <fpage>30</fpage>
          . VDE Verlag (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>