<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A CONSTRUCT-based Query for Weighted RDF Graph Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xin Song</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shizhan Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaowang Zhang?</string-name>
          <email>xiaowangzhang@tju.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhiyong Feng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Intelligence and Computing, Tianjin University</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country>China Tianjin</country>
          <institution>Key Laboratory of Cognitive Computing and Application</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, based on SPARQL CONSTRUCT query, we present a new query form called SPARQL Analyze Query, which integrates graph analytics with graph querying. SPARQL Analyze Query extends SPARQL by performing graph algorithms via WITH and PRODUCE operators, where WITH is used for invoking a graph algorithm and PRODUCE for designating results of variables. SPARQL Analyze Query supports two classes of graph algorithms over the weighted RDF graph w.r.t. nodes and shortest paths. Besides, we illustrate the feasibility of the SPARQL Analyze Query in expressing PageRank.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        (G; !) where ! denotes the weight function, and shortest path introduced
in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Especially, we only consider the weight function on edges ! : E ! R 0.
      </p>
      <p>Inspired by Neo4j, SPARQL Analyze Query supports two classes of graph
algorithms w.r.t. nodes and shortest paths, which are PageRank (PR),
Betweenness Centrality (BC), Closeness Centrality (CC), Shortest Path (SP), Single
Source Shortest Path (SSSP) and All Pair Shortest Path (ALSP).
2</p>
    </sec>
    <sec id="sec-2">
      <title>Motivating Example</title>
      <p>
        We provide an example of expressing PageRank over a social network to
directly illustrate how SPARQL Analyze Query performs. For clarity, the following
example is based on our semantics introduced in Section 4 while the SPARQL
CONSTRUCT query introduced in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] cannot output a weighted RDF graph.
      </p>
      <p>The following CONSTRUCT query Qc generates a social network which
consists of all people that ?p1 knows. The generated weighted RDF graph is shown
in Figure 2 where A, B, : : :, H are nicknames of each person respectively.</p>
      <sec id="sec-2-1">
        <title>CONSTRUCT f ?p1 knows ?p2.g</title>
        <p>WHERE f
?m knows ?n. ?m nickName ?p1.</p>
        <p>?n nickName ?p2.
g</p>
        <p>Based on Figure 1, we provide a SPARQL Analyze Query with PageRank
algorithm in Figure 3 that reveals the person whose nickname is E has the
greatest interpersonal in uence in the social network.</p>
        <p>A CONSTRUCT-based Query for Weighted RDF Graph Analytics</p>
      </sec>
      <sec id="sec-2-2">
        <title>Qc WITH PR(f?p1,?p2g,fknowsg,10,0.85)</title>
        <p>PRODUCE ?node ?pr
ORDER BY DESC (?pr)
LIMIT 3
?node ?pr</p>
        <p>E 0.776535
G 0.604325
H 0.579038
Given a CONSTRUCT query Qc, a SPARQL Analyze query QA is de ned as
follows:</p>
        <p>Qc WITH C PRODUCE v sq [ORDER BY] [LIMIT]
where
{ Qc is a SPARQL CONSTRUCT query;
{ C is an analyzing clause of one of the following six forms:</p>
        <p>Given a weighted RDF graph S, let NH and EH be sets of nodes and edges
occurring in H respectively, where nodes and edges can be variables or
constants.</p>
        <p>PR : PR (NH ; EH ; n; dF ), where n 2 N+ is the number of iterations in
PageRank and dF 2 (0; 1) is the damping factor;
BC : BC (NH ; EH );
CC : CC (NH ; EH );
SP : SP (NH ; EH ; u; v) where u; v 2 I [ L;
SSSP : SSSP (NH ; EH ; u) where u 2 I [ L;</p>
        <p>ALSP : ALSP (NH ; EH );
{ PRODUCE v sq designates the output of a SPARQL Analyze Query of
variables, where v sq is de ned as a sequence of two variables (v1; v2), v1; v2 2
V , and the parentheses () denote an ordered sequence of variables;
{ [ORDER BY] and [LIMIT] are standard SPARQL fragments. They are
optional to use for controlling the order and the number of solutions.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Semantics of SPARQL Analyze Query</title>
      <p>
        To be compatible with weighted RDF graphs, we extend the semantics
introduced in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] by means of a weight w 2 R 0. Given a weighted RDF graph S, the
evaluation of a SPARQL graph pattern P over S is denoted by JP KS.
      </p>
      <p>We introduce some new terminology for SPARQL Analyze Query. Let VN 2
V and VE 2 V be the sets of disjoint node and edge variables respectively.
Given a node u, a shortest path , and let A be the set of real numbers. We
have two mappings : VN ! N , : VE ! E and two functions f : u ! a
and g : ! a, where f 2 F , g 2 G and a 2 A. F = fP R; BC; CCg and
G = fSP; SSSP; ALSP g are two function sets where each function represents
the computation of a graph algorithm. For example, the betweenness centrality
of u is obtained by BC(u).</p>
      <p>For a SPARQL Analyze Query QA, we say that Q = Qc WITH C is the
analytical part of QA. Given a graph pattern P , a weighted RDF graph S, a
CONSTRUCT query Qc and an analyzing clause C. The semantics of an
analytical part Q of QA returns a set of pairs , denoted by JQKS , de ned as
follows:</p>
      <p>Q S =
J K
8&gt;f(u; f (u)) j
&lt;
&gt;:f( ; g( )) j ;
2 JP KS ; f (u) 2 Ag</p>
      <p>On the above formalization and given an ordered sequence of variables v sq.
The semantics of a SPARQL Analyze Query QA, denoted by JQAKS , is de ned
as follows:
where
jv1;v2 is the restriction of</p>
      <p>JQAKS = JQ PRODUCE v sqKS = f j(v1;v2) j 2 JQKS g</p>
      <p>to (V; V ) correspondingly of its domain.
5</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper, we present SPARQL Analyze Query which integrates graph
querying with graph analytics. Our approach provides a logical foundation for
integrating querying with analytics, which makes it possible to study graph querying
and analytics in a uni ed way theoretically. In future work, we will study some
fundamental problems of SPARQL Analyze Query such as expressivity and
complexity as well as its generalization of more graph algorithms.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work is supported by the National Key Research and Development
Program of China (2017YFC0908401) and the National Natural Science Foundation
of China (61972455, 61672377). Xiaowang Zhang is supported by the Peiyang
Young Scholars in Tianjin University (2019XRX-0032).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Francis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Green</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guagliardo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Libkin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , et al.:
          <article-title>Cypher: An evolving query language for property graphs</article-title>
          .
          <source>In: Proc. of SIGMOD</source>
          , pp.
          <volume>1433</volume>
          {
          <issue>1445</issue>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ereteo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gandon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bu</surname>
            <given-names>a</given-names>
          </string-name>
          , M.:
          <article-title>Semantic social network analysis</article-title>
          .
          <source>In: Proc. of WebSci</source>
          , pp.
          <volume>1</volume>
          {
          <issue>5</issue>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Techentin</surname>
            ,
            <given-names>R.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilbert</surname>
            ,
            <given-names>B.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lugowski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deweese</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et al.:
          <article-title>Implementing iterative algorithms with SPARQL</article-title>
          .
          <source>In: Proc. of ICDT</source>
          , pp.
          <volume>216</volume>
          {
          <issue>223</issue>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Mizell</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maschho</surname>
            <given-names>K.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reinhardt</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          :
          <article-title>Extending SPARQL with graph functions</article-title>
          .
          <source>In: Proc. of BigData</source>
          , pp.
          <volume>46</volume>
          {
          <issue>53</issue>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ugarte</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>CONSTRUCT queries in SPARQL</article-title>
          .
          <source>In: Proc. of ICDT</source>
          , pp.
          <volume>212</volume>
          {
          <issue>229</issue>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Tartari</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>WiSP: Weighted shortest paths for RDF graphs</article-title>
          .
          <source>In: Proc. of ISWC</source>
          , pp.
          <volume>37</volume>
          {
          <issue>52</issue>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>