=Paper= {{Paper |id=Vol-1644/paper12 |storemode=property |title=An RDF Vocabulary for Meta-Programming with SPARQL Algebra |pdfUrl=https://ceur-ws.org/Vol-1644/paper12.pdf |volume=Vol-1644 |authors=Miguel Ceriani,Alejandro A. Vaisman |dblpUrl=https://dblp.org/rec/conf/amw/CerianiV16 }} ==An RDF Vocabulary for Meta-Programming with SPARQL Algebra== https://ceur-ws.org/Vol-1644/paper12.pdf
           An RDF Vocabulary for Meta-Programming
                   with SPARQL Algebra

                         Miguel Ceriani1,2 and Alejandro Vaisman2
                              1
                               Sapienza, Università di Roma, Italy
                               ceriani@di.uniroma1.it
                     2
                       Instituto Tecnológico de Buenos Aires, Argentina
                 mceriani@iitba.edu.ar, avaisman@itba.edu.ar



         Abstract. SPARQL queries can be represented in RDF in order to work with
         them as first-class citizen in Semantic Web applications. The representation used
         so far, SPIN SPARQL, closely follows the abstract syntax tree, but fails to di-
         rectly capture the semantics and is hard to work with for query rewriting. Here a
         novel vocabulary based on SPARQL Algebra is presented. The adequacy of the
         language is empirically shown through a few examples.


1     Introduction
SPARQL [2] is the standard query language for the Semantic Web and is defined on
the RDF data model. It is used to access and modify remote data stores as well as local
data, as part of the logic of a Semantic Web application. SPIN SPARQL syntax [1] has
been proposed as an RDF representation of SPARQL. Such syntax allows to consis-
tently store SPARQL queries together with the domain model. SPARQL expressions
can thus be used together with RDF/OWL models to define complex constraints, rules,
data visualizations, example queries, and so on. Furthermore, this triple-based repre-
sentation opens the way to meta-programming techniques, in which this queries can be
manipulated, rewritten, composed by the same means used for RDF.
    The SPIN SPARQL syntax mimics quite closely the abstract syntax tree, provid-
ing an easy way to convert from and to the SPARQL textual syntax. Unfortunately the
structural relationship between SPARQL syntax and semantics is not so immediate. As
a fact, SPARQL semantics are defined in a two-step process. At first, a query is con-
verted from its SPARQL syntax to another structure, composed with a set of operators
that are called the SPARQL Algebra. The semantics are then operationally defined for
the operators in the SPARQL Algebra. Our thesis is that the operations that have to do
with query semantics would be simpler to express if the queries were represented by a
vocabulary related to SPARQL Algebra instead of SPARQL syntax.


2     Related Work: SPIN SPARQL Syntax
The SPIN SPARQL vocabulary 3 is designed to represent a SPARQL query as a tree
of RDF nodes. The top-level node of a query has type sp:Query, specifically is an
 3
     The namespace is http://spinrdf.org/sp# and here is abbreviated with sp:
2

instance of one of its subclasses, such as sp:Select and sp:Construct. The ele-
ments in the WHERE clause (e.g., a FILTER clause or a triple pattern) are represented
as instances of subclasses of sp:Element or rdf:List instances for grouping ele-
ment lists. Listing 1 shows an example of a SPARQL query, while listing 2 shows its
representation using SPIN SPARQL syntax.
                               Listing 1. Example of SPARQL Query
SELECT *
WHERE { OPTIONAL { ?p foaf:givenName ?name }.
         FILTER(?age > 18). ?p foaf:age ?age.
         OPTIONAL { ?p foaf:familyName ?surname }. }


      Listing 2. Query of Listing 1 represented in SPIN SPARQL syntax (some details omitted)
[ a sp:Select;
  sp:where ( [ a sp:Optional; sp:elements ( [ a sp:TriplePattern; ... ] ) ]
             [ a sp:Filter;    sp:expression [ ... ] ]
             [ a sp:TriplePattern; ... ]
             [ a sp:Optional; sp:elements ( [ a sp:TriplePattern; ... ] ) ] ) ].




3      Proposed Syntax
We propose a new vocabulary to represent the SPARQL Algebra in RDF, available
as RDF Schema at http://meta-sparql.org/vocab/spa. This URL is also the
base URI for vocabulary elements, in this paper abbreviated as spa:. Each kind of op-
erator in the SPARQL Algebra is represented by a corresponding class (e.g. spa:BGP,
spa:Filter). The spa:subOp property connect an operator and its inputs (other oper-
ators). For the binary not commutative operators –spa:LeftJoin and spa:Minus, the
property spa:subOp is specialized by the properties spa:leftOp and spa:rightOp.
Classes and properties from SPIN SPARQL are used when applicable: to represent ex-
pressions, variables, and triple patterns/paths/templates.4
    The proposed syntax has by definition the same expressive power of SPARQL Al-
gebra, that in turn has the same expressive power of both SPARQL textual syntax and
SPIN SPARQL. As SPARQL semantics are defined through SPARQL Algebra, this
syntax is well suited when query semantics must be considered.
     Listing 3. Query of Listing 1 represented in the new proposed syntax (some details omitted)

[ a spa:Filter;
  spa:filterExpression [ ... ];
  spa:subOp
    [ a spa:LeftJoin;
      spa:leftOp
        [ a spa:Join;
          spa:subOp [ a spa:BGP; spa:tuple [ a sp:TriplePattern; ... ] ],
                    [ a spa:BGP; spa:tuple [ a sp:TriplePattern; ... ] ] ];
      spa:rightOp [ a spa:BGP; spa:tuple [ a sp:TriplePattern; ... ] ]      ] ].

    In Listing 2, the semantics of the query are not so evident. The elements (instances
of sp:Optional, sp:Filter and so on) are apparently all on the same level and one
 4
     Moreover, we require each variable to be identified by a unique RDF resource so that variable
     identity can be directly evaluated.
                                                                                         3

may be tempted to think that the order between them may be safely changed. Con-
versely, the opposite is true: the order of the sequence (represented as an rdf:List) is
critical for the semantics of the query. That means that any task related to the behavior
of the query needs to take in account the ordering of elements.
     In Listing 3 the tree corresponds directly to the semantics of the query. There is
no use of rdf:List or other generic sequences because the semantics are given by
the structure of algebra operators. Meta-queries and transformations that deal with the
meaning of queries can thus be more easily designed. In the next Section some empirical
evidence of this advantage is presented.


4   Examples of Meta-SPARQL
We will show some examples in which our proposal is compared with SPIN SPARQL
(the only other RDF representation of SPARQL to date). So far there is not much doc-
umented use of SPIN SPARQL for meta-programming, so the comparison is made
through examples we designed for a couple of use cases. We will in all the cases imply
that RDFS inferencing is in place to simplify the queries.
    Studies have been carried on statistics to gather on queries in order to classify them
somehow (usually to predict their performance when executed). The Linked SPARQL
Queries Dataset (LSQ) [3] is specially noteworthy for the recollection of both query
structure –represented through the SPIN SPARQL syntax– and statistics from logs of
a variety of public SPARQL endpoints. The following queries (re)calculate one of the
query statistics considered in LSQ, the number of Basic Graph Patterns (BGPs) in each
query. Even if this one of the most relatively straightforward statistics, the complexity
of using SPIN SPARQL syntax is apparent. For other statistics (e.g., variables that par-
ticipate in a join) queries using the SPIN SPARQL syntax becomes more complex still,
due to the number of different cases that must be considered.
       Listing 4. Meta-query counting the BGPs in each query (SPIN SPARQL syntax)
SELECT ?query (COUNT(?tupleStartingBgp) AS ?numOfBgps)
WHERE { ?query ((sp:elements|sp:where)?/rdf:rest*/rdf:first)* ?tupleStartingBgp.
        ?tupleStartingBgp a sp:Tuple.
        FILTER NOT EXISTS {
          ?previousTuple ˆrdf:first/rdf:rest/rdf:first ?tupleStartingBgp.
        } }


          Listing 5. Meta-query counting the BGPs in each query (proposed syntax)
SELECT ?query (COUNT(?bgp) AS ?numOfBgps)
WHERE { ?query spa:subOps* ?bgp.
        ?bgp a spa:BGP.          }

    As an example of query rewriting, we consider the use case of adding a filtering
step to a query. From a representation of the original query, in both cases we need to
create a new resource that refers to the original one and represents the added step.
        Listing 6. Turtle code to add a filtering step to a query (SPIN SPARQL syntax)
ex:queryWithFilter a sp:Select;
                   sp:where ( [ a sp:SubQuery;
                                sp:query ex:query ]
                              [ a sp:Filter;
                                sp:expression ex:filter ] ).
4

          Listing 7. Turtle code to add a filtering step to a query (proposed syntax)
ex:queryWithFilter a spa:Filter;
                   spa:subOp      ex:query;
                   spa:expression ex:filter.

    The case using SPIN SPARQL adds quite a deal of representational complexity to
the query due to the usage the subquery syntax. In the spa syntax only the related
functional operator is added and the query may be –if possible– further simplified by
additional UPDATE requests. As an example, the following UPDATE request may be
used as a rule to join adjacent filters. In a similar way, rules to push filters down the
tree when possible may be defined. Similar rules would be hard to write from SPIN
SPARQL, unless preceded by a complex set of rules that modify the syntax tree to
reflect the semantics (thus realizing in a way the conversion to SPARQL Algebra).
            Listing 8. UPDATE request to join adjacent filters (proposed syntax)
DELETE { ?filter1 spa:expression ?expr1;
                  spa:subOp      filter2. }
INSERT { ?filter1 spa:expression [ a sp:and; sp:arg ?expr1, ?expr2 ];
                  spa:subOp      ?subop.                              }
WHERE { ?filter1 a spa:Filter;
                  spa:expression ?expr1;
                  spa:subOp      ?filter2.
         ?filter2 a spa:Filter;
                  spa:expression ?expr2;
                  spa:subOp ?subop.      }




5   Conclusions and Future Work
This paper presented a vocabulary for representing SPARQL queries in RDF. Contrary
to the existing SPIN SPARQL vocabulary that follows the SPARQL abstract syntax
tree, this vocabulary –abbreviated as spa– follows the SPARQL Algebra, that is the
one against which the SPARQL operational semantics are defined. The spa vocabulary
may thus be used for all the use cases in which SPARQL queries have to be analyzed
or transformed according to their semantics. Based on some common use cases, an
empirical evalution is given.
    This a first step in exploring RDF-based syntaxes for queries alternative to SPIN
SPARQL. Having a syntax more amenable to meta-querying and meta-programming
may encourage developers to experiment with these techniques, to the advantage of the
whole Semantic Web programming community. Further evaluation may be carried on
by more extensively identifying a set of use cases and formally evaluating the usage of
this novel syntax.


References
 1. Fürber, C., Hepp, M.: Using SPARQL and SPIN for data quality management on the seman-
    tic web. In: Business Information Systems. pp. 35–46. Springer (2010)
 2. Harris, S., et al.: SPARQL 1.1 Query Language. W3C REC 21 March 2013
 3. Saleem, M., Ali, M.I., Hogan, A., Mehmood, Q., Ngomo, A.C.N.: LSQ: The Linked
    SPARQL Queries Dataset. In: Proc. of ISWC 2015. pp. 261–269. Springer (2015)